Introduction: Stroke is one of the leading causes of morbidity and mortality worldwide, and the cost of managing it has risen sharply. An effective prediction tool that uses artificial intelligence to lower the rate of stroke-related readmissions has the potential to reduce healthcare costs and improve the quality of provider care. We hypothesize that machine learning techniques are superior to traditional statistics in determining the likelihood of 30-day readmission for Jefferson’s stroke patients.

Methods: Jefferson’s existing data on stroke patients were cleaned, aggregated, and split into training and test sets. Using the training sets, machine learning (ML) models, including Random Forest, Support Vector Machines (SVM), and Neural Networks, were trained to assess the risk of readmission. Each model’s performance was captured with confusion matrices, areas under the curve (AUCs), and related metrics to identify the best-performing ML method for assessing this risk. These results were then compared with the readmission risk determined by traditional statistics.
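As a rough illustration of the evaluation metrics named above, the following sketch computes a confusion matrix, accuracy, precision, and AUC from test-set labels and model scores. It is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and AUC is assumed here to mean the area under the ROC curve.

```python
from typing import Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels, where 1 = readmitted."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp


def accuracy(tn: int, fp: int, fn: int, tp: int) -> float:
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tn + fp + fn + tp)


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted readmissions that were true readmissions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy test set: true outcomes and a model's predicted risk scores.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.8, 0.3, 0.05]

# Assumed decision threshold of 0.5 for turning scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
acc = accuracy(tn, fp, fn, tp)
prec = precision(tp, fp)
auc = roc_auc(y_true, scores)
```

Accuracy and precision depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report them together when comparing models.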

Results: After training, the ML models were applied to the test sets to determine how accurately they could predict a stroke patient’s chance of readmission on new data. Traditional statistics (in the form of logistic regression) achieved an accuracy of 84%. The ML methods achieved the following accuracies: Random Forest, 95.50%; SVM, 94.79%; and Neural Networks, 95.40%.

Discussion: This study demonstrates that machine learning methods are superior to traditional statistics in determining 30-day readmission risk for Jefferson stroke patients, and that the Random Forest model is the most accurate of the methods tested. The potential implications of this tool are substantial: it could be used at both the patient and hospital levels to reduce costs for patients and the hospital and to improve stroke education and care.