Selecting Optimal Algorithms for Stroke Prediction: Machine Learning-Based Approach

  • Kyung Tae CHOI (Dept. of Medical IT, Eulji University)
  • Kyung-A KIM (Dept. of Medical Artificial Intelligence, Eulji University)
  • Myung-Ae CHUNG (Dept. of BigData Medical Convergence, Eulji University)
  • Min Soo KANG (Dept. of BigData Medical Convergence, Eulji University)
  • Received : 2024.02.14
  • Accepted : 2024.06.05
  • Published : 2024.06.30

Abstract

In this paper, we compare three models (logistic regression, Random Forest, and XGBoost) for predicting stroke occurrence using data from the Korea National Health and Nutrition Examination Survey (KNHANES). We evaluated the models on several metrics, focusing mainly on recall and F1 score. The logistic regression model initially showed a satisfactory recall, but it was excluded from further consideration because its F1 score fell below the minimum threshold of 0.5. The F1 score matters because it accounts for both precision and recall, giving a balanced measure of a model's accuracy. Among the models that met the threshold, XGBoost achieved the highest recall. It also performed strongly on F1 score and AUC, so we conclude that XGBoost is the most suitable of the three algorithms for predicting stroke occurrence.
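
The selection rule described in the abstract (discard any model whose F1 score is below 0.5, then prefer the highest recall among the rest) can be illustrated with a minimal Python sketch. The class name `ModelResult`, the function `select_model`, and all confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the study's code or its reported results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelResult:
    """Confusion-matrix counts for the positive (stroke) class."""
    name: str
    tp: int  # stroke cases correctly predicted as stroke
    fp: int  # non-stroke cases wrongly predicted as stroke
    fn: int  # stroke cases the model missed

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


def select_model(results: List[ModelResult],
                 f1_threshold: float = 0.5) -> Optional[ModelResult]:
    """Keep models with F1 >= threshold, then return the one with highest recall."""
    eligible = [r for r in results if r.f1 >= f1_threshold]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.recall)


# Hypothetical counts for illustration only (NOT the paper's results).
results = [
    ModelResult("logistic_regression", tp=90, fp=200, fn=10),
    ModelResult("random_forest", tp=70, fp=40, fn=30),
    ModelResult("xgboost", tp=80, fp=45, fn=20),
]

best = select_model(results)
for r in results:
    print(f"{r.name}: recall={r.recall:.3f} f1={r.f1:.3f}")
print("selected:", best.name if best else None)
```

With these made-up counts, the logistic regression entry has the highest recall but low precision, so its F1 score falls below 0.5 and it is filtered out before recall is compared. This mirrors the reasoning in the abstract without reproducing its numbers.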

Keywords

Acknowledgement

This work was supported by the research grant of the KODISA Scholarship Foundation in 2024.

References

  1. Chen, T., & Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794). 
  2. Draelos, R. (2019, February 23). Measuring Performance: AUC (AUROC). glassbox. https://glassboxmedicine.com/2019/02/23/measuringperformance-auc-auroc/ 
  3. Korean Stroke Society. (2023). Stroke treatment guidelines.
  4. Korea Disease Control and Prevention Agency. (2022). Raw data usage guidelines. 
  5. Na, J. (2020). Stroke: "Early detection and prevention are important". Biotime. Retrieved from: https://www.biotimes.co.kr/news/articleView.html?idxno=3915
  6. Statistics Korea. (2022). Cause of death statistics.