Prediction of Hypertension Complications Risk Using Classification Techniques

  • Lee, Wonji (Department of Industrial and Management Engineering, POSTECH) ;
  • Lee, Junghye (Department of Industrial and Management Engineering, POSTECH) ;
  • Lee, Hyeseon (Department of Industrial and Management Engineering, POSTECH) ;
  • Jun, Chi-Hyuck (Department of Industrial and Management Engineering, POSTECH) ;
  • Park, Il-Su (Department of Health, Uiduk University) ;
  • Kang, Sung-Hong (Department of Health Policy and Management, Inje University)
  • Received : 2014.11.07
  • Accepted : 2014.11.25
  • Published : 2014.12.30

Abstract

Chronic diseases, including hypertension and its complications, are major drivers of rising national medical expenditures. We aim to predict the risk of complications in hypertension patients using the sample national healthcare database established by the Korean National Health Insurance Corporation. We apply three classification techniques, namely logistic regression, linear discriminant analysis, and classification and regression trees (CART), to predict the onset of a hypertension complication for each patient. The performance of the three methods is compared in terms of accuracy, sensitivity, and specificity. The results show that the methods perform similarly, although logistic regression performs marginally better than the others.
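
The three comparison criteria named in the abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 marks a complication onset and 0 marks no onset; the function name `binary_metrics`, the `BinaryMetrics` container, and the toy label vectors are hypothetical and are not taken from the paper or its data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute accuracy, sensitivity, and specificity for 0/1 labels.

    Assumption: 1 = complication onset (positive), 0 = no onset (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")

    return BinaryMetrics(
        accuracy=(tp + tn) / len(y_true),  # share of all patients classified correctly
        sensitivity=tp / positives,        # share of true onsets that were detected
        specificity=tn / negatives,        # share of non-onsets correctly ruled out
    )


# Hypothetical toy labels, for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when onset events are rare, because a classifier that always predicts "no onset" can score high accuracy while detecting no complications at all.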
