Comparison of Machine Learning Methodology in COPD Cohort Data

COPD 코호트 자료에서의 Machine Learning 방법론 비교

  • Received : 2017.11.27
  • Accepted : 2017.12.26
  • Published : 2017.12.31

Abstract

Machine learning methods are now widely used because of their high predictive performance. That performance can be improved further when the limitations of the data are addressed with statistical techniques. In this study, we applied the SMOTE method to longitudinal, imbalanced data to correct the class imbalance and confirmed that prediction performance increased. In addition, although acute exacerbation of COPD has been studied actively, existing work has focused only on identifying individual factors associated with acute exacerbation; no study has examined multiple factors jointly or predicted acute exacerbation with a predictive model. In this study, we examined which factors are related to acute exacerbation of COPD when considered together and constructed a personalized, disease-specific prediction model.

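Machine learning methods have recently become widely used thanks to their high predictive power, but to use machine learning properly, resolving the limitations of the data with statistical techniques can yield higher predictive power than before. In this study, we used the SMOTE method on longitudinal and imbalanced data to resolve the imbalance problem and confirmed that predictive power increased. In addition, although research on acute exacerbation of chronic obstructive pulmonary disease is active, it has been limited to finding factors related to acute exacerbation, so there has been no research that observes multiple factors jointly and predicts acute exacerbation with a predictive model. In this study, we identified which factors are related to acute exacerbation of chronic obstructive pulmonary disease when multiple factors are considered together, and built a personalized prediction model for a specific disease.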
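The abstract does not describe how SMOTE was applied to the cohort, so the following is only a minimal sketch of the general idea behind the technique: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for illustration and are not taken from the study. The sketch also simplifies the published algorithm by drawing base samples at random rather than looping over every minority sample.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(p, base),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature data (assumed values): 20 majority vs. 4 minority samples.
majority = [(float(i), float(i % 3)) for i in range(20)]
minority = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (2.5, 2.5)]

new_points = smote(minority, len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

In a predictive setting, oversampling of this kind is normally applied to the training split only, so that the evaluation data keep the original class ratio. With longitudinal data, repeated measurements of the same patient would also need to stay within a single split.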
