Comparison of nomogram construction methods using chronic obstructive pulmonary disease

Construction and comparison of nomograms using chronic obstructive pulmonary disease

  • Received : 2018.02.07
  • Accepted : 2018.03.28
  • Published : 2018.06.30

Abstract

A nomogram is a statistical tool that presents the risk factors of a disease and the predicted probability visually, so that people without statistical training can understand them easily. This study used risk factors of chronic obstructive pulmonary disease (COPD) to construct nomograms from a logistic regression model and from a naïve Bayesian classifier model, and compared the two. The analysis used data from the 6th Korea National Health and Nutrition Examination Survey (2013-2015). Six risk factors of COPD were used in total. A COPD nomogram was constructed with each model's own construction method, and the two nomograms were compared to assess which method is more appropriate. Finally, the receiver operating characteristic (ROC) curve and the calibration plot were used to validate each nomogram.
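As a rough illustration of the two constructions the abstract compares, the following Python sketch shows one common way to turn logistic regression coefficients into 0-100 nomogram points, how naïve Bayes contributions can be written as log odds ratios, and how the area under the ROC curve can be computed. It is a minimal sketch under stated assumptions, not the authors' implementation: the factor names, coefficients, intercept, and counts are all hypothetical, and the function names (`lr_points`, `nb_log_odds`, `auc`, and so on) are invented for this example.

```python
import math

# Hypothetical log-odds coefficients, for illustration only (not values from the paper).
LR_COEFS = {
    "sex": {"female": 0.0, "male": 0.8},
    "smoking": {"never": 0.0, "former": 0.5, "current": 1.2},
}
INTERCEPT = -3.0  # hypothetical intercept

def lr_points(coefs):
    """Scale each level's coefficient to points; the factor with the widest
    coefficient range spans the full 0-100 axis."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    return {
        f: {k: 100.0 * (b - min(lv.values())) / widest for k, b in lv.items()}
        for f, lv in coefs.items()
    }

def lr_probability(coefs, intercept, patient):
    """Predicted probability from the linear predictor of a logistic model."""
    lp = intercept + sum(coefs[f][level] for f, level in patient.items())
    return 1.0 / (1.0 + math.exp(-lp))

def nb_log_odds(counts):
    """counts[factor][level] = (n_cases, n_controls).
    Returns log(P(level | case) / P(level | control)) for every level."""
    out = {}
    for f, lv in counts.items():
        cases = sum(c for c, _ in lv.values())
        controls = sum(n for _, n in lv.values())
        out[f] = {k: math.log((c / cases) / (n / controls)) for k, (c, n) in lv.items()}
    return out

def nb_probability(log_odds, prior, patient):
    """Naive Bayes posterior: prior log odds plus the sum of per-factor log odds ratios."""
    logit = math.log(prior / (1.0 - prior))
    logit += sum(log_odds[f][level] for f, level in patient.items())
    return 1.0 / (1.0 + math.exp(-logit))

def auc(scores, labels):
    """Area under the ROC curve as the share of case/control pairs ranked correctly
    (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical case/control counts for a single factor.
NB_COUNTS = {"smoking": {"never": (20, 60), "current": (80, 40)}}

if __name__ == "__main__":
    print(lr_points(LR_COEFS))
    print(lr_probability(LR_COEFS, INTERCEPT, {"sex": "male", "smoking": "current"}))
    print(nb_probability(nb_log_odds(NB_COUNTS), 0.5, {"smoking": "current"}))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

In both constructions a patient's total is a sum of per-factor contributions on the log-odds scale, which is what allows either model to be drawn as a nomogram. The difference is that the logistic coefficients are estimated jointly, while the naïve Bayes contributions are computed one factor at a time under a conditional independence assumption.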
