Prediction Model for Hypertriglyceridemia Based on Naive Bayes Using Facial Characteristics

Naive Bayes-Based Hypertriglyceridemia Prediction Model Using Facial Information

  • 이주원 (Future Medicine Division, Korea Institute of Oriental Medicine) ;
  • 이범주 (Future Medicine Division, Korea Institute of Oriental Medicine)
  • Received : 2019.06.14
  • Accepted : 2019.08.13
  • Published : 2019.11.30

Abstract

Machine learning and data mining have recently been applied to the prediction and diagnosis of many diseases. Chronic diseases account for about 80% of total mortality and are gradually increasing. Previous predictive models for chronic diseases have relied on data such as blood glucose, blood pressure, and insulin levels. This paper examines the relationship between dyslipidemia and facial characteristics and develops a machine learning predictive model based on facial characteristics, which to our knowledge is the first study of its kind. Clinical data were obtained from 5390 adult Korean men, of which the hypertriglyceridemia and facial characteristics data were used. Hypertriglyceridemia serves as a measure of dyslipidemia. The study identifies facial characteristics that are highly correlated with hypertriglyceridemia. The best single indicator was FD_43_143_aD (p<0.0001, area under the receiver operating characteristic curve (AUC)=0.652), which denotes the distance between mandibular points. The model built on these results achieved an AUC of 0.662. These results may provide a basis for predicting various diseases from facial characteristics alone at the screening stage in disease epidemiology and public health.

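Machine learning and data mining have recently come to be used for the prediction and diagnosis of numerous diseases. Chronic diseases account for about 80% of total mortality and are steadily increasing. Existing studies of chronic disease prediction models build their models from health-checkup-level data such as blood glucose, blood pressure, and insulin levels. This paper verifies the association between facial information and dyslipidemia, a risk factor for chronic disease, and develops the world's first machine learning based dyslipidemia prediction model using facial information. The study was carried out using the facial information and hypertriglyceridemia information in clinical data from 5390 subjects. Hypertriglyceridemia is a measure used to assess dyslipidemia. The results show that FD_43_143_aD (p<0.0001, area under the receiver operating characteristic curve (AUC)=0.652), which represents the distance between the mandibular regions of the face, is very highly associated with hypertriglyceridemia, and the model built on this basis achieved an AUC of 0.662. These findings may provide a basis for predicting various diseases from facial information alone at the screening stage in disease epidemiology and public health.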
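
The two AUC figures in the abstract (one for a single facial measure, one for the Naive Bayes model) can be illustrated with a small sketch. This is not the authors' pipeline or data: the sample is synthetic, the feature names `jaw_width` and `face_height` and every numeric parameter are hypothetical, and a Gaussian likelihood is assumed because the abstract does not say how continuous features were handled.

```python
import math
import random


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


class GaussianNaiveBayes:
    """Two-class Naive Bayes with one Gaussian per feature and class."""

    def fit(self, X, y):
        self.priors, self.stats = {}, {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            self.priors[c] = len(rows) / len(X)
            self.stats[c] = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                self.stats[c].append((mean, var))
        return self

    def _log_joint(self, x, c):
        # log P(c) + sum of per-feature Gaussian log densities
        total = math.log(self.priors[c])
        for v, (mean, var) in zip(x, self.stats[c]):
            total += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return total

    def predict_proba(self, x):
        """Posterior probability of class 1 for one sample."""
        l0, l1 = self._log_joint(x, 0), self._log_joint(x, 1)
        m = max(l0, l1)  # subtract the max for numerical stability
        e0, e1 = math.exp(l0 - m), math.exp(l1 - m)
        return e1 / (e0 + e1)


def make_synthetic(n, rng):
    """Hypothetical data: cases (label 1) have a slightly larger jaw_width."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        jaw_width = rng.gauss(130.0 + 4.0 * label, 8.0)
        face_height = rng.gauss(190.0 + 1.0 * label, 10.0)
        X.append((jaw_width, face_height))
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(600, rng)
X_test, y_test = make_synthetic(400, rng)

model = GaussianNaiveBayes().fit(X_train, y_train)
single_auc = auc([x[0] for x in X_test], y_test)  # one feature used as the score
model_auc = auc([model.predict_proba(x) for x in X_test], y_test)
print(f"single-feature AUC: {single_auc:.3f}")
print(f"Naive Bayes AUC:    {model_auc:.3f}")
```

The single-feature AUC treats the raw measurement as a ranking score, while the model AUC ranks subjects by posterior probability. Both are evaluated on held-out samples so that the comparison is not inflated by fitting.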
