A Research on Accuracy Improvement of Diabetes Recognition Factors Based on XGBoost

  • Shin, Yongsub (Graduate School of Smart Convergence Kwangwoon University) ;
  • Yun, Dai Yeol (Department of Plasma Bioscience and Display, KwangWoon University) ;
  • Moon, Seok-Jae (Department of Computer Science, Kwangwoon University) ;
  • Hwang, Chi-gon (Department of Computer Engineering, Institute of Information Technology, Kwangwoon University)
  • Received : 2021.04.22
  • Reviewed : 2021.05.04
  • Published : 2021.06.30

Abstract

The number of people visiting hospitals because of diabetes has been increasing. According to statistics from the Korean Diabetes Association, one in seven adults aged 30 or older in Korea has diabetes, and the figure is expected to be higher when prediabetes (impaired fasting glucose) is included. In our previous study, we used Random Forest to confirm and analyze the validity of triglyceride and cholesterol as factors associated with diabetes. Random Forest has the disadvantage that, as the amount of data increases, it uses more memory and becomes slower. In this paper, we therefore compare and analyze Random Forest and XGBoost, focusing on two concerns that are central in machine learning: improving learning speed and avoiding wasted memory. With XGBoost, the slowdown and memory problems were resolved, and the accuracy of the diabetes recognition factors increased further.
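The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' experiment: the data are synthetic (generated with scikit-learn's `make_classification` as a stand-in for clinical features such as triglyceride and cholesterol), the hyperparameters are arbitrary, and the helper names `make_boosted_model` and `compare_models` are hypothetical. If the `xgboost` package is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier`, which is a different implementation of gradient-boosted trees. Memory use is not measured here.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBClassifier  # optional dependency
    HAVE_XGBOOST = True
except ImportError:
    HAVE_XGBOOST = False


def make_boosted_model(seed=0):
    """Return an XGBoost classifier, or a scikit-learn stand-in if xgboost is missing."""
    if HAVE_XGBOOST:
        return XGBClassifier(n_estimators=200, max_depth=4,
                             learning_rate=0.1, random_state=seed)
    # Assumption: a fallback so the sketch still runs; this is NOT XGBoost.
    return GradientBoostingClassifier(n_estimators=200, max_depth=4,
                                      learning_rate=0.1, random_state=seed)


def compare_models(n_samples=2000, seed=0):
    """Fit both models on the same split and record fit time and test accuracy."""
    # Assumption: synthetic binary-classification data, not the paper's dataset.
    X, y = make_classification(n_samples=n_samples, n_features=8,
                               n_informative=5, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    models = {
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=seed),
        "boosted_trees": make_boosted_model(seed),
    }
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        fit_seconds = time.perf_counter() - start
        accuracy = accuracy_score(y_test, model.predict(X_test))
        results[name] = {"fit_seconds": fit_seconds, "accuracy": accuracy}
    return results


if __name__ == "__main__":
    for name, stats in compare_models().items():
        print(f"{name}: accuracy={stats['accuracy']:.3f}, "
              f"fit time={stats['fit_seconds']:.2f}s")
```

Timings and accuracies from a sketch like this depend heavily on the data size, hardware, and hyperparameters, so they should not be read as confirming or contradicting the results reported in the paper.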
