Prediction of Diabetic Nephropathy from Diabetes Dataset Using Feature Selection Methods and SVM Learning

  • Cho, Baek-Hwan (Department of Biomedical Engineering, Hanyang University) ;
  • Lee, Jong-Shill (Department of Biomedical Engineering, Hanyang University) ;
  • Chee, Young-Joon (Department of Biomedical Engineering, Hanyang University) ;
  • Kim, Kwang-Won (Department of Endocrinology and Metabolism, Sungkyunkwan University) ;
  • Kim, In-Young (Department of Biomedical Engineering, Hanyang University) ;
  • Kim, Sun-I. (Department of Biomedical Engineering, Hanyang University)
  • Published: 2007.06.30

Abstract

Diabetes mellitus can cause devastating complications that often result in disability and death, and diabetic nephropathy is a leading cause of death in people with diabetes. In this study, we aimed to predict the onset of diabetic nephropathy from an irregular and unbalanced diabetes dataset. We collected clinical data from 292 patients with type 2 diabetes and preprocessed them to extract 184 features, which resolved the irregularity of the dataset. We compared several feature selection methods, such as ReliefF and sensitivity analysis, to remove redundant features and improve classification performance. We also compared support vector machine (SVM) learning methods, such as equal-cost learning and cost-sensitive learning, to address the class imbalance in the dataset. The best classifier, which used 39 selected features, achieved an area under the curve (AUC) of 0.969 in receiver operating characteristic (ROC) analysis. This result indicates that our method can predict diabetic nephropathy with high generalization performance from an irregular and unbalanced dataset, and that physicians could use it to help predict diabetic nephropathy.
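
Two ideas in the abstract, cost-sensitive learning on an unbalanced dataset and evaluation by the area under the ROC curve, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `class_costs` and `roc_auc`, the inverse-frequency cost heuristic, and the toy labels and scores are all assumptions made for illustration. The abstract does not state how the misclassification costs were chosen, and no data from the study are used.

```python
from collections import Counter


def class_costs(labels):
    """Per-class misclassification cost, inversely proportional to class frequency.

    Assumed heuristic for illustration only: cost = n_samples / (n_classes * count).
    Equal-cost learning would instead assign the same cost to every class.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data (not from the study): 1 = nephropathy, 0 = no nephropathy.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
toy_scores = [0.10, 0.20, 0.15, 0.40, 0.30, 0.70, 0.60, 0.90]

costs = class_costs(toy_labels)      # the minority class gets the larger cost
auc = roc_auc(toy_labels, toy_scores)
print(costs, round(auc, 3))
```

In a cost-sensitive SVM, costs like these would typically scale the penalty on misclassified training samples of each class, so that errors on the rarer class weigh more during training. The pairwise AUC is quadratic in the number of samples, which is acceptable for a dataset of a few hundred patients.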
