http://dx.doi.org/10.9718/JBER.2007.28.3.355

Prediction of Diabetic Nephropathy from Diabetes Dataset Using Feature Selection Methods and SVM Learning  

Cho, Baek-Hwan (Department of Biomedical Engineering, Hanyang University)
Lee, Jong-Shill (Department of Biomedical Engineering, Hanyang University)
Chee, Young-Joan (Department of Biomedical Engineering, Hanyang University)
Kim, Kwang-Won (Department of Endocrinology and Metabolism, Sungkyunkwan University)
Kim, In-Young (Department of Biomedical Engineering, Hanyang University)
Kim, Sun-I. (Department of Biomedical Engineering, Hanyang University)
Publication Information
Journal of Biomedical Engineering Research / v.28, no.3, 2007, pp. 355-362
Abstract
Diabetes mellitus can cause devastating complications that often result in disability and death, and diabetic nephropathy is a leading cause of death in people with diabetes. In this study, we attempted to predict the onset of diabetic nephropathy from an irregular and unbalanced diabetic dataset. We collected clinical data from 292 patients with type 2 diabetes and preprocessed them to extract 184 features, which resolved the irregularity of the dataset. We compared several feature selection methods, including ReliefF and sensitivity analysis, to remove redundant features and improve classification performance. We also compared support vector machine (SVM) learning schemes, namely equal-cost learning and cost-sensitive learning, to address the class imbalance in the dataset. The best classifier, using 39 selected features, achieved an area under the receiver operating characteristic (ROC) curve of 0.969. This indicates that our method can predict diabetic nephropathy with high generalization performance from an irregular and unbalanced dataset, and that physicians can benefit from it when predicting diabetic nephropathy.
Keywords
diabetic nephropathy; feature selection; support vector machine; cost-sensitive learning;
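The abstract names two ideas that a small example can make concrete: cost-sensitive learning for an unbalanced dataset, and evaluation by the area under the ROC curve. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The function names `balanced_costs` and `roc_auc`, the inverse-frequency cost heuristic, and the toy labels and scores are all hypothetical; the abstract does not say how the misclassification costs were chosen.

```python
from typing import Dict, Sequence


def balanced_costs(labels: Sequence[int]) -> Dict[int, float]:
    """Per-class misclassification costs inversely proportional to class size.

    Assumption: this inverse-frequency heuristic is one common choice for
    cost-sensitive learning; the source does not state which costs were used.
    """
    n = len(labels)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one example of each class")
    # The rarer class receives the larger cost.
    return {1: n / (2 * n_pos), 0: n / (2 * n_neg)}


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Equals the fraction of (positive, negative) pairs in which the positive
    example has the higher score; tied pairs count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 positives among 8 examples (unbalanced).
toy_labels = [1, 0, 0, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.1, 0.35, 0.3, 0.05, 0.6]

costs = balanced_costs(toy_labels)      # positive class weighted more heavily
auc = roc_auc(toy_labels, toy_scores)   # 10 of 12 pairs ranked correctly
```

Under equal-cost learning both classes would carry the same cost; here the minority class is weighted three times as heavily as the majority class. In practice such per-class costs might be passed to an SVM implementation that accepts class weights, for example the `class_weight` parameter of scikit-learn's `SVC`.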