http://dx.doi.org/10.5392/IJoC.2012.8.2.007

Improving the Error Back-Propagation Algorithm for Imbalanced Data Sets  

Oh, Sang-Hoon (Department of Information Communication Engineering, Mokwon University)
Abstract
Imbalanced data sets are difficult to classify because most classifiers are developed on the assumption that class distributions are well balanced. To improve the error back-propagation algorithm for classifying imbalanced data sets, a new error function is proposed. The error function controls weight updating according to the class to which each training sample belongs. As a result, samples in the minority class have a greater chance of being classified, while samples in the majority class have a smaller chance. The proposed method is compared with the two-phase, threshold-moving, and target node methods through simulations on a mammography data set, where it attains the best results.
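The idea of class-dependent weight updating can be illustrated with a minimal sketch. The abstract does not give the form of the proposed error function, so the code below is not the paper's method: it assumes an ordinary class-weighted cross-entropy at a single sigmoid output node, and the names `weighted_output_delta` and `train_step` and the weights 4.0 and 1.0 are hypothetical choices made only for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation of the output node."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_output_delta(target, output, minority_weight=4.0, majority_weight=1.0):
    """Error signal at a sigmoid output node under class-weighted cross-entropy.

    Assumption: target 1 marks the minority class and target 0 the majority
    class. The weight values are illustrative, not taken from the paper.
    """
    weight = minority_weight if target == 1 else majority_weight
    # For cross-entropy with a sigmoid output, the error signal is (t - y);
    # the class weight scales how strongly this sample moves the weights.
    return weight * (target - output)


def train_step(w, b, x, target, lr=0.1):
    """One gradient step for a single-node network on one training sample."""
    y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    delta = weighted_output_delta(target, y)
    new_w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
    new_b = b + lr * delta
    return new_w, new_b


if __name__ == "__main__":
    w, b = [0.0, 0.0], 0.0
    # The same input moves the weights four times as far when it is
    # labelled as a minority-class sample.
    print(train_step(w, b, [1.0, 2.0], target=1))
    print(train_step(w, b, [1.0, 2.0], target=0))
```

With these assumed weights, a minority-class sample produces an update four times larger than a majority-class sample with the same output error, which is one simple way to bias learning toward the minority class.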
Keywords
Imbalanced Data; Error Back-Propagation; Error Function; Mammography