http://dx.doi.org/10.13088/jiis.2015.21.2.173

A Hybrid Under-sampling Approach for Better Bankruptcy Prediction  

Kim, Taehoon (Graduate School of Business IT, Kookmin University)
Ahn, Hyunchul (Graduate School of Business IT, Kookmin University)
Publication Information
Journal of Intelligence and Information Systems / v.21, no.2, 2015, pp. 173-190
Abstract
The purpose of this study is to improve bankruptcy prediction models by using a novel hybrid under-sampling approach. Most prior studies have tried to enhance the accuracy of bankruptcy prediction models by improving the classification methods involved. In contrast, we focus on appropriate data preprocessing as a means of enhancing accuracy. In particular, we aim to develop an effective sampling approach for bankruptcy prediction, since most prediction models suffer from class imbalance problems. The approach proposed in this study is a hybrid under-sampling method that combines the k-Reverse Nearest Neighbor (k-RNN) and one-class support vector machine (OCSVM) approaches. k-RNN can effectively eliminate outliers, while OCSVM contributes to the selection of informative training samples from the majority class data. To validate the proposed approach, we applied it to H Bank's data on Korean companies that are not subject to external audit, and compared the performance of classifiers trained on data sampled with the proposed method against that of classifiers trained on randomly sampled data. The empirical results show that the proposed under-sampling approach generally improves the accuracy of classifiers such as logistic regression, discriminant analysis, decision trees, and support vector machines. The results also show that the proposed approach reduces the risk of false negative errors, which carry higher misclassification costs.
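The outlier-elimination idea behind k-RNN can be illustrated with a minimal sketch. It assumes Euclidean distance and treats a majority-class sample as an outlier when too few other samples list it among their k nearest neighbors. The abstract does not state the value of k, the threshold, or the distance measure, so `k`, `min_count`, and all function names below are hypothetical. The OCSVM selection stage is not covered here.

```python
from math import dist  # Euclidean distance, Python 3.8+


def knn_indices(points, i, k):
    """Return the indices of the k points closest to points[i], excluding i."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: dist(points[i], points[j]))
    return others[:k]


def reverse_neighbor_counts(points, k):
    """For each point, count how many other points have it in their k-NN list."""
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in knn_indices(points, i, k):
            counts[j] += 1
    return counts


def drop_krnn_outliers(points, k=2, min_count=1):
    """Keep points with at least min_count reverse neighbors.

    min_count is an assumed threshold, not a value taken from the paper.
    """
    counts = reverse_neighbor_counts(points, k)
    return [p for p, c in zip(points, counts) if c >= min_count]


# Toy majority-class data: four clustered samples and one isolated sample.
majority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
kept = drop_krnn_outliers(majority, k=2, min_count=1)
print(kept)
```

In this toy data the isolated sample is nobody's nearest neighbor, so its reverse-neighbor count is zero and it is dropped, while the four clustered samples are kept. A subsequent selection step, such as the OCSVM stage described in the abstract, would then work on the cleaned majority-class data.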
Keywords
Bankruptcy Prediction; Under-sampling; k-Reverse Nearest Neighbor; One-class Support Vector Machine; Classification;
Citations & Related Records
Times Cited By KSCI: 3