
Feature Selection for Multi-Class Support Vector Machines Using an Impurity Measure of Classification Trees: An Application to the Credit Rating of S&P 500 Companies  

Hong, Tae-Ho (School of Business, Pusan National University)
Park, Ji-Young (BK21 Research and Education Institute, Pusan National University)
Publication Information
Asia Pacific Journal of Information Systems / v.21, no.2, 2011, pp. 43-58
Abstract
Support vector machines (SVMs), a machine learning technique, have been applied not only to binary classification problems such as bankruptcy prediction but also to multi-class problems such as corporate credit rating. Although SVMs classify and predict successfully in many dichotomous and multi-class problems, their performance depends heavily on the choice of predictors, and with a poor choice they can easily perform worse than the best alternative model. To overcome this weakness, this study proposes a feature selection approach for multi-class SVMs that uses the impurity measures of classification trees. To select the input features, we employed the C4.5 and CART algorithms, as well as the stepwise method of discriminant analysis, a well-known feature selection method. We built a multi-class SVM model for credit rating using this approach and present experimental results on data for S&P 500 companies.
Keywords
Information Technology; Feature Selection; Multi-class SVMs; Impurity Measures; Credit Rating;
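The impurity measures named in the abstract can be illustrated with a small sketch. The code below is not the authors' procedure, which relies on full C4.5 and CART trees; it is a simplified stand-in that scores each feature by the largest impurity decrease achievable with a single binary threshold split, using either Gini impurity (the CART measure) or entropy (the basis of C4.5's information gain). The function names, the toy data, and the feature names `leverage` and `noise` are all hypothetical, and the final step of training a multi-class SVM on the selected features is not shown.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity (the CART measure) of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits (the basis of C4.5's information gain)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_impurity_decrease(values, labels, impurity=gini):
    """Largest impurity decrease over all binary threshold splits of one feature."""
    n = len(labels)
    parent = impurity(labels)
    pairs = sorted(zip(values, labels))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate two equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        # Weighted average impurity of the two child nodes.
        child = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(X, y, names, impurity=gini):
    """Rank features by best single-split impurity decrease, highest first."""
    scores = {
        name: best_impurity_decrease([row[j] for row in X], y, impurity)
        for j, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, made up for illustration: three rating classes, two features.
X = [[0.9, 0.3], [0.8, 0.7], [0.5, 0.2], [0.4, 0.8], [0.1, 0.4], [0.2, 0.6]]
y = ["A", "A", "B", "B", "C", "C"]
ranking = rank_features(X, y, ["leverage", "noise"])
print(ranking)
```

In this toy data the first feature orders the classes cleanly, so it receives the higher score; under the approach the abstract describes, the top-ranked features would then serve as inputs to the multi-class SVM. Passing `impurity=entropy` switches the ranking to an information-gain criterion.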