http://dx.doi.org/10.7465/jkdi.2012.23.2.235

A credit classification method based on generalized additive models using factor scores of mixtures of common factor analyzers  

Lim, Su-Yeol (Department of Statistics, Chonnam National University)
Baek, Jang-Sun (Department of Statistics, Chonnam National University)
Publication Information
Journal of the Korean Data and Information Science Society, v.23, no.2, 2012, pp. 235-245
Abstract
Logistic discrimination is a useful statistical technique for quantitative analysis in the financial services industry. In particular, it is easy to implement and achieves a good classification rate. The generalized additive model is useful for credit scoring because it shares the advantages of logistic discrimination and can also account for nonlinear effects of the explanatory variables. However, it may require too many additive terms when the number of explanatory variables is very large, and there may be dependencies among the variables. Mixtures of factor analyzers can be used to reduce the dimension of high-dimensional features. This study proposes using the low-dimensional factor scores from mixtures of factor analyzers as new features in the generalized additive model. The method is applied to the classification of real credit scoring data. A comparison of correct classification rates with competing techniques shows that the generalized additive model using factor scores is superior.
Keywords
Credit classification; generalized additive model; logistic regression; mixtures of common factor analyzers
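
The model structure summarized in the abstract can be sketched in equations. The abstract itself gives no notation, so every symbol below is an assumption chosen for illustration: p explanatory variables x, q < p factors u, g mixture components, and smooth functions f_k. The mixture model is written in the common-factor-loading form that the title refers to; the paper's exact parameterization and its estimator of the factor scores may differ.

```latex
% Generalized additive (logistic) model on the original p explanatory variables:
% one smooth term per variable, so the number of terms grows with p.
\[
  \operatorname{logit} P(Y = 1 \mid x) = \alpha + \sum_{k=1}^{p} f_k(x_k)
\]

% Mixture of common factor analyzers (assumed form): a p x q loading matrix A
% shared by all g components, component-specific factor distributions,
% and a diagonal error covariance D.
\[
  x = A u + e, \qquad
  u \sim N(\xi_i, \Omega_i) \ \text{with probability } \pi_i \ (i = 1, \dots, g), \qquad
  e \sim N(0, D)
\]

% Proposed model: the estimated q-dimensional factor scores \hat{u}
% replace x as features, so only q smooth terms are needed.
\[
  \operatorname{logit} P(Y = 1 \mid \hat{u}) = \alpha + \sum_{k=1}^{q} f_k(\hat{u}_k)
\]
```

Under these assumptions the number of additive terms drops from p to q, and dependencies among the original variables are absorbed by the factor model rather than by the additive predictor.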