Undecided inference using logistic regression for credit evaluation  

Hong, Chong-Sun (Department of Statistics, Sungkyunkwan University)
Jung, Min-Sub (Research Institute of Applied Statistics, Sungkyunkwan University)
Publication Information
Journal of the Korean Data and Information Science Society / v.22, no.2, 2011, pp. 149-157
Abstract
Undecided inference can be regarded as a missing data problem, under either the missing at random (MAR) or the missing not at random (MNAR) assumption. Under the MAR assumption, undecided inference uses a logistic regression model: the probability of default for the undecided group is obtained with the regression coefficient vectors estimated for the decided group and is compared with the probability of default for the decided group. Under the MNAR assumption, undecided inference uses a logistic regression model with an additional feature random vector. Simulation results based on two kinds of real data are obtained and compared. Under the MAR assumption, the misclassification rates are not much different from the rate for the raw data. Under the MNAR assumption, however, the misclassification rates are lower than those under MAR, and they decrease as the ratio of the undecided group increases.
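The MAR step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the synthetic data, the coefficient values, the gradient-descent fit, the 0.5 cutoff, and all function names (`fit_logistic`, `predict_pd`, `misclassification_rate`, `simulate`) are assumptions made for illustration. It fits a logistic model on a decided group only, then applies those coefficients to an undecided group to obtain probabilities of default. The MNAR variant with an additional feature random vector is not covered here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_pd(beta, x):
    # Probability of default for one applicant; beta[0] is the intercept.
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=500):
    # Batch gradient descent on the average log-loss (illustrative fit only).
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = predict_pd(beta, xi) - yi
            grad[0] += err
            for j in range(p):
                grad[j + 1] += err * xi[j]
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta


def misclassification_rate(y_true, y_pred):
    # Share of cases where the predicted class differs from the true class.
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def simulate(n, rng):
    # Hypothetical data: two standardized features, 1 = default, 0 = non-default.
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1), rng.gauss(0, 1)]
        p = sigmoid(-1.0 + 1.5 * x[0] - 1.0 * x[1])
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y


rng = random.Random(0)
# Under MAR, both groups follow the same model; only the decided labels are used to fit.
X_decided, y_decided = simulate(400, rng)
X_undecided, y_undecided_hidden = simulate(100, rng)

beta = fit_logistic(X_decided, y_decided)
pd_undecided = [predict_pd(beta, x) for x in X_undecided]
pred_undecided = [1 if p >= 0.5 else 0 for p in pd_undecided]  # assumed cutoff

# In a simulation the hidden labels are known, so the rate can be checked.
rate = misclassification_rate(y_undecided_hidden, pred_undecided)
print("coefficients:", [round(b, 3) for b in beta])
print("misclassification rate on undecided group:", rate)
```

Because the undecided outcomes are generated here by the same model as the decided ones, the sketch reflects only the MAR setting; with real data the hidden labels would not be available for this check.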
Keywords
Confusion matrix; logistic model; misclassification rate; missing data; probability of default
Citations & Related Records
Times Cited By KSCI: 5