Reject Inference of Incomplete Data Using a Normal Mixture Model

  • Received : 2011.02
  • Accepted : 2011.03
  • Published : 2011.04.30

Abstract

Reject inference in credit scoring is a statistical approach that adjusts for the nonrandom sample bias due to rejected applicants. Function estimation approaches assume that rejected applicants need not be included in the estimation when the missing data mechanism is missing at random. The density estimation approach based on mixture models, on the other hand, indicates that reject inference should include rejected applicants in the model. When mixture models are chosen for reject inference, the data are often assumed to follow a normal distribution. If the data include missing values, applying the normal mixture model only to fully observed cases may introduce a further sample bias. We extend reject inference based on a multivariate normal mixture model to handle incomplete characteristic variables. A simulation study shows that the extended model, which includes incomplete characteristic variables, outperforms the function estimation approaches.
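
The density estimation idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it assumes complete data, so the extension to incomplete characteristic variables is not shown. It also assumes two normal components (0 = bad, 1 = good), observed labels for accepted applicants, and unlabeled rejected applicants. The function names `gaussian_pdf` and `fit_reject_mixture` are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at each row of x."""
    d = x.shape[1]
    diff = x - mean
    solved = np.linalg.solve(cov, diff.T).T
    quad = np.sum(diff * solved, axis=1)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def fit_reject_mixture(x_acc, y_acc, x_rej, n_iter=50):
    """EM for a two-component normal mixture (0 = bad, 1 = good).

    Accepted applicants keep their observed labels; rejected applicants
    carry soft labels (posterior probabilities) updated at every E-step.
    """
    x_all = np.vstack([x_acc, x_rej])
    n_acc = len(x_acc)
    # Responsibilities: one-hot for accepted, uniform start for rejected.
    resp = np.full((len(x_all), 2), 0.5)
    resp[:n_acc] = np.eye(2)[y_acc]
    for _ in range(n_iter):
        # M-step: weighted mixing proportions, means and covariances.
        weights = resp.sum(axis=0)
        pis = weights / len(x_all)
        means = (resp.T @ x_all) / weights[:, None]
        covs = []
        for k in range(2):
            diff = x_all - means[k]
            covs.append((resp[:, k, None] * diff).T @ diff / weights[k])
        # E-step: update posteriors for the rejected applicants only.
        dens = np.column_stack(
            [pis[k] * gaussian_pdf(x_rej, means[k], covs[k]) for k in range(2)]
        )
        resp[n_acc:] = dens / dens.sum(axis=1, keepdims=True)
    return pis, means, covs, resp[n_acc:]
```

Here the rejected applicants contribute to the parameter estimates through their posterior class probabilities, whereas a function estimation approach would fit a classifier to the accepted applicants alone.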
