Optimization of Classifier Performance at Local Operating Range: A Case Study in Fraud Detection

  • Park Lae-Jeong (Department of Electronics Engineering, Kangnung National University)
  • Moon Jung-Ho (Department of Electronics Engineering, Kangnung National University)
  • Published : 2005.09.01

Abstract

Building classifiers for real-world financial classification problems is often hampered by severely overlapping and highly skewed class distributions. Performance measures such as the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) have recently been introduced for evaluating and building classifiers for these kinds of problems. They are, however, ineffective for evaluating a classifier's discrimination performance in a particular class of classification problems where interest lies only in a local operating range of the classifier. In this paper, a new method is proposed that directly improves a classifier's discrimination performance at a desired local operating range by defining and optimizing a partial area under the ROC curve or a domain-specific curve, which is difficult to achieve with conventional learning methods based on classification accuracy. The effectiveness of the proposed approach is demonstrated in terms of fraud detection capability on a real-world fraud detection problem, in comparison with the MSE-based approach.
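
To make the notion of a partial area under the ROC curve concrete, the following is a minimal sketch of how such a quantity could be measured for a given set of classifier scores. It is an illustration only: the abstract does not give the paper's exact definition of the partial area or its optimization procedure, so the function names (`roc_points`, `partial_auc`), the false-positive-rate bounds `fpr_lo` and `fpr_hi`, and the trapezoidal integration are all assumptions made here, not the authors' method. The sketch also assumes both classes are present in `labels`.

```python
import numpy as np


def roc_points(scores, labels):
    """Empirical ROC curve as (fpr, tpr) arrays, starting at (0, 0).

    Hypothetical helper: labels are 1 for the positive class (e.g. fraud)
    and 0 for the negative class; higher scores mean "more positive".
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="mergesort")  # descending by score
    s, y = scores[order], labels[order]
    tp = np.cumsum(y == 1)  # true positives when thresholding at each rank
    fp = np.cumsum(y == 0)  # false positives when thresholding at each rank
    # Keep one ROC point per distinct score so tied scores form one segment.
    last = np.r_[np.nonzero(np.diff(s))[0], len(s) - 1]
    tpr = np.r_[0.0, tp[last] / tp[-1]]
    fpr = np.r_[0.0, fp[last] / fp[-1]]
    return fpr, tpr


def partial_auc(scores, labels, fpr_lo=0.0, fpr_hi=0.1):
    """Area under the ROC curve restricted to fpr_lo <= FPR <= fpr_hi.

    The result is not normalized, so its maximum is (fpr_hi - fpr_lo).
    """
    fpr, tpr = roc_points(scores, labels)
    area = 0.0
    for x0, y0, x1, y1 in zip(fpr[:-1], tpr[:-1], fpr[1:], tpr[1:]):
        a, b = max(x0, fpr_lo), min(x1, fpr_hi)  # clip segment to the range
        if x1 == x0 or b <= a:
            continue  # vertical segment, or segment outside the range
        slope = (y1 - y0) / (x1 - x0)
        ya = y0 + slope * (a - x0)
        yb = y0 + slope * (b - x0)
        area += 0.5 * (ya + yb) * (b - a)  # trapezoid over [a, b]
    return float(area)
```

For example, `partial_auc(scores, labels, fpr_lo=0.0, fpr_hi=0.05)` would summarize discrimination only in the region where at most 5% of legitimate cases are flagged, which is the kind of local operating range the abstract refers to. Setting `fpr_lo=0.0` and `fpr_hi=1.0` recovers the ordinary AUC.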
