http://dx.doi.org/10.5391/IJFIS.2005.5.3.263

Optimization of Classifier Performance at Local Operating Range: A Case Study in Fraud Detection

Park Lae-Jeong (Department of Electronics Engineering, Kangnung National University)
Moon Jung-Ho (Department of Electronics Engineering, Kangnung National University)

Publication Information

International Journal of Fuzzy Logic and Intelligent Systems / v.5, no.3, 2005, pp. 263-267

Abstract

Building classifiers for real-world financial classification problems is often hampered by severely overlapping and highly skewed class distributions. Performance measures such as the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) have recently been introduced for evaluating and building classifiers for these kinds of problems. They are, however, ineffective for evaluating a classifier's discrimination performance in a particular class of classification problems in which interest lies only in a local operating range of the classifier. In this paper, a new method is proposed that directly improves a classifier's discrimination performance at a desired local operating range by defining and optimizing a partial area under the ROC curve or a domain-specific curve, which is difficult to achieve with conventional learning methods based on classification accuracy. The effectiveness of the proposed approach is demonstrated in terms of fraud detection capability on a real-world fraud detection problem, in comparison with the mean squared error (MSE) based approach.
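
To make the notion of a partial area under the ROC curve concrete, the following is a minimal sketch of how such a quantity might be computed over a local range of false positive rates. It only evaluates the measure and is not the optimization procedure of the paper. The function names `roc_points` and `partial_auc`, the parameters `fpr_lo` and `fpr_hi`, and the default range of 0 to 0.1 are assumptions made for illustration. The sketch also assumes binary labels (1 for fraud, 0 for legitimate) with both classes present.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays of the empirical ROC curve, starting at (0, 0)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Sort by descending score; a stable sort keeps the result deterministic.
    order = np.argsort(-scores, kind="mergesort")
    s, y = scores[order], labels[order]
    tp = np.cumsum(y)        # true positives when thresholding after each item
    fp = np.cumsum(1 - y)    # false positives when thresholding after each item
    # Tied scores share one threshold, so keep only the last index of each group.
    last = np.r_[np.nonzero(np.diff(s))[0], len(s) - 1]
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    tpr = np.r_[0.0, tp[last] / n_pos]
    fpr = np.r_[0.0, fp[last] / n_neg]
    return fpr, tpr

def partial_auc(scores, labels, fpr_lo=0.0, fpr_hi=0.1):
    """Area under the ROC curve restricted to fpr_lo <= FPR <= fpr_hi.

    The range [fpr_lo, fpr_hi] is a hypothetical "local operating range".
    The result is not normalized; its maximum is fpr_hi - fpr_lo.
    """
    fpr, tpr = roc_points(scores, labels)
    area = 0.0
    for f0, t0, f1, t1 in zip(fpr[:-1], tpr[:-1], fpr[1:], tpr[1:]):
        # Clip each ROC segment to the requested FPR range.
        a, b = max(f0, fpr_lo), min(f1, fpr_hi)
        if f1 <= f0 or b <= a:
            continue  # vertical segment or no overlap with the range
        slope = (t1 - t0) / (f1 - f0)
        ta = t0 + slope * (a - f0)
        tb = t0 + slope * (b - f0)
        area += 0.5 * (ta + tb) * (b - a)  # trapezoid over the clipped piece
    return area
```

Calling `partial_auc(scores, labels, 0.0, 1.0)` recovers the ordinary AUC, while a narrow range such as 0 to 0.1 measures discrimination only where few false alarms are tolerated, which is the kind of local operating range the abstract refers to.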

Keywords

Classifier; ROC; AUC; Learning
