A Review of Machine Learning Algorithms for Fraud Detection in Credit Card Transaction

  • Received : 2021.09.05
  • Published : 2021.09.30

Abstract

The number of credit card fraud cases has grown into a considerable problem over the past decades. This growth is driven by the expansion of new technologies, including the increased popularity and volume of online banking transactions and e-commerce. To address credit card fraud, rule-based approaches have been widely used to detect and guard against fraudulent activities. However, such an approach requires substantial computational power, and defining and building the rule base for pattern matching is highly complex if fraud patterns are to be identified precisely. In addition, it lacks the intelligence to predict or analyse transaction data in search of new fraud patterns and strategies. This paper therefore considers Data Mining and Machine Learning algorithms as a means of overcoming these shortcomings. The aim of this paper is to highlight the important techniques and methodologies employed in fraud detection, with a focus on the existing literature. Methods such as Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), naïve Bayesian, k-Nearest Neighbour (k-NN), Decision Tree and Frequent Pattern Mining algorithms are reviewed and evaluated for their performance in detecting fraudulent transactions.
