http://dx.doi.org/10.13089/JKIISC.2018.28.6.1499

Black Consumer Detection in E-Commerce Using Filter Method and Classification Algorithms  

Lee, Taekyu (Institute of Cyber Security & Privacy (ICSP), Korea University)
Lee, Kyung Ho (Institute of Cyber Security & Privacy (ICSP), Korea University)
Abstract
Although fast-growing e-commerce markets have given many companies the opportunity to expand their customer bases, there is also a growing number of cases in which so-called 'black consumers' cause considerable damage to companies. In this study, we implement and optimize a machine learning model that detects black consumers using customer data from an e-commerce store. Using a filter method for feature selection and four different classification algorithms, we obtained a best-performing model that detects black consumers with an F-measure of 0.667, and achieved performance improvements of 11.44% in F-measure, 10.51% in AURC, and 22.87% in TPR.
Keywords
Machine Learning; Supervised Learning; Fraud Detection; User Classification; Feature Selection
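
The abstract names a filter method for feature selection and reports F-measure and TPR, but it does not state which filter criterion was used. The following is only a minimal sketch of the general idea, assuming absolute Pearson correlation with the label as the filter score and using invented toy data. The function names (`pearson`, `filter_select`, `tpr_and_f_measure`) and the feature columns are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(vx * vy)


def filter_select(rows, labels, k):
    """Rank features by |correlation with the label| and keep the top k.

    This is a filter method: features are scored before, and
    independently of, whichever classifier is trained afterwards.
    ASSUMPTION: Pearson correlation is an illustrative score only.
    """
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


def tpr_and_f_measure(y_true, y_pred):
    """Return (TPR, F-measure) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # recall is the TPR
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return recall, f_measure


# Toy data (hypothetical): each row is one customer, 1 = black consumer.
# Column 0 tracks the label, column 1 is constant, column 2 is unrelated.
rows = [[5, 1, 3], [4, 1, 7], [0, 1, 4], [1, 1, 6], [6, 1, 5], [0, 1, 5]]
labels = [1, 1, 0, 0, 1, 0]
selected = filter_select(rows, labels, k=1)

# Invented predictions, used only to exercise the metric function.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
tpr, f_measure = tpr_and_f_measure(y_true, y_pred)
```

Because the score depends only on the data and not on a classifier, the same selected feature subset can be passed to each of several classification algorithms and the resulting F-measure and TPR values compared.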