http://dx.doi.org/10.9708/jksci.2022.27.03.173

A Study on Fraud Detection in the C2C Used Trade Market Using Doc2vec  

Lim, Do Hyun (Graduate School of Business IT, Kookmin University)
Ahn, Hyunchul (Graduate School of Business IT, Kookmin University)
Abstract
In this paper, we propose a machine learning model that detects fraudulent posts before a transaction takes place and explains its predictions using an XAI approach. For the experiment, we collected a real-world data set of 12,258 mobile phone sales posts from Joonggonara, a major Korean online C2C resale platform. Text features of each post body were extracted using Doc2vec and reduced in dimensionality through PCA, and various derived variables were created based on previous research. To mitigate class imbalance in the preprocessing stage, a hybrid sampling method that combines oversampling and undersampling was applied. Several machine learning models were then built to detect fraudulent posts. In the analysis, LightGBM outperformed the other machine learning models. The SHAP analysis showed that a post was highly likely to be fraudulent when the price was unreasonably low compared to the market price and when no transaction area was indicated. A high price, the absence of a safe transaction option, more courier-based transactions, and a higher ratio of zeros in the price were also associated with fraud.
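Two of the preprocessing ideas mentioned in the abstract, price-based derived variables and sampling that combines oversampling with undersampling, can be illustrated with a minimal sketch. The abstract does not specify the exact variable definitions or the sampling algorithm, so everything here is an assumption made for illustration: the function names `zero_ratio`, `price_gap`, and `hybrid_resample` are hypothetical, the market price is taken as a given reference value, and plain random resampling stands in for whatever method the authors actually used.

```python
import random
from collections import Counter


def zero_ratio(price: int) -> float:
    """Share of '0' digits in a listed price (assumed definition)."""
    digits = str(abs(int(price)))
    return digits.count("0") / len(digits)


def price_gap(price: float, market_price: float) -> float:
    """Relative gap to a reference market price; negative means cheaper."""
    return (price - market_price) / market_price


def hybrid_resample(rows, labels, target_per_class, seed=0):
    """Bring every class to target_per_class samples.

    Classes larger than the target are randomly undersampled (without
    replacement); smaller classes are randomly oversampled (with
    replacement). This is plain random resampling, used here only as
    a stand-in for the paper's unspecified hybrid method.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            chosen = rng.sample(members, target_per_class)
        else:
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


# Toy data (invented for illustration): 8 normal posts (0), 2 fraud posts (1).
prices = [480000, 510000, 495000, 520000, 470000, 505000, 515000, 490000,
          100000, 150000]
labels = [0] * 8 + [1] * 2
market_price = 500000  # hypothetical reference price

features = [(price_gap(p, market_price), zero_ratio(p)) for p in prices]
balanced_rows, balanced_labels = hybrid_resample(features, labels, 5, seed=42)
class_counts = Counter(balanced_labels)
```

After resampling, `class_counts` holds the same number of samples for both classes, and the balanced feature rows could then be passed to any classifier. In practice, resampling would be applied to the training split only, so that evaluation data keeps its original class distribution.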
Keywords
Fraud Detection; Online C2C Resale Market; Doc2vec; LightGBM; SHAP