http://dx.doi.org/10.9708/jksci.2021.26.04.093

A Classification Model for Illegal Debt Collection Using Rule and Machine Learning Based Methods  

Kim, Tae-Ho (Graduate School of Information Security, Korea University)
Lim, Jong-In (Graduate School of Information Security, Korea University)
Abstract
Despite the efforts of financial authorities to directly manage and supervise collection agents and to issue debt collection guidelines, illegal and unfair debt collection still exists. To prevent such activities effectively, a method is needed that strengthens the monitoring of illegal collection activities even with limited manpower, using technologies such as machine learning on unstructured data. In this study, we propose a classification model for illegal debt collection that combines machine learning, such as Support Vector Machine (SVM), with a rule-based technique. The model obtains the collection transcripts of loan companies and converts them into text data to identify illegal activities. The study also compares identification accuracy across machine learning algorithms. The results show that combining rule-based illegality detection rules with machine learning for classification yields higher accuracy than the classification model of the previous study, which applied only machine learning. This study is the first attempt to classify illegalities by combining rule-based illegality detection rules with machine learning. If further research is conducted to improve the model's completeness, it will contribute greatly to preventing consumer damage from illegal debt collection activities.
Keywords
Illegal Debt Collection; Machine Learning; Classification Machine Learning; Text Mining; Document Classification; Support Vector Machine
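
The combination of rule-based detection and an SVM described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule patterns, the toy transcripts, and the way the two parts are combined (a rule match flags the transcript directly, otherwise a linear SVM decides) are all assumptions made for illustration. The SVM here is a small linear one trained by subgradient descent on the hinge loss over bag-of-words counts, and the names `RULES`, `TOY_DATA`, `train_linear_svm`, and `classify` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical rule patterns; the paper's actual detection rules are not listed here.
RULES = [
    re.compile(r"\btell your (family|employer|boss)\b"),
    re.compile(r"\b(come|show up) to your (house|workplace)\b"),
]

# Hypothetical toy transcripts: +1 = illegal, -1 = legal.
TOY_DATA = [
    ("pay now or you will regret it", 1),
    ("we will keep calling you every night", 1),
    ("your payment is due on friday", -1),
    ("please call us to arrange a payment plan", -1),
]


def rule_hit(text):
    """Return True if any hand-written rule matches the transcript."""
    lowered = text.lower()
    return any(pattern.search(lowered) for pattern in RULES)


def featurize(text):
    """Turn a transcript into bag-of-words term counts."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def train_linear_svm(samples, epochs=50, lam=0.01, lr=0.1):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss."""
    weights, bias = {}, 0.0
    data = [(featurize(text), label) for text, label in samples]
    for _ in range(epochs):
        for feats, label in data:
            score = sum(weights.get(k, 0.0) * v for k, v in feats.items()) + bias
            # Regularization step: shrink every weight slightly.
            for k in weights:
                weights[k] *= 1.0 - lr * lam
            # Hinge-loss step: update only when the margin is violated.
            if label * score < 1.0:
                for k, v in feats.items():
                    weights[k] = weights.get(k, 0.0) + lr * label * v
                bias += lr * label
    return weights, bias


def svm_score(model, text):
    """Signed distance-like score; positive means the 'illegal' side."""
    weights, bias = model
    return sum(weights.get(k, 0.0) * v for k, v in featurize(text).items()) + bias


def classify(model, text):
    """Assumed combination: rules take priority, the SVM handles the rest."""
    if rule_hit(text):
        return "illegal", "rule"
    label = "illegal" if svm_score(model, text) > 0 else "legal"
    return label, "svm"


model = train_linear_svm(TOY_DATA)
print(classify(model, "If you do not pay, I will tell your employer"))
print(classify(model, "your payment is due on friday"))
```

Letting a rule match decide first is only one possible design. Rule hits could instead be fed to the classifier as extra features, and the abstract does not say which arrangement the study uses.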