http://dx.doi.org/10.9717/kmms.2019.22.6.695

Realtime Word Filtering System against Variations of Censored Words in Korean  

Kim, ChanWoo (Department of Computer Science & Engineering, Graduate School, Incheon National University)
Sung, Mee Young (Department of Computer Science & Engineering, Incheon National University)
Abstract
Victims of cyberbullying suffer serious psychological damage from verbal abuse. This paper introduces a system that determines, in real time, the level of sanctions to apply to a chat message, using automatic prohibited-word filtering based on an artificial neural network. We propose a keyword filtering method that detects modified forms of prohibited words and decides in real time whether the corresponding chat message should be sanctioned, together with a real-time chat screening system built on that method. Filtering accuracy through machine learning was improved by preprocessing the data with a coding technique that places consonants and vowels of similar pronunciation at close distances. In a comparison of Mahalanobis-based clustering algorithms and artificial neural network-based algorithms, the algorithms that use artificial neural networks showed high performance. Applied to Internet chat, comments, or online games, the method is expected to filter more effectively than existing filtering methods and to ease the communication inconvenience caused by their indiscriminate filtering.
Keywords
Filtering Vulgar Words; Online Chat Censorship; Korean Encoding; Natural Language Processing; Artificial Neural Network;
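The coding idea mentioned in the abstract, in which consonants and vowels of similar pronunciation are placed at close distances, can be illustrated with a minimal sketch. The decomposition of a Hangul syllable into initial, medial, and final jamo uses the standard Unicode arithmetic. Everything else is an assumption made for illustration: the ordering in `INITIAL_ORDER`, the function names, and the simple absolute-difference distance are hypothetical and are not the coding table or metric used in the paper.

```python
# Standard Unicode jamo tables for precomposed Hangul syllables (U+AC00..U+D7A3).
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 medial vowels
JONGSEONG = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
             "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
             "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]  # 28 finals, "" = none

# HYPOTHETICAL ordering (not from the paper): plain, aspirated, and tense
# variants of the same consonant sit next to each other, so they get close codes.
INITIAL_ORDER = "ㄱㅋㄲㄴㄷㅌㄸㄹㅁㅂㅍㅃㅅㅆㅈㅊㅉㅇㅎ"


def decompose(syllable: str) -> tuple[str, str, str]:
    """Split one precomposed Hangul syllable into (initial, medial, final)."""
    index = ord(syllable) - 0xAC00
    if not 0 <= index < 19 * 21 * 28:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    initial, rest = divmod(index, 21 * 28)
    medial, final = divmod(rest, 28)
    return CHOSEONG[initial], JUNGSEONG[medial], JONGSEONG[final]


def encode(syllable: str) -> tuple[int, int, int]:
    """Map a syllable to numeric codes; only the initial uses the hypothetical order."""
    initial, medial, final = decompose(syllable)
    return (INITIAL_ORDER.index(initial),
            JUNGSEONG.index(medial),
            JONGSEONG.index(final))


def distance(a: str, b: str) -> int:
    """Sum of absolute code differences between two equal-length words."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares words of equal length")
    return sum(abs(x - y)
               for sa, sb in zip(a, b)
               for x, y in zip(encode(sa), encode(sb)))
```

With such a coding, a variant that swaps a consonant for a similar-sounding one (for example 시 and 씨) lies closer to the original than a syllable with an unrelated consonant (시 and 미), which is the property a downstream clustering or neural network classifier could exploit.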