http://dx.doi.org/10.7472/jksii.2020.21.1.79

Knowledge Graph-based Korean New Words Detection Mechanism for Spam Filtering  

Kim, Ji-hye (Dept. of Software, Gachon University)
Jeong, Ok-ran (Dept. of Software, Gachon University)
Publication Information
Journal of Internet Computing and Services / v.21, no.1, 2020, pp. 79-85
Abstract
Today, spam texts on smartphones are blocked either by simple string comparison between text messages and spam keywords or by blocking spam phone numbers. As a result, spam texts are sent in gradually changing forms to prevent them from being blocked automatically. In particular, words that appear in spam keyword lists are written in abnormal forms using special characters, Chinese characters, and whitespace so that simple string matching does not detect them. Traditional spam filtering methods cannot block such spam texts well. Therefore, new technologies are needed to respond to changing spam text messages. In this paper, we propose a knowledge graph-based new words detection mechanism that can detect new words frequently used in spam texts and respond to changing spam texts. We also show experimental results on performance when the detected Korean new words are applied to the Naive Bayes algorithm.
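The limitation of simple string matching described in the abstract can be illustrated with a minimal sketch. It is not the mechanism proposed in the paper; the keyword list, the sample messages, and the function name `is_spam_by_string_match` are all hypothetical and chosen only for illustration.

```python
# Hypothetical spam keyword list: "대출" means "loan".
SPAM_KEYWORDS = ["대출", "casino"]


def is_spam_by_string_match(message, keywords=SPAM_KEYWORDS):
    """Flag a message if any keyword occurs in it as an exact substring."""
    return any(keyword in message for keyword in keywords)


# Hypothetical sample messages.
plain = "무료 대출 상담"                 # "free loan consultation"
spaced = "무료 대 출 상담"               # same keyword split by whitespace
punctuated = "c.a.s.i.n.o bonus"         # keyword split by special characters

print(is_spam_by_string_match(plain))       # True: exact keyword present
print(is_spam_by_string_match(spaced))      # False: whitespace breaks the match
print(is_spam_by_string_match(punctuated))  # False: dots break the match
```

Only the unmodified message is caught. The two altered variants pass the filter even though a human reader still recognizes the keyword, which is the gap the abstract says new-word detection is meant to address.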
Keywords
Spam Filtering; Spam Detection; New Words Detection; Knowledge Graph; ConceptNet;