http://dx.doi.org/10.9708/jksci.2021.26.10.157

Deep Learning-based Target Masking Scheme for Understanding Meaning of Newly Coined Words  

Nam, Gun-Min (Graduate School of Business IT, Kookmin University)
Kim, Namgyu (Graduate School of Business IT, Kookmin University)
Abstract
Recently, studies that use deep learning to analyze large amounts of text have been actively conducted. In particular, pre-trained language models, which apply the results of learning from a large text corpus to the analysis of text in a specific domain, are attracting attention. Among the various pre-trained language models, BERT (Bidirectional Encoder Representations from Transformers)-based models are the most widely used. Recently, research has been conducted to improve analysis performance through further pre-training with BERT's MLM (Masked Language Model). However, the traditional MLM has difficulty in clearly understanding the meaning of sentences containing new words such as newly coined words. Therefore, in this study, we propose NTM (Newly coined words Target Masking), which performs masking only on new words. Applying the proposed methodology to about 700,000 movie reviews from portal 'N' confirmed that NTM outperforms conventional random masking in terms of sentiment analysis accuracy.
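To make the contrast between conventional random masking and the proposed target masking concrete, the following minimal Python sketch compares the two strategies on a pre-tokenized review. The lexicon of newly coined words, the example tokens, and the function names are illustrative assumptions; the actual study performs further pre-training of a BERT-style MLM (e.g., KoBERT) on the masked reviews, which this sketch does not reproduce.

import random

# Hypothetical lexicon of newly coined words (illustrative only).
NEW_WORD_LEXICON = {"꿀잼", "노잼", "갓띵작"}

MASK_TOKEN = "[MASK]"

def random_masking(tokens, mask_prob=0.15):
    # Conventional MLM: each token is masked independently with probability mask_prob.
    return [MASK_TOKEN if random.random() < mask_prob else t for t in tokens]

def target_masking(tokens, lexicon=NEW_WORD_LEXICON):
    # NTM-style masking: mask only tokens found in the new-word lexicon, so further
    # pre-training focuses on predicting newly coined words from their context.
    return [MASK_TOKEN if t in lexicon else t for t in tokens]

if __name__ == "__main__":
    # Toy review already split into tokens; a real pipeline would use a subword
    # tokenizer and align lexicon entries to subword units before masking.
    tokens = ["이", "영화", "진짜", "꿀잼", "이다"]
    print(random_masking(tokens))   # may mask any token at random
    print(target_masking(tokens))   # masks only the newly coined word '꿀잼'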
Keywords
Target Masking; Deep Learning; BERT; Newly Coined Words; Sentiment Analysis;