http://dx.doi.org/10.7583/JKGS.2019.19.5.103

A Study on Improved Comments Generation Using Transformer  

Seong, So-yun (Dept. of Game and Multimedia Engineering, Korea Polytechnic University)
Choi, Jae-yong (Dept. of Game and Multimedia Engineering, Korea Polytechnic University)
Kim, Kyoung-chul (Dept. of Game and Multimedia Engineering, Korea Polytechnic University)
Abstract
We have been studying a deep-learning program that can communicate with other users in online communities since 2017. However, the characteristics of the Korean language made it difficult to process our Korean data set, and the RNN-based models we used made poor use of the GPU. In this study, taking advantage of recent improvements in natural language processing models, we aim to produce better results with these newer models. To achieve this, we use a Transformer model built around the self-attention mechanism, and we use MeCab, a Korean morphological analyzer, to address the difficulty of processing Korean words.
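As a rough illustration of the self-attention mechanism the abstract refers to (a minimal sketch of scaled dot-product attention, not the authors' actual TensorFlow implementation; the function name, toy dimensions, and random projections below are illustrative assumptions):

```python
# Minimal sketch of scaled dot-product self-attention (assumed form of the
# mechanism used in the Transformer), shown here for illustration only.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # pairwise token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # attention-weighted sum of values

# Toy example: 4 tokens with 8-dimensional representations.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
# In self-attention, Q, K and V are all linear projections of the same sequence.
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (4, 8)
```

Because every token attends to every other token in a single matrix product, this computation parallelizes well on a GPU, unlike the step-by-step recurrence of an RNN.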
Keywords
Deep Learning; Natural Language Processing; Self-Attention; Transformer