http://dx.doi.org/10.7840/kics.2016.41.5.592

LSTM Language Model Based Korean Sentence Generation  

Kim, Yang-hoon (Automation and Systems Research Institute (ASRI), Department of Electrical and Computer Engineering, Seoul National University)
Hwang, Yong-keun (Department of Electrical and Computer Engineering, Seoul National University)
Kang, Tae-gwan (Department of Electrical and Computer Engineering, Seoul National University)
Jung, Kyo-min (Automation and Systems Research Institute (ASRI), Department of Electrical and Computer Engineering, Seoul National University)
Abstract
The recurrent neural network (RNN) is a deep learning model suited to sequential or variable-length data. The long short-term memory (LSTM) architecture mitigates the vanishing gradient problem of RNNs, allowing the model to maintain long-term dependencies among the constituents of an input sequence. In this paper, we propose an LSTM-based language model that predicts the following words of a given incomplete sentence in order to generate a complete sentence. To evaluate our method, we trained the model on multiple Korean corpora and then generated the missing parts of incomplete Korean sentences. The results show that our language model is able to generate fluent Korean sentences. We also show that the word-based model generated better sentences than the other settings.
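The abstract's pipeline (encode an incomplete sentence with an LSTM, then repeatedly predict the next word) can be illustrated with a minimal numpy sketch. This is not the authors' implementation (which was built on Theano); the gate layout, toy dimensions, random weights, and the names `lstm_step` / `complete_sentence` are illustrative assumptions, and a real model would be trained on a corpus first.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. The forget/input gates let the cell state carry
    information across long spans, mitigating the vanishing gradient problem."""
    H = h_prev.size
    z = W @ np.concatenate([x, h_prev]) + b          # all four gate pre-activations
    sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
    i = sigmoid(z[0*H:1*H])                          # input gate
    f = sigmoid(z[1*H:2*H])                          # forget gate
    o = sigmoid(z[2*H:3*H])                          # output gate
    g = np.tanh(z[3*H:4*H])                          # candidate cell update
    c = f * c_prev + i * g                           # keep old memory, add new
    h = o * np.tanh(c)
    return h, c

def complete_sentence(prefix_ids, emb, W, b, W_out, n_words):
    """Feed the incomplete sentence (word ids) through the LSTM, then
    greedily append the most probable next word n_words times."""
    H = b.size // 4
    h, c = np.zeros(H), np.zeros(H)
    for t in prefix_ids:                             # encode the given prefix
        h, c = lstm_step(emb[t], h, c, W, b)
    out = list(prefix_ids)
    for _ in range(n_words):
        nxt = int(np.argmax(W_out @ h))              # greedy choice (sampling also works)
        out.append(nxt)
        h, c = lstm_step(emb[nxt], h, c, W, b)
    return out

# Toy, randomly initialized model (a real one would be trained on Korean corpora).
rng = np.random.default_rng(0)
V, E, H = 6, 4, 8                                    # vocab / embedding / hidden sizes
emb = 0.1 * rng.standard_normal((V, E))
W = 0.1 * rng.standard_normal((4 * H, E + H))
b = np.zeros(4 * H)
W_out = 0.1 * rng.standard_normal((V, H))

completed = complete_sentence([1, 2], emb, W, b, W_out, n_words=3)
```

The same loop works at the character, morpheme, or word level depending on what the vocabulary ids index; the paper's comparison of these settings found the word-based variant best.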
Keywords
LSTM; Recurrent Neural Networks; Language Model; Sentence Generation
Citations & Related Records
Times Cited By KSCI: 3
1 S. Bird, E. Klein, and E. Loper, Natural Language Processing with Python, O'Reilly Media Inc., Jun. 2009.
2 S. H. Gil and G. H. Kim, "Vision-based vehicle detection and tracking using online learning," J. KICS, vol. 39A, no. 1, pp. 1-11, 2014.   DOI
3 J. H. Moon, et al., "Case study of big data-based agri-food recommendation system according to types of customers," J. KICS, vol. 40, no. 5, pp. 903-913, 2015.   DOI
4 O. Russakovsky, et al., "ImageNet large scale visual recognition challenge," Int. J. Comput. Vis., vol. 115, no. 3, pp. 211-252, Dec. 2015.   DOI
5 S. Kumar, et al., "Localization estimation using artificial intelligence technique in wireless sensor networks," J. KICS, vol. 39C, no. 9, pp. 820-827, 2014.   DOI
6 Y. Bengio, et al., "Learning long-term dependencies with gradient descent is difficult," IEEE Trans. Neural Netw., vol. 5, no. 2, pp. 157-166, 1994.   DOI
7 S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.   DOI
8 H. Sak, et al., "Fast and accurate recurrent neural network acoustic models for speech recognition," in Proc. INTERSPEECH, Dresden, Germany, Sept. 2015.
9 A. Graves, et al., "A novel connectionist system for unconstrained handwriting recognition," IEEE Trans. PAMI, vol. 31 no. 5, pp. 855-868, May 2009.   DOI
10 A. Grushin, et al., "Robust human action recognition via long short-term memory," in IJCNN, pp. 1-8, Dallas, United States, Aug. 2013.
11 S. Shin, et al., "Image classification with recurrent neural networks using input replication," J. KICS, vol. 2015, no. 06, pp. 868-869, 2015.
12 P. Koehn, et al., "Moses: Open source toolkit for statistical machine translation," in ACL, pp. 177-180, Prague, Czech Republic, Jun. 2007.
13 T. Mikolov, et al., "Extensions of recurrent neural network language model," in ICASSP, pp. 5528-5531, Prague, Czech Republic, May 2011.
14 T. Mikolov, "Statistical language models based on neural network," Ph.D. Dissertation, Brno University of Technology, 2012.
15 I. Sutskever, et al., "Generating text with recurrent neural networks," in ICML, Bellevue, United States, Jun. 2011.
16 T. Mikolov, et al., Subword language modeling with neural networks, preprint (http://www.fit.vutbr.cz/imikolov/rnnlm/char.pdf), 2012.
17 Wikipedia, Recurrent neural network, Retrieved 3rd, Dec. 2015, https://en.wikipedia.org/wiki/Recurrent_neural_network.
18 M. C. Mozer, A focused backpropagation algorithm for temporal pattern recognition, L. Erlbaum Associates Inc., pp. 137-169, 1995.
19 S. Kirkpatrick, et al., "Optimization by simulated annealing," Science, vol. 220, no. 4598, pp. 671-680, 1983.
20 Korean Bible Society, Bible, Retrieved 7th, Dec. 2015, http://www.bskorea.or.kr.
21 jungyeul, korean-parallel-corpora (2014), Retrieved 22nd, Oct. 2015, https://github.com/jungyeul/korean-parallel-corpora/tree/master/korean-english-v1.
22 Twitter, twitter-korean-text, Retrieved 10th, Nov. 2015, https://github.com/twitter/twitter-korean-text.
23 F. Bastien, et al., "Theano: new features and speed improvements," in NIPS deep learning workshop, Lake Tahoe, United States, Dec. 2012.
24 J. Bergstra, et al., "Theano: A CPU and GPU math expression compiler," in Proc. SciPy, 2010.