
Application of Improved Variational Recurrent Auto-Encoder for Korean Sentence Generation

Improvement and Application of Variational Recurrent Auto-Encoder for Korean Sentence Generation (Korean title)

  • Sangchul Han (Dept. of Information and Communication Engineering, Handong Global University) ;
  • Seokjin Hong (SK Telecom AI Division) ;
  • Heeyoul Choi (School of Computer Science and Electrical Engineering, Handong Global University)
  • Received: 2017.11.03
  • Accepted: 2017.12.18
  • Published: 2018.02.15

Abstract

Due to revolutionary advances in deep learning, the performance of pattern recognition has improved significantly in many applications such as speech recognition and image recognition, and some systems even surpass human-level performance in specific domains. Unlike pattern recognition, in this paper we focus on generating Korean sentences from a few given Korean sentences. We apply a variational recurrent auto-encoder (VRAE) and modify the model to account for several characteristics of Korean. To reduce the vocabulary size of the model, we apply a word-spacing model. Also, many Korean sentences have the same meaning but different word order, often even without subjects or objects; we therefore replace the unidirectional encoder of the VRAE with a bidirectional one. In addition, we apply an interpolation method to the encoded vectors of the given sentences, so that we can generate new sentences that are similar to the given ones. In experiments, we confirm that the proposed method generates better sentences, which are semantically more similar to the given sentences.
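As a concrete illustration of the modified architecture, below is a minimal sketch of a VRAE whose encoder is bidirectional, written in PyTorch. The class name and layer sizes are illustrative assumptions, not values from the paper.

# Minimal sketch of a VRAE with a bidirectional encoder (PyTorch).
# All hyperparameters here are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class BiVRAE(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hid_dim=256, lat_dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Bidirectional encoder: reads the sentence in both directions,
        # which suits Korean's free word order and dropped subjects/objects.
        self.encoder = nn.LSTM(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        self.to_mu = nn.Linear(2 * hid_dim, lat_dim)      # mean of q(z|x)
        self.to_logvar = nn.Linear(2 * hid_dim, lat_dim)  # log-variance of q(z|x)
        self.lat_to_hid = nn.Linear(lat_dim, hid_dim)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def encode(self, tokens):
        _, (h, _) = self.encoder(self.emb(tokens))
        h = torch.cat([h[0], h[1]], dim=-1)  # concat forward/backward final states
        return self.to_mu(h), self.to_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, the standard VAE reparameterization trick
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def decode(self, z, tokens):
        # Initialize the decoder state from the latent vector z.
        h0 = torch.tanh(self.lat_to_hid(z)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        out, _ = self.decoder(self.emb(tokens), (h0, c0))
        return self.out(out)  # per-step vocabulary logits

    def forward(self, tokens):
        mu, logvar = self.encode(tokens)
        z = self.reparameterize(mu, logvar)
        return self.decode(z, tokens), mu, logvar

Concatenating the final forward and backward encoder states gives the posterior network a view of the whole sentence regardless of word order, which is the motivation for making the encoder bidirectional.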

The rapid progress of deep learning has revolutionized performance in pattern recognition, and on some problems the results now exceed human level. Unlike pattern recognition, which classifies data, this paper deals with the problem of generating similar sentences from a few given Korean sentences. To this end, we discuss how to improve a model based on the Variational Auto-Encoder, one of the generative models, and apply it to Korean sentence generation. First, because Korean is an agglutinative language, the number of distinct words becomes too large when generation is based on spacing units, so particles and endings need to be separated to reduce it. Second, because Korean word order is relatively free and subjects and objects are often omitted, we extend the existing unidirectional encoder to a bidirectional one. Finally, to generate new but similar sentences based on the given sentences, we find a new vector from the encoded vector representations of the existing sentences and decode this vector to generate a sentence. The experimental results confirm the performance of the proposed method.
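The interpolation step described above can be sketched as follows, again as an assumption-laden illustration rather than the paper's exact procedure: two given sentences are encoded, a point between their latent means is taken, and that point is decoded greedily into a new sentence. The model object and the bos_id/eos_id token ids refer to the hypothetical BiVRAE sketch above.

# Hypothetical illustration of latent interpolation between two sentences.
import torch

@torch.no_grad()
def interpolate_and_generate(model, tokens_a, tokens_b, alpha=0.5,
                             max_len=30, bos_id=1, eos_id=2):
    mu_a, _ = model.encode(tokens_a)        # posterior mean of sentence A
    mu_b, _ = model.encode(tokens_b)        # posterior mean of sentence B
    z = (1 - alpha) * mu_a + alpha * mu_b   # linear interpolation in latent space
    # Greedy decoding, one token at a time, conditioned on z.
    generated = [bos_id]
    for _ in range(max_len):
        inp = torch.tensor([generated])
        logits = model.decode(z, inp)
        next_id = int(logits[0, -1].argmax())
        if next_id == eos_id:
            break
        generated.append(next_id)
    return generated[1:]  # generated token ids, without the BOS marker

Sliding alpha between 0 and 1 traces a path in the latent space between the two given sentences, so intermediate points decode into sentences that mix properties of both, which is how new but similar sentences are obtained.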


Acknowledgement

Supported by: National Research Foundation of Korea (NRF)
