Deep Neural Architecture for Recovering Dropped Pronouns in Korean

  • Received : 2017.04.06
  • Accepted : 2018.01.08
  • Published : 2018.04.01

Abstract

Pronouns are frequently dropped in Korean sentences, especially in text messages in the mobile phone environment. Restoring dropped pronouns can be a beneficial preprocessing task for machine translation, information extraction, spoken dialog systems, and many other applications. In this work, we address the problem of dropped pronoun recovery by resolving two simultaneous subtasks: detecting zero-pronoun sentences and determining the type of dropped pronoun. Both subtasks are modeled statistically by encoding the sentence with a recurrent neural network (RNN) and classifying the type of dropped pronoun. Various RNN-based encoding architectures were investigated, and the stacked RNN proved to be the best model for Korean zero-pronoun recovery. The proposed method requires no manually engineered features, yet it achieves good performance.
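The abstract describes the model only at a high level: encode the sentence with a stacked RNN, then classify the dropped-pronoun type from the encoding. The paper's exact architecture (cell type, layer sizes, vocabulary) is not given here, so the following is a minimal NumPy sketch under assumptions: plain tanh RNN layers stand in for the GRU/LSTM cells, all dimensions are invented, and the two subtasks are folded into one classifier by reserving class 0 for "no dropped pronoun", consistent with the joint formulation the abstract suggests.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_layer(inputs, W_x, W_h, b):
    """Run one tanh RNN layer over a sequence; return all hidden states."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return states

def stacked_rnn_classify(embeddings, layers, W_out, b_out):
    """Encode word embeddings through stacked RNN layers, then apply a
    softmax classifier to the final hidden state. Class 0 is assumed to
    mean "no dropped pronoun"; classes 1..K-1 are pronoun types."""
    seq = embeddings
    for W_x, W_h, b in layers:        # each layer consumes the previous layer's states
        seq = rnn_layer(seq, W_x, W_h, b)
    logits = W_out @ seq[-1] + b_out
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

# Hypothetical sizes: 16-dim embeddings, 32-dim hidden states, 8 output classes.
emb_dim, hid, n_types = 16, 32, 8
layers = [
    (rng.normal(scale=0.1, size=(hid, emb_dim)),
     rng.normal(scale=0.1, size=(hid, hid)), np.zeros(hid)),
    (rng.normal(scale=0.1, size=(hid, hid)),
     rng.normal(scale=0.1, size=(hid, hid)), np.zeros(hid)),
]
W_out = rng.normal(scale=0.1, size=(n_types, hid))
b_out = np.zeros(n_types)

# Five random vectors stand in for the word embeddings of one sentence.
sentence = [rng.normal(size=emb_dim) for _ in range(5)]
probs = stacked_rnn_classify(sentence, layers, W_out, b_out)
print(probs.shape, float(probs.sum()))
```

In training (not shown), the weights would be learned from annotated zero-pronoun data; the sketch only illustrates the forward pass: sentence encoding by stacked recurrence, then a single softmax over pronoun types.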

References

  1. N.-R. Han, "Korean Zero Pronouns: Analysis and Resolution," Ph.D. Dissertation, Univ. Pennsylvania, Philadelphia, USA, 2006.
  2. S. Zhao and H.T. Ng, "Identification and Resolution of Chinese Zero Pronouns: A Machine Learning Approach," Proc. Conf. EMNLP-CoNLL, Prague, Czech Rep., June 28-30, 2007, pp. 541-550.
  3. W. Yang, R. Dai, and X. Cui, "Zero Pronoun Resolution in Chinese Using Machine Learning Plus Shallow Parsing," Proc. Int. Conf. Inform. Autom., Changsha, China, June 20-23, 2008, pp. 905-910.
  4. C. Giannella, R. Winder, and S. Petersen, "Dropped Pronoun Recovery in Chinese SMS," Technical Paper, 2015.
  5. Y. Yang, Y. Liu, and N. Xu, "Recovering Dropped Pronouns from Chinese Text Messages," Proc. Annu. Meet. Assoc. Comput. Linguistics Int. Conf. Natural Language Process., Beijing, China, July 26-31, 2015, pp. 309-313.
  6. S.J. Lee, C. Lee, and M. Jang, "Restoring an Elided Title for Encyclopedia QA System," Proc. Korean Inform. Sci. Soc. Conf., vol. 32, no. 2, Nov. 2005, pp. 541-543.
  7. M.-K. Hwang, Y. Kim, D. Ra, S. Lim, and H. Kim, "Restoring Omitted Sentence Constituents in Encyclopedia Documents Using Structural SVM," J. Intell. Inform. Syst., vol. 21, no. 2, 2015, pp. 131-150. https://doi.org/10.13088/jiis.2015.21.2.131
  8. S.P. Converse, "Pronominal Anaphora Resolution in Chinese," Ph.D. Dissertation, Univ. Pennsylvania, Philadelphia, USA, 2006.
  9. P. Jing and K. Araki, "Zero-Anaphora Resolution in Chinese Using Maximum Entropy," IEICE Trans. Inform. Syst., vol. 90, no. 7, 2007, pp. 1092-1102.
  10. F. Kong and G. Zhou, "A Tree Kernel-Based Unified Framework for Chinese Zero Anaphora Resolution," Proc. Conf. Empirical Methods Natural Language Process., Cambridge, MA, USA, Oct. 9-11, 2010, pp. 882-891.
  11. Y. Yang and N. Xue, "Chasing the Ghost: Recovering Empty Categories in the Chinese Treebank," Proc. Int. Conf. Comput. Linguistics: Posters, Beijing, China, Aug. 23-27, 2010, pp. 1382-1390.
  12. T. Chung and D. Gildea, "Effects of Empty Categories on Machine Translation," Proc. Conf. Empirical Methods Natural Language Process., Cambridge, MA, USA, Oct. 9-11, 2010, pp. 636-645.
  13. S. Cai, D. Chiang, and Y. Goldberg, "Language-Independent Parsing with Empty Elements," Proc. Annu. Meet. Assoc. Comput. Linguistics: Human Language Technol., Portland, OR, USA, June 19-24, 2011, pp. 212-216.
  14. C. Chen and V. Ng, "Chinese Zero Pronoun Resolution with Deep Neural Networks," Proc. Annu. Meet. Assoc. Comput. Linguistics, Berlin, Germany, Aug. 7-12, 2016, pp. 778-788.
  15. L. Wang, X. Zhang, Z. Tu, H. Li, and Q. Liu, "Dropped Pronoun Generation for Dialogue Machine Translation," IEEE Int. Conf. Acoust., Speech Signal Process., Shanghai, China, Mar. 20-25, 2016, pp. 6110-6114.
  16. A. Park, S. Lim, and M. Hong, "Zero Object Resolution in Korean," Proc. Pacific Asia Conf. Language Inform. Comput., Shanghai, China, Oct. 30-Nov. 1, 2015, pp. 439-448.
  17. T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient Estimation of Word Representations in Vector Space," arXiv Prepr., arXiv:1301.3781, 2013.
  18. K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation," arXiv Prepr., arXiv:1406.1078, 2014.
  19. S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Comput., vol. 9, no. 8, 1997, pp. 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
  20. Y. Bengio, "Learning Deep Architectures for AI," Found. Trends Mach. Learn., vol. 2, no. 1, Jan. 2009, pp. 1-127. https://doi.org/10.1561/2200000006
  21. J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling," arXiv Prepr., arXiv:1412.3555, 2014.
  22. M. Hermans and B. Schrauwen, "Training and Analyzing Deep Recurrent Neural Networks," Proc. Int. Conf. Neural Inform. Process. Syst., Lake Tahoe, NV, USA, Dec. 5-10, 2013, pp. 190-198.
  23. H. Sak, A.W. Senior, and F. Beaufays, "Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling," Annu. Conf. Int. Speech Commun. Assoc., Singapore, Sept. 14-18, 2014, pp. 338-342.

Cited by

  1. Deep learning reservoir porosity prediction based on multilayer long short-term memory network vol.85, pp.4, 2018, https://doi.org/10.1190/geo2019-0261.1
  2. Zero-anaphora resolution in Korean based on deep language representation model: BERT vol.43, pp.2, 2018, https://doi.org/10.4218/etrij.2019-0441