Hypernetwork-based Natural Language Sentence Generation by Word Relation Pattern Learning

  • Ho-Sik Seok (School of Computer Science and Engineering, Seoul National University) ;
  • Jakramate Bootkrajang (School of Computer Science and Engineering, Seoul National University) ;
  • Byoung-Tak Zhang (School of Computer Science and Engineering, Seoul National University)
  • Received : 2009.11.23
  • Accepted : 2010.01.05
  • Published : 2010.03.15

Abstract

We introduce a natural language sentence generation (NLG) method based on learning word-association patterns. Existing NLG methods assume inherent grammar rules or rely on templates. In contrast, the presented method learns word-association patterns using only the co-occurrence of words, without additional information such as tagging. We employ the hypernetwork method to analyze and represent the word-association patterns. As training proceeds, the model complexity increases. After each training phase, natural language sentences are generated from the learned hyperedges, and the number of grammatically plausible sentences increases with each phase. By comparing the diversity of grammar rules in the training corpora with that of the generated sentences, we confirm that the proposed method has the potential to learn the grammatical properties of the training corpora.

In this paper, we introduce a method that learns word-relation patterns and then generates natural language sentences based on them. Existing sentence generation methods assume the existence of underlying grammar rules or rely on templates; by contrast, the method introduced here learns word-relation patterns using only word co-occurrence frequencies, without additional information such as tagging. The word-relation patterns are learned with the hypernetwork methodology. As learning proceeds, the complexity of the hypernetwork grows and the number of language-relation patterns accumulated in the model increases. The validity of the learned model was confirmed by generating natural language sentences from the learned patterns. Experimental results show that the proportion of grammatically well-formed sentences improved as learning progressed. After analyzing the grammar rules of the generated sentences with a parser and comparing their distribution with that of the training corpus, we confirmed that the method has the potential to learn the grammatical properties of the training corpus.
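
The paper does not include source code. As a rough illustration of the approach the abstract describes, the sketch below treats a hyperedge as a contiguous word n-gram sampled from a toy corpus and generates sentences by chaining hyperedges with overlapping words, weighted by observed co-occurrence frequency. The fixed hyperedge order, the greedy chaining strategy, and all names here are illustrative assumptions, not the authors' implementation.

    import random
    from collections import Counter

    def learn_hyperedges(corpus, order=3, samples_per_sentence=20):
        """Sample fixed-order hyperedges (here: contiguous word n-grams)
        from each sentence and count how often each one occurs."""
        edges = Counter()
        for sentence in corpus:
            words = sentence.split()
            if len(words) < order:
                continue
            for _ in range(samples_per_sentence):
                start = random.randrange(len(words) - order + 1)
                edges[tuple(words[start:start + order])] += 1
        return edges

    def generate(edges, order=3, max_len=12):
        """Greedily extend a seed hyperedge with hyperedges whose prefix
        matches the current sentence suffix, weighted by frequency."""
        sentence = list(random.choices(list(edges), weights=list(edges.values()))[0])
        while len(sentence) < max_len:
            suffix = tuple(sentence[-(order - 1):])
            candidates = [(e, w) for e, w in edges.items() if e[:order - 1] == suffix]
            if not candidates:
                break
            edge = random.choices([e for e, _ in candidates],
                                  weights=[w for _, w in candidates])[0]
            sentence.append(edge[-1])
        return " ".join(sentence)

    corpus = ["the cat sat on the mat", "the dog sat on the rug"]
    print(generate(learn_hyperedges(corpus)))

In the paper the hypernetwork grows more complex over successive training phases; in this toy version that would correspond to re-sampling with larger orders or more samples per sentence.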
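
The evaluation compares the distribution of grammar rules extracted (via a parser, ref. 7) from generated sentences against the distribution in the training corpus. Below is a minimal sketch of such a comparison, assuming production-rule counts have already been extracted from parse trees; the distance measure (total variation) and the example counts are our illustrative choices, not necessarily the paper's.

    from collections import Counter

    def rule_distribution(rule_counts):
        """Normalize raw production-rule counts into a probability distribution."""
        total = sum(rule_counts.values())
        return {rule: n / total for rule, n in rule_counts.items()}

    def total_variation(p, q):
        """Total variation distance between two rule distributions (0 = identical)."""
        rules = set(p) | set(q)
        return 0.5 * sum(abs(p.get(r, 0.0) - q.get(r, 0.0)) for r in rules)

    # Hypothetical rule counts for a training corpus and a set of generated sentences.
    corpus_rules = Counter({"S -> NP VP": 120, "NP -> DT NN": 200, "VP -> VBD PP": 80})
    generated_rules = Counter({"S -> NP VP": 30, "NP -> DT NN": 55, "VP -> VBD NP": 10})
    print(total_variation(rule_distribution(corpus_rules),
                          rule_distribution(generated_rules)))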

References

  1. E. Reiter and R. Dale, "Building Applied Natural Language Generation Systems," Natural Language Engineering, vol.3, no.1, pp.57-87, 1997.
  2. D. Traum, M. Fleischman, and E. Hovy, "NL Generation for Virtual Humans in a Complex Social Environment," Proceedings of the AAAI Spring Symposium on Natural Language Generation in Spoken and Written Dialogue, pp.151-158, 2003.
  3. B.-T. Zhang, "Hypernetworks: A Molecular Evolutionary Architecture for Cognitive Learning and Memory," IEEE Computational Intelligence Magazine, vol.3, no.3, pp.49-63, 2008.
  4. http://childes.psy.cmu.edu/data/
  5. D. J. C. MacKay, "Information Theory, Inference, and Learning Algorithms," Cambridge University Press, 2005.
  6. L. Bahl, F. Jelinek, and R. Mercer, "A Maximum Likelihood Approach to Continuous Speech Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.5, no.2, pp.179-190, 1983.
  7. http://nlp.stanford.edu/software/lex-parser.shtml
  8. N. Chomsky, "Aspects of the Theory of Syntax," MIT Press, 1965.
  9. R. L. Gómez and L. Gerken, "Infant Artificial Language Learning and Language Acquisition," Trends in Cognitive Sciences, vol.4, no.5, pp.178-186, 2000. https://doi.org/10.1016/S1364-6613(00)01467-4
  10. D. McAllester and R. E. Schapire, "Learning Theory and Language Modeling," Exploring Artificial Intelligence in the New Millennium, Morgan Kaufmann Publishers, pp.271-287, 2003.
  11. J. R. Saffran, "Statistical Language Learning: Mechanisms and Constraints," Current Directions in Psychological Science, vol.12, no.4, pp.110-114, 2003. https://doi.org/10.1111/1467-8721.01243
  12. C. Yu and D. H. Ballard, "A Unified Model of Early Word Learning: Integrating Statistical and Social Cues," Neurocomputing, vol.70, pp.2149-2165, 2007. https://doi.org/10.1016/j.neucom.2006.01.034
  13. R. Rosenfeld, "Two Decades of Statistical Language Modeling: Where Do We Go From Here?," Proceedings of the IEEE, vol.88, no.8, pp.1270-1278, 2000. https://doi.org/10.1109/5.880083
  14. A. L. Buchsbaum and R. Giancarlo, "Algorithmic Aspects in Speech Recognition: An Introduction," Journal of Experimental Algorithmics, vol.2, no.1, pp.1-44, 1997. https://doi.org/10.1145/264216.264219
  15. R. Rosenfeld, "A Maximum Entropy Approach to Adaptive Statistical Language Modeling," Computer Speech and Language, vol.10, pp.187-228, 1996. https://doi.org/10.1006/csla.1996.0011
  16. C. Chelba et al., "Structure and Performance of a Dependency Language Model," Proceedings of Eurospeech '97, pp.2775-2778, 1997.
  17. B.-T. Zhang and C.-H. Park, "Self-Assembling Hypernetworks for Cognitive Learning of Linguistic Memory," Proceedings of the International Conference on Computer, Electrical, and Systems Science and Engineering, vol.27, pp.134-138, 2008.
  18. C. Mellish and R. Evans, "Implementing Architectures for Natural Language Generation," Natural Language Engineering, vol.10, no.3/4, pp.261-282, 2004. https://doi.org/10.1017/S1351324904003511
  19. J.-H. Lee, E. S. Lee, and B.-T. Zhang, "A Hypernetwork Memory-based Model of Sentence Learning and Generation in Children: How a Child Learns to Produce Language from a Video Corpus," KCC2009, pp.134-138, 2009.
  20. H.-S. Seok, J. Bootkrajang, and B.-T. Zhang, "Automatic Grammar Induction by Incrementally Learning a Hypernetwork Model: Sentence Generation and Analysis," The 36th KIISE Fall Conference, pp.221-224, 2009.
  21. D. L. Chen and R. J. Mooney, "Learning to Sportscast: A Test of Grounded Language Acquisition," Proceedings of the 25th International Conference on Machine Learning, pp.128-135, 2008.
  22. E. Reiter, S. Sripada, J. Hunter, J. Yu, and I. Davy, "Choosing Words in Computer-generated Weather Forecasts," Artificial Intelligence, vol.167, no.1/2, pp.137-169, 2005. https://doi.org/10.1016/j.artint.2005.06.006