Efficient Semantic Structure Analysis of Korean Dialogue Sentences using an Active Learning Method

  • Hak-Soo Kim (김학수), Kangwon National University, Computer and Communications Engineering
  • Published: 2008.05.15

Abstract

In a goal-oriented dialogue, the speaker's intention can be approximated by a semantic structure that consists of a speech act paired with a concept sequence. Correctly identifying the semantic structure of an utterance is therefore essential for implementing an intelligent dialogue system. In this paper, we propose a model that analyzes semantic structures efficiently using an active learning method. To reduce the burden of high-level linguistic analysis, the proposed model uses only morphological features and previous semantic structures as input features. To improve the precision of semantic structure analysis, the model adopts CRFs (Conditional Random Fields), which perform well in natural language processing, as its underlying statistical model. In experiments in a schedule arrangement domain, the proposed model achieved precision similar to that of previous models (92.4% in speech act analysis and 89.8% in concept sequence analysis) while using only about a third of the training data.
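
The active learning idea summarized in the abstract can be illustrated with a minimal sketch of pool-based selection using least-confidence sampling. This is not the paper's implementation: a tiny Naive Bayes classifier stands in for the CRFs, and the toy utterances, the feature names (such as `prev=ask-ref`), the speech act labels, and the `oracle` function are all hypothetical, chosen only to mirror the abstract's use of morphological features plus the previous semantic structure.

```python
import math
from collections import Counter, defaultdict

def train(labeled):
    """Fit a tiny Naive Bayes model (a stand-in for the paper's CRFs)."""
    label_counts, feat_counts, vocab = Counter(), defaultdict(Counter), set()
    for feats, label in labeled:
        label_counts[label] += 1
        for f in feats:
            feat_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feat_counts, vocab

def predict_proba(model, feats):
    """Return {label: posterior probability}, using add-one smoothing."""
    label_counts, feat_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        # "+ 1" reserves probability mass for features never seen in training.
        denom = sum(feat_counts[label].values()) + len(vocab) + 1
        score = math.log(n / total)
        for f in feats:
            score += math.log((feat_counts[label][f] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(exp.values())
    return {label: v / z for label, v in exp.items()}

def least_confident(model, pool, k):
    """Pick the k unlabeled items whose best label has the lowest probability."""
    ranked = sorted(pool, key=lambda feats: max(predict_proba(model, feats).values()))
    return ranked[:k]

def active_learning(seed, pool, oracle, rounds, k):
    """Repeatedly retrain, then ask the oracle to label the most uncertain items."""
    labeled, pool = list(seed), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        model = train(labeled)
        for feats in least_confident(model, pool, k):
            labeled.append((feats, oracle(feats)))
            pool.remove(feats)
    return train(labeled), labeled

# Hypothetical toy data: word-level features plus the previous speech act.
SEED = [
    (("when", "meeting", "prev=greeting"), "ask-ref"),
    (("meeting", "at", "three", "prev=ask-ref"), "response"),
]
GOLD = {
    ("when", "lunch", "prev=response"): "ask-ref",
    ("lunch", "at", "noon", "prev=ask-ref"): "response",
    ("cancel", "meeting", "prev=response"): "request",
}
POOL = list(GOLD)

def oracle(feats):
    """Stand-in for a human annotator: look up the gold label."""
    return GOLD[feats]

model, labeled = active_learning(SEED, POOL, oracle, rounds=2, k=1)
probs = predict_proba(model, ("when", "lunch", "prev=response"))
```

Only the utterances the current model is least sure about are sent for annotation, which is the general mechanism by which active learning can reach comparable performance with fewer labeled examples. The selection criterion actually used in the paper is not stated in the abstract, so least-confidence sampling here is an assumption.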
