An Effective Adaptive Dialogue Strategy Using Reinforcement Learning

(Original Korean title: 강화 학습법을 이용한 효과적인 적응형 대화 전략)

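  • Wonil Kim (Visual Display Division, Samsung Electronics) ;
  • Youngjoong Ko (Department of Computer Engineering, Dong-A University) ;
  • Jungyun Seo (Department of Computer Science, Sogang University)
  • Published: 2008.01.15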

Abstract

In this paper, we propose a method for making a dialogue system more adaptive by using reinforcement learning, which reduces response errors through trial-and-error search, much as people do in conversation. An adaptive dialogue strategy is one in which the dialogue system improves user satisfaction and dialogue efficiency by learning each user's dialogue style. To apply reinforcement learning to the dialogue system, we use a main-dialogue span and sub-dialogue spans as the units to which learning is applied, and we evaluate system usability with the following features: success or failure, completion time, and error rate in each sub-dialogue, and user satisfaction in the main dialogue. In addition, we classify users into beginners and experts to make the training steps more convenient for them, and then apply the reinforcement learning policy of the corresponding group. In the experiments, we evaluated the performance of the proposed method under both individual and group reinforcement learning policies.

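When people talk with others, they learn about their conversation partner through a process of trial and error. This paper proposes a method that gives a dialogue system an adaptive capability by using reinforcement learning, which models this process. An adaptive dialogue strategy is one in which the dialogue system learns the user's dialogue habits and raises user satisfaction and efficiency. To apply reinforcement learning to the dialogue system efficiently, we divide a dialogue into a main dialogue and sub-dialogues. We measure the efficiency of the system using overall satisfaction in the main dialogue, and completion status, completion time, and number of errors in the sub-dialogues. In addition, for user convenience during learning, we classify users into two groups according to their proficiency with the system and then apply the reinforcement learning training policy of the corresponding group. In the experiments, we evaluated the performance of the proposed method under individual and group-level reinforcement learning.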
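
The abstract names the reward signals (success or failure, completion time, and errors per sub-dialogue) and the per-group policies, but not the learning algorithm or any numeric settings. The following is a minimal sketch under stated assumptions: it uses tabular Q-learning as a stand-in, and every name in it (`SubDialogueOutcome`, `sub_dialogue_reward`, `QLearner`, the weights, the state `"ask_date"`, and the two confirmation actions) is hypothetical rather than taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class SubDialogueOutcome:
    success: bool           # did the sub-dialogue reach its goal?
    completion_time: float  # elapsed time in seconds (assumed unit)
    error_count: int        # response errors observed in the sub-dialogue


def sub_dialogue_reward(outcome, w_success=1.0, w_time=0.01, w_error=0.2):
    """Combine the three sub-dialogue features into one scalar reward.

    The weights are illustrative assumptions, not values from the paper.
    """
    base = w_success if outcome.success else -w_success
    return base - w_time * outcome.completion_time - w_error * outcome.error_count


class QLearner:
    """Tabular Q-learning, used here only as an example RL algorithm."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.q = defaultdict(float)   # (state, action) -> estimated value

    def best_action(self, state):
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state=None):
        # next_state=None marks the end of a sub-dialogue (no future value).
        future = 0.0
        if next_state is not None:
            future = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * future
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# One policy per user group, mirroring the beginner/expert split.
ACTIONS = ["explicit_confirm", "implicit_confirm"]
policies = {group: QLearner(ACTIONS) for group in ("beginner", "expert")}

# Usage: an expert finishes a sub-dialogue in 20 s with one error.
outcome = SubDialogueOutcome(success=True, completion_time=20.0, error_count=1)
policies["expert"].update("ask_date", "implicit_confirm", sub_dialogue_reward(outcome))
print(policies["expert"].best_action("ask_date"))  # implicit_confirm
```

In this sketch, the main-dialogue satisfaction score could be added as one more reward when the whole dialogue ends. Keeping a separate table per group means a new user starts from the group's learned values instead of from zero, which is one plausible reading of the convenience the abstract describes.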
