Recognition of Answer Type for WiseQA

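  • Jeong Heo (Information and Communication Engineering, University of Ulsan; Electronics and Telecommunications Research Institute) ;
  • Pum-Mo Ryu (Electronics and Telecommunications Research Institute) ;
  • Hyun-Ki Kim (Electronics and Telecommunications Research Institute) ;
  • Cheol-Young Ock (IT Convergence Major, School of Electrical Engineering, University of Ulsan)
  • Received : 2015.02.13
  • Reviewed : 2015.05.21
  • Published : 2015.07.31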


This paper proposes a hybrid method for recognizing answer types in the WiseQA system. Answer types fall into two categories: the lexical answer type (LAT) and the semantic answer type (SAT). For LAT detection, we propose two models: a rule-based model that uses the question focus and a machine learning model based on sequence labeling. For SAT classification, we likewise introduce two models: a machine learning model based on multiclass classification and a filtering-rule model based on the LAT. LAT detection achieves an F1-score of 82.47%, and SAT classification achieves a precision of 77.13%. Compared with IBM Watson on LAT detection, precision is 1.0% lower and recall is 7.4% higher.
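As a rough illustration of how a filtering rule based on the LAT could be combined with a multiclass SAT classifier, the sketch below restricts the classifier's candidate types to those compatible with the detected LAT. The abstract does not specify the actual rules or type inventory, so the mapping `LAT_TO_SATS`, the function `filter_sat`, the type names, and the scores are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from a lexical answer type (LAT) word to the semantic
# answer types (SATs) it is compatible with. Illustrative only; not taken
# from the paper.
LAT_TO_SATS: Dict[str, List[str]] = {
    "city": ["LOCATION"],
    "author": ["PERSON"],
    "company": ["ORGANIZATION"],
}


def filter_sat(scores: Dict[str, float], lat: Optional[str]) -> str:
    """Pick the best-scoring SAT, restricted to SATs compatible with the LAT."""
    allowed = LAT_TO_SATS.get(lat or "", [])
    candidates = {sat: s for sat, s in scores.items() if sat in allowed}
    # Fall back to the unfiltered classifier scores when the LAT is unknown
    # or rules out every candidate.
    if not candidates:
        candidates = scores
    return max(candidates, key=candidates.get)


# Assumed classifier output for one question (made-up numbers).
scores = {"PERSON": 0.48, "LOCATION": 0.41, "ORGANIZATION": 0.11}
print(filter_sat(scores, "city"))  # LOCATION
print(filter_sat(scores, None))    # PERSON
```

In this sketch the rule only narrows the classifier's choice and never overrides it outright, so a missing or unmapped LAT leaves the machine learning result unchanged.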


