
A Question Type Classifier Based on a Support Vector Machine for a Korean Question-Answering System

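김학수 (Natural Language Processing Laboratory, Department of Computer Science, Sogang University)
안영훈 (Natural Language Processing Laboratory, Department of Computer Science, Sogang University)
서정연 (Department of Computer Science, Sogang University)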
Abstract
An efficient Question-Answering (QA) system needs a question type classifier that assigns a user's query to a predefined category regardless of the question's surface form. In this paper, we propose a question type classifier based on a Support Vector Machine (SVM). The classifier first extracts features such as lexical forms, parts of speech, and semantic markers from the user's question. It then uses the $\chi^2$ statistic to select important features and represents the selected features as a vector. Finally, an SVM assigns the question to one of the predefined categories according to the extracted features. In the experiment, the proposed system achieved 86.4% accuracy. The system classifies question types accurately without relying on hand-written rules such as lexico-syntactic patterns. Therefore, it is robust and easily portable to other domains.
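The feature selection and vector representation steps can be illustrated with a minimal sketch. It assumes the standard $\chi^2$ statistic computed from a 2x2 feature/category contingency table, and it ranks each feature by its maximum score over all categories; the abstract does not state which formulation or aggregation the authors used. The toy questions, the category labels, and the function names (`chi_square`, `select_features`, `vectorize`) are hypothetical, and the SVM training step is left out.

```python
from collections import Counter

# Hypothetical toy data: (tokenized question, question type). Not from the paper.
QUESTIONS = [
    (["who", "invented", "telephone"], "PERSON"),
    (["who", "wrote", "hamlet"], "PERSON"),
    (["where", "is", "seoul"], "LOCATION"),
    (["where", "was", "mozart", "born"], "LOCATION"),
]


def chi_square(a, b, c, d):
    """Chi-square statistic for one (feature, category) 2x2 table.

    a: in-category questions that contain the feature
    b: out-of-category questions that contain the feature
    c: in-category questions that lack the feature
    d: out-of-category questions that lack the feature
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return 0.0 if denom == 0 else n * (a * d - c * b) ** 2 / denom


def score_features(questions):
    """Map each feature to its maximum chi-square score over all categories."""
    n = len(questions)
    cat_count = Counter(label for _, label in questions)
    feat_count = Counter()
    joint = Counter()
    for tokens, label in questions:
        for feature in set(tokens):  # count each feature once per question
            feat_count[feature] += 1
            joint[(feature, label)] += 1
    scores = {}
    for feature in feat_count:
        best = 0.0
        for cat in cat_count:
            a = joint[(feature, cat)]
            b = feat_count[feature] - a
            c = cat_count[cat] - a
            d = n - a - b - c
            best = max(best, chi_square(a, b, c, d))
        scores[feature] = best
    return scores


def select_features(questions, k):
    """Return the k highest-scoring features (ties broken alphabetically)."""
    scores = score_features(questions)
    return sorted(sorted(scores), key=lambda f: -scores[f])[:k]


def vectorize(tokens, vocabulary):
    """Binary vector: 1 if the selected feature occurs in the question."""
    present = set(tokens)
    return [1 if feature in present else 0 for feature in vocabulary]


if __name__ == "__main__":
    vocabulary = select_features(QUESTIONS, 2)
    print(vocabulary)
    print(vectorize(["who", "is", "he"], vocabulary))
```

In a full pipeline, vectors of this kind would be passed to an SVM trainer; the binary weighting used here is one simple choice among several.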
Keywords
Question type classifier; Document categorization; Support Vector Machine