http://dx.doi.org/10.3745/KIPSTB.2010.17B.4.327

Question Classification Based on Word Association for Question and Answer Archives  

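Jin, Xueying (Department of Computer Engineering, Chonbuk National University)
Lee, Kyung-Soon (Division of Computer Engineering / Center for Advanced Image and Information Technology, Chonbuk National University)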
Abstract
Word mismatch is the most significant cause of low performance in question classification, because questions consist of only two or three words and the same meaning can be expressed in many different ways. It is therefore necessary to apply word association in question classification. In this paper, we propose a question classification method using a translation-based language model, which uses word translation probabilities learned from question-question pairs in the same category. Our experiments show that translation probabilities learned from question-question pairs in the same category are more effective than those learned from question-answer pairs in the whole collection.
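The idea of scoring a question against a category with a translation-based language model can be illustrated with a minimal sketch. The abstract does not give the paper's exact formula, so everything here is an assumption: the linear smoothing with a collection model, the weight `lam`, the function name `classify`, and the toy translation table (in practice such probabilities would be learned from question-question pairs, not written by hand).

```python
import math
from collections import Counter


def classify(question, category_questions, trans, lam=0.2):
    """Pick the category whose translation-based LM gives the best log-score.

    category_questions: {category: [question strings]}  (hypothetical toy data)
    trans: {source_word: {target_word: T(target | source)}}  (assumed table)
    lam: weight of the collection model (assumed smoothing choice)
    """
    # Term counts per category and for the whole collection.
    cat_counts = {
        cat: Counter(w for q in qs for w in q.split())
        for cat, qs in category_questions.items()
    }
    coll = Counter()
    for counts in cat_counts.values():
        coll.update(counts)
    coll_total = sum(coll.values())

    def t_prob(w, t):
        # Words without a row in the table only translate to themselves.
        row = trans.get(t)
        if row is None:
            return 1.0 if w == t else 0.0
        return row.get(w, 0.0)

    def score(cat):
        counts = cat_counts[cat]
        total = sum(counts.values())
        log_p = 0.0
        for w in question.split():
            # Sum over category terms t of T(w | t) * P_ml(t | category).
            p_trans = sum(t_prob(w, t) * n / total for t, n in counts.items())
            p_coll = coll[w] / coll_total
            p = (1 - lam) * p_trans + lam * p_coll
            log_p += math.log(max(p, 1e-12))  # floor for unseen words
        return log_p

    return max(cat_counts, key=score)


# Toy usage: "airfare" never occurs in the data, but the assumed
# translation table links it to "flight".
docs = {
    "computers": ["laptop battery drain", "install linux driver"],
    "travel": ["cheap flight paris", "hotel booking rome"],
}
table = {"flight": {"flight": 0.6, "airfare": 0.4}}
print(classify("airfare", docs, table))
```

Without the table entry, "airfare" matches no category term and both categories fall back to the floor value, which is the word-mismatch problem the abstract describes.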
Keywords
Question Classification; Word Association; Translation-Based Language Model; Language Model; Feature Selection; Question and Answer Archives;