http://dx.doi.org/10.5391/JKIIS.2007.17.6.813

Definition Sentences Recognition Based on Definition Centroid  

Kim, Kweon-Yang (School of Computer Engineering, Kyungil University)
Publication Information
Journal of the Korean Institute of Intelligent Systems / v.17, no.6, 2007, pp. 813-818
Abstract
This paper addresses the problem of recognizing definition sentences. Given a definition question such as "Who is the person X?", the task is to retrieve, from a collection of news articles, the definition sentences that capture descriptive information about the person, such as age, occupation, or a role the person played in an event. To retrieve as many relevant sentences for the definition question as possible, we adopt a centroid-based statistical approach that has been applied to multi-document summarization. To improve precision and recall, the weights of centroid words are supplemented using an external knowledge resource such as Wikipedia, and redundant sentences are removed from the candidate definitions. Our approach shows some improvement over the baseline for 20 persons in the IT field who have high document frequency.
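The pipeline outlined in the abstract (centroid word weighting, an external-resource supplement, and redundancy removal) can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the TF-IDF style weight, the multiplicative `boost` for words found in an external description, the Jaccard overlap test for redundancy, the whitespace tokenizer, and all function and parameter names.

```python
import math
from collections import Counter

PUNCT = '.,?!"\'()'

def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real Korean news
    # text would need morphological analysis instead).
    return [t for t in (w.strip(PUNCT).lower() for w in text.split()) if t]

def build_centroid(target_sentences, collection, external_words=(), boost=2.0, top_k=10):
    # Assumed weighting: term frequency in sentences about the target person
    # times a smoothed inverse document frequency over the whole collection.
    tf = Counter(w for s in target_sentences for w in tokenize(s))
    df = Counter(w for s in collection for w in set(tokenize(s)))
    n = len(collection)
    weights = {}
    for word, count in tf.items():
        weight = count * math.log((n + 1) / (df[word] + 1))
        if word in external_words:
            # Hypothetical supplement: words that also occur in an external
            # description (e.g. an encyclopedia entry) get a higher weight.
            weight *= boost
        weights[word] = weight
    top = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return {word: weight for word, weight in top if weight > 0}

def score(sentence, centroid):
    # Sum of the weights of the centroid words that occur in the sentence.
    words = set(tokenize(sentence))
    return sum(weight for word, weight in centroid.items() if word in words)

def overlap(a, b):
    # Jaccard word overlap, used here as a simple redundancy test.
    wa, wb = set(tokenize(a)), set(tokenize(b))
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def select_definitions(candidates, centroid, max_overlap=0.6, limit=5):
    # Rank candidates by centroid score, then greedily keep sentences that
    # are not too similar to any sentence already selected.
    ranked = sorted(candidates, key=lambda s: -score(s, centroid))
    chosen = []
    for sentence in ranked:
        if len(chosen) == limit:
            break
        if score(sentence, centroid) <= 0:
            continue
        if all(overlap(sentence, kept) < max_overlap for kept in chosen):
            chosen.append(sentence)
    return chosen
```

In this sketch, `target_sentences` would be the sentences mentioning person X, `collection` the full set of news sentences, and `external_words` a set of tokens taken from an external description of X. The thresholds `max_overlap` and `limit` are arbitrary placeholders, not values from the paper.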
Keywords
Centroid vector; Centroid word; Definition sentence; Definition question