Definition Sentences Recognition Based on Definition Centroid

  • Published : 2007.12.25

Abstract

This paper addresses the problem of recognizing definition sentences. Given a definition question such as "Who is the person X?", the task is to retrieve, from a collection of news articles, the definition sentences that capture descriptive information about the person, such as age, occupation, or a role the person played in an event. To retrieve as many relevant sentences for the definition question as possible, we adopt a centroid-based statistical approach that has previously been applied to multi-document summarization. To improve precision and recall, the weights of centroid words are supplemented with an external knowledge resource such as Wikipedia, and redundant sentences are removed from the candidate definitions. For 20 persons in the IT field who have high document frequency, our approach obtains some improvement over the baseline.
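As a rough illustration of the pipeline the abstract outlines, the following minimal Python sketch scores candidate sentences against a TF*IDF centroid, boosts words found in an external resource, and skips near-duplicate candidates. It is not the authors' implementation: the function names, the smoothed IDF formula, the boost factor, the word-overlap threshold, and the toy data are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a morphological analyzer.
    return text.lower().split()


def centroid_weights(docs, background_df, n_background, boost_terms=(), boost=2.0):
    """Weight each word by term frequency times a smoothed IDF.

    Words in boost_terms (e.g. taken from an encyclopedia entry about the
    target person) are multiplied by `boost`. Both are assumed parameters.
    """
    tf = Counter(w for d in docs for w in tokenize(d))
    weights = {}
    for word, freq in tf.items():
        idf = math.log((1 + n_background) / (1 + background_df.get(word, 0)))
        weights[word] = freq * idf * (boost if word in boost_terms else 1.0)
    return weights


def score(sentence, weights):
    # Sum of centroid weights over the distinct words of the sentence.
    return sum(weights.get(w, 0.0) for w in set(tokenize(sentence)))


def overlap(a, b):
    # Shared distinct words divided by the size of the smaller sentence.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / max(1, min(len(sa), len(sb)))


def select_definitions(sentences, weights, k=3, max_overlap=0.7):
    """Pick up to k top-scoring sentences, skipping redundant ones."""
    chosen = []
    for s in sorted(sentences, key=lambda s: score(s, weights), reverse=True):
        if all(overlap(s, c) < max_overlap for c in chosen):
            chosen.append(s)
        if len(chosen) == k:
            break
    return chosen


# Toy data (hypothetical): sentences retrieved for the target person "kim".
sentences = [
    "kim is the ceo of acme",
    "kim is the ceo of acme corp",
    "the weather was nice",
]
background_df = {"the": 90, "is": 80, "of": 85, "was": 70}  # document frequencies
weights = centroid_weights(sentences, background_df, n_background=100,
                           boost_terms={"ceo"})
selected = select_definitions(sentences, weights, k=2)
print(selected)
```

In this toy run the two sentences about the CEO role share almost all of their words, so only the higher-scoring one is kept and the remaining slot goes to a non-overlapping sentence. The thresholds would need tuning on real data.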
