Analyses and Comparisons of Human and Statistic-based MMR Summarizations of Single Documents

단일 문서의 인위적 요약과 MMR 통계요약의 비교 및 분석

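  • 유준현 (Division of Electronics and Information Engineering, Chonbuk National University)
  • 변동률 (Division of Electronics and Information Engineering, Chonbuk National University)
  • 박순철 (Division of Electronics and Information Engineering, Chonbuk National University)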
  • Published : 2004.03.01

Abstract

The statistics-based method is widely used for automatic single-document summarization in large document collections such as the web. However, its summaries tend to be highly redundant, because it selects sentences containing words that appear frequently in the document. We address this problem by applying Maximal Marginal Relevance (MMR) to the statistics-based method to improve summary quality (the best results appear at around λ=0.6). We also compare the MMR summaries with summaries produced by human subjects to evaluate their accuracy.

웹과 같은 대량의 문서집단에서 단일 문서에 대한 자동 요약은 일반적으로 통계요약 방법을 이용한다. 그러나 단순한 통계 요약 방법은 문서내의 빈도수가 높은 단어를 포함하는 문장들이 중복되어 나타날 확률이 높다. 이러한 단점을 보완하기 위하여 본 논문에서는 통계기반 요약방법에 MMR 기법을 적용하여 요약의 질을 향상시켰다(약 λ=0.6에서 최고의 성능을 보임). 또한 본 논문에서는 인위적 요약을 수행하여 MMR 통계기반의 요약 결과의 성능을 평가하였다.
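
The redundancy problem and the role of λ described in the abstract can be illustrated with a minimal sketch of MMR sentence selection. This is not the authors' implementation: the whitespace tokenizer, the term-frequency cosine similarity, the use of the whole-document vector as the relevance target, and the names `tf_vector`, `cosine`, and `mmr_summarize` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector from lowercase whitespace tokens (a simplification)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_summarize(sentences, k, lam=0.6):
    """Greedily pick k sentence indices by Maximal Marginal Relevance.

    score(i) = lam * sim(sentence_i, document)
               - (1 - lam) * max similarity to already selected sentences
    """
    vectors = [tf_vector(s) for s in sentences]
    # Assumption: relevance is measured against the whole-document vector.
    doc_vector = Counter()
    for v in vectors:
        doc_vector.update(v)
    relevance = [cosine(v, doc_vector) for v in vectors]

    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)  # indices in original document order


sentences = [
    "frequent words drive sentence selection",
    "frequent words drive sentence selection",  # exact duplicate
    "human judges verify accuracy",
]
print(mmr_summarize(sentences, k=2, lam=1.0))  # -> [0, 1]
print(mmr_summarize(sentences, k=2, lam=0.6))  # -> [0, 2]
```

With λ=1.0 the score reduces to pure relevance, so the duplicate sentence is selected, which mirrors the redundancy of the plain statistics-based method. With λ=0.6 the redundancy penalty outweighs the duplicate's relevance and the third sentence is chosen instead.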
