[KSCI] Korea Science Citation Index Service

Multi-document Summarization Based on Cluster using Term Co-occurrence

Lee, Il-Joo (Department of Mobile Contents, Tongwon College)
Kim, Min-Koo (Department of Computer Engineering, Ajou University)

Publication Information

Journal of KIISE: Software and Applications / v.33, no.2, 2006, pp. 243-251

Abstract

In multi-document summarization based on salient sentence extraction, removing redundant information is important, and doing so requires weighing the similarities and differences between sentences. In this paper, we propose a multi-document summarization method that extracts salient sentences without redundancy by means of a cohesive term clustering method that uses co-occurrence information. The cohesive term clustering method assumes that terms do not exist independently but are related to one another in meaning. To find the relations between terms, we cluster sentences by topic and use the co-occurrence information of terms within the same topic. We conduct experiments on the DUC (Document Understanding Conferences) data. In these experiments, our method outperforms other summarization methods that use term co-occurrence information based on term cohesion at the document or sentence level, as well as methods that use simple statistical information.
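
The pipeline outlined in the abstract (topic clusters, within-cluster term co-occurrence, redundancy-free extraction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, the scoring formula, or the redundancy test, so the sketch assumes the topic clusters are already given, scores a sentence by the average co-occurrence count of its term pairs, and uses Jaccard overlap with a hypothetical `max_overlap` threshold to skip redundant sentences. All function names are illustrative.

```python
from collections import Counter
from itertools import combinations

PUNCT = ".,;:!?()\"'"


def tokenize(sentence):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCT).lower() for w in sentence.split())
    return [w for w in words if w]


def cooccurrence_counts(sentences):
    """Count how often each unordered term pair appears in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        terms = sorted(set(tokenize(sentence)))
        counts.update(combinations(terms, 2))
    return counts


def score_sentence(sentence, counts):
    """Assumed scoring rule: mean co-occurrence count over the sentence's term pairs."""
    terms = sorted(set(tokenize(sentence)))
    pairs = list(combinations(terms, 2))
    if not pairs:
        return 0.0
    return sum(counts[p] for p in pairs) / len(pairs)


def jaccard(a, b):
    """Term-set overlap between two sentences, used here as the redundancy test."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def summarize(clusters, max_overlap=0.5):
    """Take the best-scoring sentence from each topic cluster,
    skipping candidates that overlap too much with sentences already chosen."""
    summary = []
    for sentences in clusters:
        counts = cooccurrence_counts(sentences)
        ranked = sorted(sentences, key=lambda s: score_sentence(s, counts), reverse=True)
        for candidate in ranked:
            if all(jaccard(candidate, chosen) <= max_overlap for chosen in summary):
                summary.append(candidate)
                break
    return summary
```

Calling `summarize` with a list of clusters, each a list of sentence strings, returns at most one sentence per cluster. Co-occurrence is counted per cluster rather than per document, which mirrors the abstract's point that term relations are derived within a topic.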

Keywords

multi-document summarization; co-occurrence information; cohesion

Citations & Related Records

Times Cited By KSCI: 2

