Automatic Extractive Summarization of Newspaper Articles using Activation Degree of 5W1H  

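윤재민 (얄리 Co., Ltd.)
정유진 (Computer Engineering, Pohang University of Science and Technology)
이종혁 (Department of Computer Engineering, Pohang University of Science and Technology)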
Abstract
In newspapers, 5W1H information (who, what, when, where, why, and how) is the most fundamental element for writing and understanding articles. Building on this relation between newspaper articles and 5W1H, we propose a summarization method based on the activation degree of 5W1H. To overcome the problems of the lead-based and title-based methods, both of which are known to be the most effective in newspaper summarization, the method extracts sufficient 5W1H information from both the title and the lead sentence. The weight of each sentence is then computed from several factors: the activation degree of 5W1H, the number of 5W1H categories, and the sentence's length and position. These factors contribute substantially to selecting the more important sentences and thus to improving the readability of the summaries. In an experimental evaluation, the proposed method achieved a precision of 74.7%, outperforming the lead-based method. In sum, the 5W1H approach was shown to be promising for automatic summarization of newspaper articles.
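The abstract names the factors behind each sentence weight but gives no formula. As a rough illustration only, the sketch below combines them linearly. Everything specific in it is an assumption rather than the authors' method: the `Weights` values, the `seeds` dictionary (standing in for 5W1H terms taken from the title and lead sentence), the term-overlap definition of activation degree, and the simple word tokenizer are all hypothetical.

```python
import re
from dataclasses import dataclass

CATEGORIES = ("who", "when", "where", "what", "why", "how")


@dataclass(frozen=True)
class Weights:
    # Hypothetical coefficients; the abstract does not state any values.
    activation: float = 1.0
    categories: float = 0.5
    position: float = 0.5
    length: float = 0.25


def score_sentence(tokens, index, total, seeds, w=Weights(), ideal_len=20):
    """Weight one sentence from four factors (assumed linear combination).

    tokens: lowercase words of the sentence
    index:  0-based position of the sentence in the article
    total:  number of sentences in the article
    seeds:  dict mapping a 5W1H category to a set of seed terms
    """
    token_set = set(tokens)
    # Seed terms of each category that reappear in this sentence.
    hits = {c: len(token_set & seeds.get(c, set())) for c in CATEGORIES}
    activation = sum(hits.values())                    # assumed activation degree
    n_categories = sum(1 for h in hits.values() if h)  # distinct categories hit
    position = 1.0 - index / total                     # earlier sentences score higher
    length = min(len(tokens), ideal_len) / ideal_len   # short sentences score lower
    return (w.activation * activation + w.categories * n_categories
            + w.position * position + w.length * length)


def summarize(sentences, seeds, k=2):
    """Return the k highest-weighted sentences in their original order."""
    total = len(sentences)
    scored = [(score_sentence(re.findall(r"\w+", s.lower()), i, total, seeds), i)
              for i, s in enumerate(sentences)]
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda p: p[1])]
```

Calling `summarize(sentences, seeds, k=2)` on a list of sentence strings returns an extract that keeps article order, which is one simple way to preserve readability in an extractive summary.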
Keywords
newspaper articles; summarization; 5W1H; activation degree