A Study on the Construction of Automatic Summaries: On the Basis of Straight News on the Web

  • 이태영 (Department of Library and Information Science, College of Humanities, Jeonbuk National University)
  • Published : 2006.12.29

Abstract

To construct an automatic Ext/sums (extracts and summaries) system for straight news on the web, a writing-structure frame and a set of rules were built by applying discourse structure and knowledge-based methods. The frame contains slots for the roles of paragraphs, sentences, and clauses in a news article; the properties of paragraphs and sentences; the rules that determine each role; the rules for extracting key sentences; and the rules for composing the summary. It also provides 'if-needed' facets, which supply context definitions, proper nouns, and similar information, and 'if-changed' facets, which report changed slot values. The actual values of slots and facets were extracted and expressed with reference to the rhetorical role of each phrase, the top-level category of each word, and plot units.

Candidate summary sentences were rearranged by unification, separation, and synthesis while maintaining coherence of meaning. This rearrangement used rules derived from similarity measurement, syntactic information, discourse structure, and knowledge-based methods, together with context definitions based on the syntactic/semantic signatures of nouns and verbs and the categories of verb suffixes. The system also attempted to generate new sentences, such as critical comments, and insert them into the summary.
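The abstract does not give the frame's concrete layout, so the following is only a minimal Python sketch of the general frame idea it names: slots that hold values, an 'if-needed' facet that computes a missing value on demand, and an 'if-changed' facet that is notified when a slot value changes. The class names `Slot` and `Frame`, the role labels, the sample sentence, and the toy `guess_proper_nouns` heuristic are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Slot:
    """One frame slot with optional if-needed and if-changed facets (hypothetical layout)."""
    value: Any = None
    # if-needed: called to compute the value when it is requested but still missing
    if_needed: Optional[Callable[[], Any]] = None
    # if-changed: called with (old, new) whenever the stored value actually changes
    if_changed: Optional[Callable[[Any, Any], None]] = None


class Frame:
    """A named collection of slots, e.g. one frame per sentence of a news article."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.slots: Dict[str, Slot] = {}

    def add_slot(self, slot_name: str, slot: Slot) -> None:
        self.slots[slot_name] = slot

    def get(self, slot_name: str) -> Any:
        slot = self.slots[slot_name]
        if slot.value is None and slot.if_needed is not None:
            slot.value = slot.if_needed()  # fill the gap on demand
        return slot.value

    def set(self, slot_name: str, new_value: Any) -> None:
        slot = self.slots[slot_name]
        old_value = slot.value
        slot.value = new_value
        if slot.if_changed is not None and old_value != new_value:
            slot.if_changed(old_value, new_value)


def guess_proper_nouns(text: str) -> List[str]:
    """Toy stand-in: capitalized words, skipping the sentence-initial word."""
    words = text.split()[1:]
    return [w.strip(".,") for w in words if w[0].isupper()]


change_log: List[str] = []

sentence = Frame("sentence-1")
sentence.add_slot("text", Slot(value="Officials in Seoul announced a new transit plan on Monday."))
sentence.add_slot(
    "role",
    Slot(if_changed=lambda old, new: change_log.append(f"role: {old} -> {new}")),
)
sentence.add_slot(
    "proper_nouns",
    Slot(if_needed=lambda: guess_proper_nouns(sentence.get("text"))),
)

sentence.set("role", "lead")          # triggers the if-changed facet
print(sentence.get("proper_nouns"))   # triggers the if-needed facet
print(change_log)
```

In this sketch, a role-determining rule would call `set("role", ...)`, and the if-changed facet gives later rules (for example, sentence extraction) a record of the change; how the actual system wires these steps together is not stated in the abstract.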
