Sentence Abstraction: Sentence Revision with Concept Abstraction

Kim, Gon; Yang, Jaegun; Bae, Jaehak; Lee, Jonghyuk

doi:10.3745/KIPSTB.2004.11B.5.563

The KIPS Transactions:PartB (정보처리학회논문지B)

Volume 11B, Issue 5 / Pages 563-572 / 2004 / 1598-284X (pISSN)

Korea Information Processing Society (한국정보처리학회)


문장추상화 : 개념추상화를 도입한 문장교열

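Kim, Gon (Graduate School, School of Computer Engineering and Information Technology, University of Ulsan);
Yang, Jaegun (Graduate School, Computer Engineering and Information Technology, University of Ulsan);
Bae, Jaehak (Computer Engineering and Information Technology, University of Ulsan);
Lee, Jonghyuk (Department of Computer Science and Engineering, Pohang University of Science and Technology)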

Published : 2004.08.01

https://doi.org/10.3745/KIPSTB.2004.11B.5.563


Abstract

Sentence abstraction is the simplification of a sentence in a way that preserves its communicative function. It accomplishes sentence revision and concept abstraction at the same time. Sentence revision resolves the discrepancy between what a writer intends and the meaning actually expressed in the sentence. Concept abstraction expresses a general idea drawn from the elements that several concepts have in common. Sentence abstraction selects the main constituents of a given sentence, identifies their semantic information, and describes them with their upper concepts; this is what makes sentence revision and concept abstraction possible together. In this paper, a syntactic parser, LGPI+, and an ontology, OfN, are realized and used for sentence abstraction. The sentence abstracter SABOT uses LGPI+ and OfN: it processes the parsing result and selects the candidate words to be abstracted. Using a text understanding system built on sentence abstraction, the paper measures the sentence recall of the main sentences and the topic hit ratio of the selected sentences on 58 paragraphs from 23 stories. The sentence recall is about 54-72% and the topic hit ratio is about 76-86%, roughly a 10-20% improvement over similar systems. These results confirm that sentence abstraction enables sentence revision that efficiently selects the topic sentences of a text, and concept abstraction that deepens text understanding.
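
The two steps the abstract describes, selecting the main constituents of a sentence and replacing them with upper concepts, can be illustrated with a minimal sketch. This is not SABOT, LGPI+, or OfN, whose internals are not given here. The role labels, the `Token` type, the `UPPER_CONCEPT` table, and the `abstract_sentence` function are all hypothetical stand-ins chosen for illustration, and the input is assumed to be already parsed.

```python
from dataclasses import dataclass

# Hypothetical toy table mapping a word to its upper concept.
# It only stands in for an ontology lookup; it is not OfN.
UPPER_CONCEPT = {
    "poodle": "dog",
    "dog": "animal",
    "sprinted": "moved",
    "mailman": "person",
}

# Assumed role labels for the "main constituents" of a sentence.
MAIN_ROLES = {"subject", "verb", "object"}


@dataclass
class Token:
    word: str
    role: str  # "subject", "verb", "object", or "modifier" (assumed labels)


def abstract_sentence(tokens, levels=1):
    """Keep the main constituents and lift each one `levels` steps upward."""
    abstracted = []
    for token in tokens:
        if token.role not in MAIN_ROLES:
            continue  # sentence revision step: drop non-essential words
        word = token.word
        for _ in range(levels):
            # concept abstraction step: move to the upper concept if one is known
            word = UPPER_CONCEPT.get(word, word)
        abstracted.append(word)
    return " ".join(abstracted)


# A pre-parsed example sentence: "the tiny poodle sprinted after the mailman"
parsed = [
    Token("the", "modifier"),
    Token("tiny", "modifier"),
    Token("poodle", "subject"),
    Token("sprinted", "verb"),
    Token("after", "modifier"),
    Token("the", "modifier"),
    Token("mailman", "object"),
]

print(abstract_sentence(parsed))            # one level of abstraction
print(abstract_sentence(parsed, levels=2))  # two levels of abstraction
```

With one level the sketch yields a telegraphic core ("dog moved person"); raising `levels` trades detail for generality. A real system would take the roles from a parser and the upper concepts from an ontology rather than from hand-written tables.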


