• Title/Summary/Keyword: Summarization

Document Summarization using Topic Phrase Extraction and Query-based Summarization (주제어구 추출과 질의어 기반 요약을 이용한 문서 요약)

  • 한광록;오삼권;임기욱
    • Journal of KIISE: Software and Applications / v.31 no.4 / pp.488-497 / 2004
  • This paper describes a hybrid document summarization method that combines indicative summarization with query-based summarization. Learning models are built from training documents in order to extract topic phrases, using Naive Bayes, Decision Tree, and Support Vector Machine as the machine learning algorithms. Based on these models, the system automatically extracts topic phrases from a new document and outputs a summary of the document using query-based summarization, which treats the extracted topic phrases as queries and calculates the locality-based similarity of each topic phrase. We examine how the topic phrases affect the summarization and how many phrases are appropriate for summarization. We then evaluate the extracted summary by comparing it with a manual summary, and we also compare our summarization system with the summarization method of MS-Word.

Summarization and Evaluation; Where are we today?!

  • Shamsfard, Mehrnoush;Saffarian, Amir;Ghodratnama, Samaneh
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.422-429 / 2007
  • The rapid growth of online information services causes the problem of information explosion. Automatic text summarization techniques are essential for dealing with this problem. There are different approaches to text summarization, and different systems have used one or a combination of them. Given the wide variety of summarization techniques, there should be an evaluation mechanism to assess the process of summarization. The evaluation of automatic summarization is important and challenging, since in general it is difficult to agree on an ideal summary of a text. Currently, evaluating summaries is a laborious task that cannot easily be done by humans, so automatic evaluation techniques are emerging to help with it. In this paper, we take a look at summarization approaches and examine the general architecture of summarizers. The importance of evaluation methods is discussed, and the need to find better automatic systems to evaluate summaries is studied.

A Survey on Automatic Twitter Event Summarization

  • Rudrapal, Dwijen;Das, Amitava;Bhattacharya, Baby
    • Journal of Information Processing Systems / v.14 no.1 / pp.79-100 / 2018
  • Twitter is one of the most popular social platforms for online users to share trending information and views on any event. Twitter reports an event faster than any other medium and contains an enormous amount of information and views regarding an event. Consequently, Twitter topic summarization is one of the most convenient ways to get an instant gist of any event. However, the information shared on Twitter is often full of nonstandard abbreviations, acronyms, out-of-vocabulary (OOV) words, and grammatical mistakes, which make it challenging to find reliable and useful information related to an event. Twitter event summarization is therefore a challenging task on which traditional text summarization methods do not work well. In the last decade, various research works have introduced different approaches for automatic Twitter topic summarization. The main aim of this survey is to give a broad overview of promising summarization approaches on a Twitter topic. We also focus on automatic evaluation of summarization techniques by surveying recent evaluation methodologies. At the end of the survey, we emphasize both current and future research challenges in this domain through an in-depth analysis of the most recent summarization approaches.

Automatic Single Document Text Summarization Using Key Concepts in Documents

  • Sarkar, Kamal
    • Journal of Information Processing Systems / v.9 no.4 / pp.602-620 / 2013
  • Many previous research studies on extractive text summarization consider a subset of words in a document as keywords and use a sentence ranking function that ranks sentences based on their similarity to the list of extracted keywords. The use of key concepts in the automatic text summarization task, however, has received less attention in the summarization literature. The proposed work uses key concepts identified from a document to create a summary of the document. We view single-word or multi-word keyphrases of a document as the important concepts that the document elaborates on. Our work is based on the hypothesis that an extract is an elaboration of the important concepts to some permissible extent, controlled by the given summary length restriction. In other words, our method of text summarization chooses a subset of sentences from a document that maximizes the important concepts in the final summary. To allow diverse information in the summary, for each important concept we select one sentence that is the best possible elaboration of the concept. Accordingly, the most important concept contributes to the summary first, then the second most important concept, and so on. To assess the effectiveness of our proposed summarization method, we have compared it to some state-of-the-art summarization systems, and the results show that the proposed method outperforms the existing systems to which it is compared.
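
The selection strategy described in this abstract (one best sentence per concept, taken in order of concept importance, until the length limit is reached) can be sketched as follows. This is a minimal illustration and not the author's implementation: the names `tokenize`, `concept_score`, and `summarize` are hypothetical, the word-overlap score is an assumed stand-in for whatever elaboration measure the paper uses, and the concepts are assumed to arrive already ranked.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def concept_score(sentence, concept):
    """Assumed elaboration score: fraction of the concept's words present in the sentence."""
    words = set(tokenize(sentence))
    concept_words = tokenize(concept)
    if not concept_words:
        return 0.0
    return sum(w in words for w in concept_words) / len(concept_words)


def summarize(sentences, ranked_concepts, max_sentences):
    """For each concept, most important first, pick the unused sentence that best covers it."""
    chosen = []
    for concept in ranked_concepts:
        if len(chosen) >= max_sentences:
            break  # summary length restriction reached
        candidates = [i for i in range(len(sentences)) if i not in chosen]
        if not candidates:
            break
        best = max(candidates, key=lambda i: concept_score(sentences[i], concept))
        # Skip concepts that no remaining sentence mentions at all.
        if concept_score(sentences[best], concept) > 0:
            chosen.append(best)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Choosing at most one sentence per concept is what keeps the extract diverse: a second sentence about the top concept is only considered if a later concept also happens to favor it.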

Automatic Summarization of French Scientific Articles by a Discourse Annotation Method using the EXCOM System

  • Antoine, Blais
    • Language and Information / v.13 no.1 / pp.1-20 / 2009
  • Summarization is a complex cognitive task, and simulating it is very difficult for machines. This paper presents an automatic summarization strategy based on a discourse categorization of the textual information. This categorization is carried out by automatically identifying discourse markers in texts. We argue here for the use of discourse methods in automatic summarization. Two evaluations of the summarization strategy are presented, in which the summaries produced by our strategy are compared with summaries produced by humans and by other applications. These two evaluations demonstrate the capacity of our application, based on EXCOM, to produce summaries comparable to those of other applications.

Improving Abstractive Summarization by Training Masked Out-of-Vocabulary Words

  • Lee, Tae-Seok;Lee, Hyun-Young;Kang, Seung-Shik
    • Journal of Information Processing Systems / v.18 no.3 / pp.344-358 / 2022
  • Text summarization is the task of producing a shorter version of a long document while accurately preserving the main contents of the original text. Abstractive summarization generates novel words and phrases using a language generation method through text transformation and prior-embedded word information. However, newly coined words or out-of-vocabulary words decrease the performance of automatic summarization because they are not pre-trained in the machine learning process. In this study, we demonstrated an improvement in summarization quality through the contextualized embedding of BERT with out-of-vocabulary masking. In addition, by explicitly providing precise pointing and an optional copy instruction along with the BERT embedding, we achieved higher accuracy than the baseline model. The recall-based word-generation metric ROUGE-1 score was 55.11 and the word-order-based ROUGE-L score was 39.65.
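
For readers unfamiliar with the two metrics named in this abstract, the sketch below shows simplified recall variants of ROUGE-1 (unigram overlap) and ROUGE-L (longest common subsequence). It assumes whitespace tokenization with no stemming and a single reference summary; the function names are illustrative, and the paper's exact scoring configuration is not stated in the abstract, so this is not a reproduction of how its scores were computed.

```python
from collections import Counter


def rouge_1_recall(candidate, reference):
    """Share of reference unigrams that also appear in the candidate (clipped counts)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)  # DP row for the previous token of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by reference length, so word order matters."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not ref:
        return 0.0
    return lcs_length(cand, ref) / len(ref)
```

A candidate that contains every reference word in scrambled order still reaches a ROUGE-1 recall of 1.0 but a much lower ROUGE-L recall, which is why the two scores are usually reported together.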

An Automatic Summarization System of Baseball Game Video Using the Caption Information (자막 정보를 이용한 야구경기 비디오의 자동요약 시스템)

  • 유기원;허영식
    • Journal of Broadcast Engineering / v.7 no.2 / pp.107-113 / 2002
  • In this paper, we propose a method and a software system for automatic summarization of baseball game videos. The proposed system aims at fast execution and high summarization accuracy. To satisfy these requirements, important events in baseball video are detected through a DC-based shot boundary detection algorithm and a simple caption recognition method. Furthermore, the proposed system supports a hierarchical description so that users can browse and navigate videos at several levels of summarization.

An Experimental Study on Multi-Document Summarization for Question Answering (질의응답을 위한 복수문서 요약에 관한 실험적 연구)

  • Choi, Sang-Hee;Chung, Young-Mee
    • Journal of the Korean Society for Information Management / v.21 no.3 / pp.289-303 / 2004
  • This experimental study proposes a multi-document summarization method that produces optimal summaries in which users can find answers to their queries. To identify the most effective method for this purpose, the performance of three summarization methods was compared: sentence clustering, passage extraction through spreading activation, and a clustering-passage extraction hybrid. The effectiveness of each summarization method was evaluated by two criteria that measure the accuracy and the redundancy of a summary. The passage extraction method using the sequential bnb search algorithm proved to be the most effective in summarizing multiple documents with regard to summarization precision. This study proposes the passage extraction method as the optimal multi-document summarization method.

Document Summarization Based on Sentence Clustering Using Graph Division (그래프 분할을 이용한 문장 클러스터링 기반 문서요약)

  • Lee Il-Joo;Kim Min-Koo
    • The KIPS Transactions: Part B / v.13B no.2 s.105 / pp.149-154 / 2006
  • The main purpose of document summarization is to reduce the complexity of documents that consist of sub-themes, and to create a summary that includes those sub-themes. This paper proposes a summarization system that extracts salient sentences according to sub-themes by using graph division. A document can be represented as a graph of representative terms chosen through term relativity analysis based on co-occurrence information. This graph is then subdivided to represent sub-themes through connection information. The divided graphs act as sentence clusters whose members are closely related. When salient sentences are extracted from the divided graphs, a summary consisting of the core sentences of the sub-themes can be produced. As a result, the summarization quality is improved.
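
The idea of dividing a term co-occurrence graph into sub-theme clusters can be sketched as below. The abstract does not state the division criterion, so this sketch assumes the simplest one: terms that co-occur in a sentence are linked, and each connected component of the resulting graph is treated as one sub-theme. The function name `cluster_sentences` and the input format (one list of pre-selected representative terms per sentence) are assumptions made for illustration.

```python
from itertools import combinations


def cluster_sentences(sentence_terms):
    """Group sentence indices whose terms lie in the same connected component.

    sentence_terms: list of lists, the representative terms of each sentence.
    """
    parent = {}  # union-find forest over terms

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Add an edge between every pair of terms that co-occur in a sentence.
    for terms in sentence_terms:
        for term in terms:
            find(term)
        for a, b in combinations(terms, 2):
            union(a, b)

    # A sentence joins the cluster of the component that holds its terms.
    clusters = {}
    for index, terms in enumerate(sentence_terms):
        if not terms:
            continue  # sentences without representative terms are left out
        clusters.setdefault(find(terms[0]), []).append(index)
    return list(clusters.values())
```

A summary could then be formed by taking one salient sentence from each returned cluster, so that every sub-theme is represented.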

Viewer's Affective Feedback for Video Summarization

  • Dammak, Majdi;Wali, Ali;Alimi, Adel M.
    • Journal of Information Processing Systems / v.11 no.1 / pp.76-94 / 2015
  • For various reasons, many viewers prefer to watch a summary of a film rather than spend time on the whole of it. Traditionally, films were analyzed manually to produce a summary, but this takes a considerable amount of work time. A tool for automatic video summarization is therefore urgently needed. Automatic video summarization aims at extracting all of the important moments in which viewers might be interested, and the summarization criteria can differ from one video to another. This paper presents how the emotional dimensions obtained from real viewers can be used as an important input for computing which part of a film's total running time is the most interesting. Our results, which are based on lab experiments, are significant and promising.