• Title/Summary/Keyword: 자동발췌문 (automatic extracts)

A Study on the Construction of the Automatic Extracts and Summaries - On the Basis of Scientific Journal Articles - (자동 발췌문/요약 시스템 구축에 관한 연구 - 학술지 논문기사를 중심으로 -)

  • Lee, Tae-Young
    • Journal of the Korean Society for Library and Information Science / v.39 no.3 / pp.139-163 / 2005
  • Various corpus-based approaches, the rhetorical roles of discourse structure, and the unification of similar sentences were applied to construct automatic Ext/Sums (extracts and summaries). Rhetorical roles of sentences, such as objective, method, background, result, and conclusion, were established to make flexible Ext/Sums, and an extraction engine was prepared for each role. A success rate of 90% was achieved in extracting the important sentences of the sample articles. To rearrange the selected sentences, the system unified similar sentences using the cosine coefficient, deleted unnecessary modifying and inserted clauses, joined short sentences, and connected sentences that could be linked. The study suggests that methods applying the rhetorical roles of sentences, the meaning and signature of nouns and verbs in clauses, and cue words and location should be researched further to construct more effective Ext/Sums.
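
The unification of similar sentences by cosine coefficient mentioned in this abstract can be illustrated with a minimal sketch. It is not the author's system: the tokenizer, the raw term-frequency weighting, the function names, and the 0.7 threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence: str) -> list[str]:
    """Lowercase a sentence and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_coefficient(a: str, b: str) -> float:
    """Cosine of the angle between the term-frequency vectors of two sentences."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def unify_similar(sentences: list[str], threshold: float = 0.7) -> list[str]:
    """Keep a sentence only if it is below the threshold against every kept one.

    The threshold value is hypothetical; the abstract does not state one.
    """
    kept: list[str] = []
    for s in sentences:
        if all(cosine_coefficient(s, k) < threshold for k in kept):
            kept.append(s)
    return kept
```

Calling `unify_similar` on a list of candidate sentences returns the list with near-duplicates dropped, keeping the first occurrence of each group. A real system would more likely merge the duplicates than discard them.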

A Study on Automatically Constructing a Critical Abstracts of the Articles in Scholar Journals (학술잡지기사 초록의 비평문장 자동작성에 대한 연구)

  • Lee, Tae-Young
    • Journal of the Korean Society for Information Management / v.25 no.1 / pp.19-41 / 2008
  • Cue words and phrases of critical sentences, paradigms for recognizing critical information between sentences, and rules for extracting sentences that contain critical information and for producing critical sentences were created to construct critical abstracts of scholarly journal articles in the web environment. An ontology was designed to support this work by managing and operating the cue words and phrases in documents and the indicators ("symptoms") related to Purpose, Method, Result, and Conclusion sentences. The performance test results indicated that the extraction and production rules need to be improved and the ontology's relationships reinforced.
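
As a rough illustration of how cue words can flag Purpose, Method, Result, and Conclusion sentences, the sketch below scores a sentence against a small cue list per role. The cue lists, the `CUE_WORDS` table, and the `tag_role` function are hypothetical; the ontology and rules described in the abstract are not reproduced here.

```python
from typing import Optional

# Hypothetical cue lists; the paper's actual cue words are not given in the abstract.
CUE_WORDS: dict[str, tuple[str, ...]] = {
    "purpose": ("aim", "purpose", "objective"),
    "method": ("method", "approach", "applied"),
    "result": ("result", "showed", "achieved"),
    "conclusion": ("conclude", "suggest", "future"),
}


def tag_role(sentence: str) -> Optional[str]:
    """Return the role whose cues occur most often in the sentence, or None.

    Matching is by plain substring, which is crude: "aim" also matches "claim".
    Ties go to the role listed first in CUE_WORDS.
    """
    text = sentence.lower()
    scores = {
        role: sum(cue in text for cue in cues)
        for role, cues in CUE_WORDS.items()
    }
    best = max(scores, key=lambda role: scores[role])
    return best if scores[best] > 0 else None
```

A rule-based extractor along these lines would pass each sentence through `tag_role` and route it to the extraction rule for that role. Sentences with no cue hit return `None` and would need another signal, such as position in the article.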

A Study on Speaker Adaptation of HMM in a Continuous Speech Recognition System (HMM을 이용한 연속음성인식 시스템의 화자적응화에 관한 연구)

  • 김상범
    • Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.100-104 / 1995
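  • In general, speaker adaptation is a technique that takes an already trained speaker-independent model as the baseline and performs additional training with a small amount of adaptation speech so that performance approaches that of a speaker-dependent model; it is very important in continuous speech recognition. Speaker adaptation based on maximum likelihood (ML) estimation prepares many training patterns for each category and applies them in a batch during training to estimate and update the model parameters, so all of the data must be supplied whenever speaker data is added. This study examines a speaker adaptation method that automatically extracts syllable units from sentence utterance data and can adapt each time additional speaker data is given. The method extracts syllable units automatically, without cutting up the sentence utterances, and adapts with maximum a posteriori (MAP) estimation for each piece of additional data, which makes adaptation possible even with a small amount of data. The speech data consist of 10 continuous-speech sentences excerpted from newspaper editorials; data from 6 speakers were used for HMM training, and data from the remaining 3 speakers were used for adaptation and evaluation. The 6 speakers were used to train a DDCHMM, and the remaining 3 speakers were adapted with the MAP method. As a result, the recognition rate improved by about 32% compared with before adaptation.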
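
The contrast drawn here between batch ML re-estimation and MAP adaptation that can be applied each time new data arrives can be sketched with the standard MAP update for a Gaussian mean, mean = (tau * prior_mean + sum of frames) / (tau + n). This is a simplified textbook form, not the paper's implementation: it assumes a scalar feature, a fixed variance, and frames already assigned to one state, and the `MapMean` class and its prior weight `tau` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class MapMean:
    """Running MAP estimate of one Gaussian mean (scalar, variance held fixed)."""

    mean: float    # starts at the speaker-independent model's mean
    weight: float  # tau: how many observations the prior is worth (assumed value)

    def adapt(self, frames: list[float]) -> float:
        """Fold a new batch of adaptation frames into the estimate.

        The posterior after one batch becomes the prior for the next, so
        earlier data never has to be supplied again.
        """
        total = self.weight * self.mean + sum(frames)
        self.weight += len(frames)
        self.mean = total / self.weight
        return self.mean
```

Because the prior weight grows with every batch, adapting on `[1.0]` and then on `[2.0, 3.0]` gives the same mean as adapting once on all three frames. That property is what lets adaptation proceed utterance by utterance.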

A Study on the Construction of the Automatic Summaries - on the basis of Straight News in the Web - (자동요약시스템 구축에 대한 연구 - 웹 상의 보도기사를 중심으로 -)

  • Lee, Tae-Young
    • Journal of the Korean Society for Information Management / v.23 no.4 s.62 / pp.41-67 / 2006
  • A writing frame and various rules based on discourse structure and knowledge-based methods were applied to construct an automatic Ext/Sums (extracts and summaries) system for straight news on the web. The frame contains the slots and facets represented by the roles of paragraphs, sentences, and clauses in news, together with the rules that determine the type of slot. Rearrangement of the candidate summary sentences, such as unification, separation, and synthesis, was carried out while maintaining coherence of meaning. It used rules derived from similarity measurement, syntactic information, discourse structure, and knowledge-based methods, along with context plots defined by the syntactic/semantic signature of nouns and verbs and the category of verb suffixes. An attempt was also made to insert critical sentences into the summary.

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

  • Ko, Eunjung;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.125-148
    • /
    • 2018
  • Recently, as the demand for big data analysis increases, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. Text also attracts many analysts because the amount of data is very large and it is relatively easy to collect compared with other unstructured and structured data. Among the various text analysis applications, the following have been actively studied: document classification, which classifies documents into predetermined categories; topic modeling, which extracts major topics from a large number of documents; sentiment analysis or opinion mining, which identifies emotions or opinions contained in texts; and text summarization, which summarizes the main contents of one or several documents. In particular, text summarization is actively applied in business through news summary services, privacy policy summary services, etc. In academia, much research has followed either the extraction approach, which selectively provides the main elements of the document, or the abstraction approach, which extracts elements of the document and composes new sentences by combining them. However, techniques for evaluating the quality of automatically summarized documents have not progressed as much as techniques for automatic summarization itself. Most existing studies on summary quality evaluation manually summarized documents, used those summaries as reference documents, and measured the similarity between the automatic summary and the reference document. Specifically, an automatic summary is produced from the full text through various techniques, and its quality is measured by comparison with the reference document, which serves as the ideal summary.
Reference documents are provided in two major ways. The most common way is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in preparing the summary, writing it takes a lot of time and cost, and the evaluation result may differ depending on the subjectivity of the summarizer. Therefore, to overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. As a representative attempt, a method has recently been devised that reduces the size of the full text and measures the similarity between the reduced full text and the automatic summary. In this method, the more the frequent terms of the full text appear in the summary, the better the summary is judged to be. However, since summarization essentially means condensing a large amount of content while minimizing omissions, it is unreasonable to say that a "good summary" based only on frequency is always a "good summary" in its essential meaning. To overcome the limitations of these previous studies on summarization evaluation, this study proposes an automatic quality evaluation method for text summarization based on the essential meaning of summarization. Specifically, succinctness is defined as an element indicating how little content is duplicated among the sentences of the summary, and completeness is defined as an element indicating how little of the content is left out of the summary. The proposed evaluation method is built on these two concepts.
To evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor's hotel reviews, the reviews were summarized for each hotel, and the quality of the summaries was evaluated according to the proposed methodology; the results of these experiments are presented. The study also provides a way to integrate completeness and succinctness, which are in a trade-off relationship, into an F-score, and proposes a method for performing optimal summarization by changing the sentence-similarity threshold.
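
One way to read the completeness, succinctness, and F-score idea is sketched below. The abstract does not give the formulas, so everything here is an assumption: Jaccard overlap of word sets stands in for the sentence similarity, completeness is taken as the share of source sentences covered by some summary sentence, succinctness as the share of summary sentences that do not duplicate an earlier one, and the F-score as their harmonic mean.

```python
import re


def tokens(sentence: str) -> set[str]:
    """Set of lowercase word tokens in a sentence (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap of two sentences; a stand-in for the paper's similarity."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def completeness(source: list[str], summary: list[str], threshold: float) -> float:
    """Share of source sentences that some summary sentence covers."""
    if not source:
        return 0.0
    covered = sum(
        any(jaccard(s, m) >= threshold for m in summary) for s in source
    )
    return covered / len(source)


def succinctness(summary: list[str], threshold: float) -> float:
    """Share of summary sentences that do not duplicate an earlier one."""
    if not summary:
        return 0.0
    redundant = sum(
        any(jaccard(summary[i], summary[j]) >= threshold for j in range(i))
        for i in range(len(summary))
    )
    return 1.0 - redundant / len(summary)


def f_score(c: float, s: float) -> float:
    """Harmonic mean of completeness and succinctness."""
    return 2 * c * s / (c + s) if c + s else 0.0
```

Under these assumptions, sweeping `threshold` over a grid and keeping the summary with the highest `f_score` mirrors the idea of tuning the sentence-similarity threshold. A longer summary tends to raise completeness and lower succinctness, which is the trade-off the F-score balances.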