• Title/Summary/Keyword: paraphrases

Search Result 10

Automatic Extraction of Paraphrases from a Parallel Bible Corpus (정렬된 성경 코퍼스로부터 바꿔쓰기표현(paraphrase)의 자동 추출)

  • Lee, Kong-Joo;Yun, Bo-Hyun
    • Korean Journal of Cognitive Science / v.17 no.4 / pp.323-336 / 2006
  • In this paper, we present a pilot system that can extract paraphrases from a parallel corpus using the co-training method. Paraphrases are useful for applications that must create varied and fluent text, such as machine translation, question answering, and multi-document summarization. One of the difficulties in extracting paraphrases is finding a rich source from which to extract them. The Bible is a good source for extracting paraphrases because it has several Korean versions in which every sentence can be easily aligned by chapter and verse. Using the co-training method, we can extract not only lexical-level paraphrases but also phrasal-level paraphrases from a parallel corpus consisting of these Bible versions.

  • PDF

Automatic Acquisition of Paraphrases Using Bilingual Dependency Relations

  • Hwang, Young-Sook;Kim, Young-Kil
    • ETRI Journal / v.30 no.1 / pp.155-157 / 2008
  • This letter introduces a new method to automatically acquire paraphrases using bilingual corpora. It utilizes the bilingual dependency relations obtained by projecting a monolingual dependency parse onto the other language's sentence based on statistical alignment techniques. Since the proposed paraphrasing method can clearly disambiguate the sense of the original phrases using the bilingual context of dependency relations, it would be possible to obtain interchangeable paraphrases under a given context. Through experiments with parallel corpora of Korean and English language pairs, we demonstrate that our method effectively extracts paraphrases with high precision, achieving success rates of 94.3% and 84.6%, respectively, for Korean and English.

  • PDF

Pivot Discrimination Approach for Paraphrase Extraction from Bilingual Corpus (이중 언어 기반 패러프레이즈 추출을 위한 피봇 차별화 방법)

  • Park, Esther;Lee, Hyoung-Gyu;Kim, Min-Jeong;Rim, Hae-Chang
    • Korean Journal of Cognitive Science / v.22 no.1 / pp.57-78 / 2011
  • Paraphrasing is the act of rewriting a text in other words without altering its meaning. Paraphrases are used in many areas of natural language processing. In particular, they can be incorporated into machine translation to improve the coverage and quality of translation. Recent approaches to paraphrase extraction use bilingual parallel corpora, which consist of aligned sentence pairs. In these approaches, paraphrases are identified from the word alignment result by means of pivot phrases, which are phrases in one language to which two or more phrases in the other language are aligned. However, word alignment is itself a very difficult task, so alignment errors are common, and these errors can lead to the selection of incorrect pivot phrases. In this study, we propose a paraphrase extraction method that discriminates good pivot phrases from bad ones. Each pivot phrase is weighted according to its reliability, which is scored using lexical and part-of-speech information. The experimental results show that the proposed method achieves higher precision and recall in paraphrase extraction than the baseline. We also show that the extracted paraphrases can increase the coverage of Korean-English machine translation.

  • PDF
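The pivot idea described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used formulation in which a paraphrase score is the sum, over shared pivot phrases, of the two translation probabilities. The function name `pivot_paraphrases`, the `pivot_weight` table, and all probabilities are hypothetical; the abstract says only that reliability is scored from lexical and part-of-speech information and does not give a formula.

```python
from collections import defaultdict

def pivot_paraphrases(src_to_pivot, pivot_to_src, pivot_weight=None):
    """Score paraphrase candidates by summing over shared pivot phrases.

    src_to_pivot[e][f] approximates p(f | e); pivot_to_src[f][e2] approximates p(e2 | f).
    pivot_weight[f] is a hypothetical reliability score in [0, 1]; pivots
    missing from it default to 1.0, which gives the unweighted baseline.
    """
    pivot_weight = pivot_weight or {}
    scores = defaultdict(lambda: defaultdict(float))
    for e1, pivots in src_to_pivot.items():
        for f, p_f_given_e1 in pivots.items():
            w = pivot_weight.get(f, 1.0)
            for e2, p_e2_given_f in pivot_to_src.get(f, {}).items():
                if e2 != e1:  # a phrase is not its own paraphrase
                    scores[e1][e2] += w * p_f_given_e1 * p_e2_given_f
    return {e1: dict(cands) for e1, cands in scores.items()}

# Toy phrase tables with made-up probabilities (not taken from the paper).
src_to_pivot = {"under control": {"unter kontrolle": 0.8, "in": 0.2}}
pivot_to_src = {
    "unter kontrolle": {"under control": 0.6, "in check": 0.4},
    "in": {"in": 0.9, "under control": 0.1},
}
# Assumed reliability scores: the pivot "in" is treated as a likely alignment error.
weights = {"unter kontrolle": 1.0, "in": 0.1}

baseline = pivot_paraphrases(src_to_pivot, pivot_to_src)
weighted = pivot_paraphrases(src_to_pivot, pivot_to_src, weights)
```

In this toy setup, down-weighting the unreliable pivot lowers the score of the spurious candidate "in" while leaving "in check" unchanged, which is the effect that pivot discrimination aims for.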

Detection of Similar Answers to Avoid Duplicate Question in Retrieval-based Automatic Question Generation (검색 기반의 질문생성에서 중복 방지를 위한 유사 응답 검출)

  • Choi, Yong-Seok;Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.8 no.1 / pp.27-36 / 2019
  • In this paper, we propose a method for finding the answer in a question-answer database that is most similar to the user's response, in order to avoid generating redundant questions in a retrieval-based automatic question generation system. Because the question paired with the answer most similar to the user's response may already be known to the user, that question should be removed from the set of question candidates. A similarity detector calculates the similarity between two answers using shared words, paraphrases, and sentential meaning. Paraphrases are acquired by building a phrase table of the kind used in statistical machine translation. The sentential-meaning similarity of two answers is calculated by an attention-based convolutional neural network. We evaluate the accuracy of the similarity detector on an evaluation set of 100 answers and obtain a Mean Reciprocal Rank (MRR) score of 71%.
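For readers unfamiliar with the metric, Mean Reciprocal Rank can be computed as in the sketch below. It uses the standard definition (the average of 1/rank of the correct item, counting 0 when the item is not retrieved). The function name and the toy rankings are illustrative assumptions and are unrelated to the paper's evaluation data.

```python
def mean_reciprocal_rank(ranked_lists, gold_items):
    """Average of 1/rank of the gold item in each ranked list (0 if absent)."""
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_items):
        if gold in ranked:
            total += 1.0 / (ranked.index(gold) + 1)  # ranks are 1-based
    return total / len(gold_items)

# Toy example: the gold answer "a" is ranked 1st, 2nd, and 3rd.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
gold = ["a", "a", "a"]
mrr = mean_reciprocal_rank(rankings, gold)  # (1 + 1/2 + 1/3) / 3
```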

The Influence of English Proficiency and Text Types on Korean College Students' Paraphrasing for Plagiarism Prevention

  • Choe, Yoonhee
    • International Journal of Advanced Culture Technology / v.9 no.1 / pp.183-189 / 2021
  • This study examines the effects of Korean college students' English proficiency and English text types on their paraphrases. Korean college students in three English proficiency groups (high, mid, and low) read two types of English texts, causal and argumentative, and paraphrased them in English. The students' paraphrases were evaluated in terms of content (idea exposition, idea development, and wrap-up), organization (coherence and cohesion), and language use (grammatical accuracy), and analyzed by MANOVA. The results showed a significant difference in paraphrase performance according to the participants' English proficiency levels rather than the types of English texts. The results have educational implications for teaching English paraphrasing to Korean university students to prevent plagiarism.

Analysis of Sentential Paraphrase Patterns and Errors through Predicate-Argument Tuple-based Approximate Alignment (술어-논항 튜플 기반 근사 정렬을 이용한 문장 단위 바꿔쓰기표현 유형 및 오류 분석)

  • Choi, Sung-Pil;Song, Sa-Kwang;Myaeng, Sung-Hyon
    • The KIPS Transactions:PartB / v.19B no.2 / pp.135-148 / 2012
  • This paper proposes a model for recognizing sentential paraphrases through Predicate-Argument Tuple (PAT)-based approximate alignment between two texts. We cast paraphrase recognition as a binary classification problem by defining and applying various alignment features that can effectively express the semantic relatedness between two sentences. Experiments confirmed the potential of our approach, and error analysis revealed various paraphrase patterns that our system does not handle, which can help us devise methods for further performance improvement.

Pivot Weighting Approach to Extract Korean Paraphrases (피봇 가중치 접근을 통한 한국어 패러프레이즈 추출)

  • Park, Esther;Lee, Hyoung-Gyu;Kim, Min-Jeong;Rim, Hae-Chang
    • Annual Conference on Human and Language Technology / 2010.10a / pp.31-36 / 2010
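  • In paraphrase extraction using bilingual parallel corpora, another language is generally used as a pivot language, and the word and phrase alignment process is performed twice. Error propagation from word alignment is therefore a major drawback. In particular, when the structural difference between the two languages is large, as with Korean and English, word alignment errors are more frequent, and the resulting problem of selecting incorrect pivot phrases is more serious. To address this problem, this paper proposes a way of discriminating among pivot phrases during paraphrase extraction by assigning higher weights to correct pivot phrases. Experiments show that when the proposed pivot weighting method is added to an existing paraphrase extraction method, both the precision and the recall of paraphrase extraction improve.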

  • PDF

Modern Methods of Text Analysis as an Effective Way to Combat Plagiarism

  • Myronenko, Serhii;Myronenko, Yelyzaveta
    • International Journal of Computer Science & Network Security / v.22 no.8 / pp.242-248 / 2022
  • The article presents the analysis of modern methods of automatic comparison of original and unoriginal text to detect textual plagiarism. The study covers two types of plagiarism - literal, when plagiarists directly make exact copying of the text without changing anything, and intelligent, using more sophisticated techniques, which are harder to detect due to the text manipulation, like words and signs replacement. Standard techniques related to extrinsic detection are string-based, vector space and semantic-based. The first, most common and most successful target models for detecting literal plagiarism - N-gram and Vector Space are analyzed, and their advantages and disadvantages are evaluated. The most effective target models that allow detecting intelligent plagiarism, particularly identifying paraphrases by measuring the semantic similarity of short components of the text, are investigated. Models using neural network architecture and based on natural language sentence matching approaches such as Densely Interactive Inference Network (DIIN), Bilateral Multi-Perspective Matching (BiMPM) and Bidirectional Encoder Representations from Transformers (BERT) and its family of models are considered. The progress in improving plagiarism detection systems, techniques and related models is summarized. Relevant and urgent problems that remain unresolved in detecting intelligent plagiarism - effective recognition of unoriginal ideas and qualitatively paraphrased text - are outlined.
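As a rough illustration of the string-based N-gram comparison mentioned in the abstract above, the sketch below scores two texts by the Jaccard overlap of their word trigrams. The choice of Jaccard overlap, the function names, and the example sentences are assumptions made for illustration; the article does not specify a particular measure.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams in a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_jaccard(a, b, n=3):
    """Jaccard overlap of the word n-gram sets of two texts (0.0 if both are empty)."""
    ga, gb = word_ngrams(a, n), word_ngrams(b, n)
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

original = "the quick brown fox jumps over the lazy dog"
copied = "the quick brown fox jumps over the lazy dog"
reworded = "a fast brown fox leaps over a sleepy dog"

literal_score = ngram_jaccard(original, copied)       # exact copy
paraphrase_score = ngram_jaccard(original, reworded)  # word substitutions
```

Here the exact copy shares every trigram with the original, while the reworded sentence shares none, which shows why string-based measures alone struggle with the paraphrased ("intelligent") plagiarism that the article discusses.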

On Nominalist Paraphrase (유명론적 번역에 대하여)

  • Joo, Yo-Han
    • Korean Journal of Logic / v.14 no.1 / pp.77-102 / 2011
  • This paper is about the problems that Quine's criterion of ontological commitment creates for Nominalists. Quine's criterion, summarized as "to be is to be the value of a variable", means that when we accept a sentence as true, we are committed to the existence of the things that must exist for the sentence to be true. According to this criterion, Nominalists who consider "Humility is a virtue" to be true should accept the existence of the property humility. However, Nominalists are reluctant to accept that properties such as humility exist, although they wish to accept what is meant by "Humility is a virtue". The way out of this predicament is to present a paraphrase that conveys what Nominalists want to say with the original sentence without ontological commitment to the property. Several attempts have been made to paraphrase such sentences, only to fail. In this paper, successful paraphrases are presented that cope with the previously discussed difficulties. Beforehand, several issues involved in Quine's criterion are clarified. Lewis's critical objection that we should give up the business of paraphrase is also discussed.

  • PDF

A Polyphonic Approach to French Proverbs and the Readings of the Combination 'Opinion Verb + Proverb' (다성적 관점에서 본 프랑스어 속담과 '의견동사+속담' 구문의 해독)

  • 황경자
    • Lingua Humanitatis / v.1 no.1 / pp.275-294 / 2001
  • This article aims to define the nature of proverbs from a polyphonic point of view and examine different readings of the complement involved in the combination of a proverb with a verb of personal opinion. An utterer of a proverb is not himself the author of the proverb. He may well be a 'speaker' of a proverb, but from a polyphonic viewpoint he is not an 'enunciator' of the principle that underlies it. When we say that a speaker of a proverb is not its enunciator, we do not simply mean that he is not the author of the 'content' of the proverb he speaks: we mean that he is not the author of its 'form' either. The fact that a proverb loses its proverbial character when one paraphrases it proves that its form is not at the speaker's disposal. But a single factor cannot be held responsible for what a proverb is. As an indicator of the 'wisdom of the nation,' or vox populi, a proverb is the achievement of the 'collective enunciator.' The polyphony inherent in the proverb pits a particular speaker against a collective enunciator. This collective character of the proverb as a vox populi comes from its character as a phrasal denomination. Given that a proverb reflects a collective judgment and not a personal opinion, how do we interpret the combination of a proverb with a verb of personal opinion such as 'I think that ...'? Such a combination gives rise to readings at distinct levels: two types of metalinguistic reading and a reading based on the content of the proverb. The first level of reading, being applicative in nature, can be local or general, depending on the speaker's opinion as to the applicability of the proverb to a situation, particular or general. These applicative readings always involve polyphonic dissociation between the speaker and the enunciator.
The second level of reading, which depends on the content of the proverb, is the result of the operation of deproverbialization, which makes the proverb lose its denominative status and preserve only its status as a generic phrase. The proverb, thus deproverbialized, looks like the series 'NP + VP.' For this reading, the speaker of the proverb takes into consideration the possibility of attributing a predicate to a nominal syntagm. Here the speaker and the enunciator are identical. It is not the case, however, that one can deproverbialize just any proverb. In approaching a locally typifying generic phrase, a proverb admits of being deproverbialized by an opinion verb only when its form does not render it difficult, either syntactically or metaphorically, to incorporate that proverb into the relevant combination, and when the proverb intrinsically possesses the traits that meet the conditions for the use of the opinion verb at hand. One can also maintain, based on the notion of deproverbialization, that a proverb expresses a collective judgment, a deproverbialized individual judgment.

  • PDF