• Title/Summary/Keyword: Sentential Similarity

Detection of Similar Answers to Avoid Duplicate Question in Retrieval-based Automatic Question Generation (검색 기반의 질문생성에서 중복 방지를 위한 유사 응답 검출)

  • Choi, Yong-Seok; Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.8 no.1 / pp.27-36 / 2019
  • In this paper, we propose a method for finding the answer in a question-answer database that is most similar to the user's response, in order to avoid generating redundant questions in a retrieval-based automatic question generation system. Because the question paired with the answer most similar to the user's response may already be known to the user, that question should be removed from the set of question candidates. A similarity detector calculates the similarity between two answers using shared words, paraphrases, and sentential meaning. Paraphrases are acquired by building a phrase table of the kind used in statistical machine translation. The sentential-meaning similarity of two answers is calculated by an attention-based convolutional neural network. We evaluate the accuracy of the similarity detector on an evaluation set of 100 answers and obtain a Mean Reciprocal Rank (MRR) score of 71%.
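
The Mean Reciprocal Rank reported in this abstract is the average, over all queries, of 1/rank of the first correct item in each ranked list. A minimal sketch of that computation follows; the function name `mean_reciprocal_rank` and the toy data are illustrative assumptions and do not come from the paper.

```python
def mean_reciprocal_rank(ranked_lists, gold_items):
    """Average of 1/rank of the gold item in each ranked list.

    A query whose gold item does not appear in its ranked list
    contributes 0 to the average.
    """
    if not gold_items:
        return 0.0
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_items):
        for rank, item in enumerate(ranked, start=1):
            if item == gold:
                total += 1.0 / rank
                break
    return total / len(gold_items)


# Toy data (hypothetical): three user responses, each with a ranked list
# of candidate answers returned by some similarity detector.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
gold = ["a", "y", "missing"]
score = mean_reciprocal_rank(ranked, gold)
```

Here the gold answers sit at rank 1, rank 2, and nowhere in the list, so the three reciprocal ranks are 1, 1/2, and 0.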

Using Collective Citing Sentences to Recognize Cited Text in Computational Linguistics Articles

  • Kang, In-Su
    • Journal of the Korea Society of Computer and Information / v.21 no.11 / pp.85-91 / 2016
  • This paper proposes a collective approach to cited text recognition that exploits a set of citing texts from different articles citing the same article. First, the proposed method uses the group of citing texts to gather highly ranked sentences from the cited article, creating collective information about probable cited sentences. Then, this collective information is used to determine the final cited sentences among the highly ranked sentences produced by similarity-based cited text recognition. Experiments were conducted on a data set consisting of research articles from the computational linguistics domain. Evaluation results showed that the proposed method could improve the performance of similarity-based baseline approaches.
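
The two-step idea in this abstract can be sketched roughly as follows. This is not the author's implementation: the Jaccard token overlap used as the similarity measure, the vote counting used as the "collective information", and all function names are assumptions chosen only to illustrate the shape of the approach.

```python
from collections import Counter


def jaccard(a, b):
    """Token-overlap similarity (an assumed stand-in for the paper's measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def top_k(citing, cited_sentences, k):
    """Indices of the k cited-article sentences most similar to one citing text."""
    order = sorted(range(len(cited_sentences)),
                   key=lambda i: jaccard(citing, cited_sentences[i]),
                   reverse=True)
    return order[:k]


def collective_votes(citing_group, cited_sentences, k=2):
    """Step 1: pool the top-k sentences of every citing text into vote counts."""
    votes = Counter()
    for citing in citing_group:
        votes.update(top_k(citing, cited_sentences, k))
    return votes


def recognize(citing, citing_group, cited_sentences, k=2):
    """Step 2: among this citing text's own top-k, pick the most-voted sentence."""
    votes = collective_votes(citing_group, cited_sentences, k)
    candidates = top_k(citing, cited_sentences, k)
    return max(candidates, key=lambda i: votes[i])


# Hypothetical toy data: one cited article and three texts that cite it.
cited_sentences = [
    "we train a parser on treebank data",
    "the parser uses beam search",
    "results improve on the test set",
]
citing_group = [
    "they train a parser on treebank data",
    "their parser is trained on treebank data",
    "a treebank parser",
]
best = recognize(citing_group[2], citing_group, cited_sentences)
```

In this sketch the pooled votes only re-rank candidates that the similarity baseline already placed in the top k, which mirrors the abstract's description of choosing final cited sentences among highly ranked ones.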