http://dx.doi.org/10.3745/KTSDE.2019.8.1.27

Detection of Similar Answers to Avoid Duplicate Question in Retrieval-based Automatic Question Generation  

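Choi, Yong-Seok (Department of Electronics, Radio and Information Communications Engineering, Chungnam National University)
Lee, Kong Joo (Department of Radio and Information Communications Engineering, Chungnam National University)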
Publication Information
KIPS Transactions on Software and Data Engineering / v.8, no.1, 2019, pp. 27-36
Abstract
In this paper, we propose a method for finding the answer in a question-answer database that is most similar to the user's response, in order to avoid generating a redundant question in a retrieval-based automatic question generation system. Because the question paired with the answer most similar to the user's response may already be known to the user, that question should be removed from the set of question candidates. A similarity detector calculates the similarity between two answers using shared words, paraphrases, and sentential meaning. Paraphrases are acquired by building a phrase table of the kind used in statistical machine translation. The similarity in sentential meaning between two answers is calculated by an attention-based convolutional neural network. We evaluate the accuracy of the similarity detector on an evaluation set of 100 answers and obtain a Mean Reciprocal Rank (MRR) score of 71%.
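For readers unfamiliar with the evaluation metric, MRR averages the reciprocal of the rank at which the correct item first appears in each ranked list. The following is a minimal sketch in Python, assuming each evaluation answer has exactly one gold similar answer in the database. The function names and the toy answer identifiers are illustrative and are not taken from the paper.

```python
from typing import Hashable, Sequence


def reciprocal_rank(ranked: Sequence[Hashable], gold: Hashable) -> float:
    """Return 1/rank of the gold item in a ranked list, or 0.0 if it is absent."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == gold:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    golds: Sequence[Hashable],
) -> float:
    """Average the reciprocal ranks over all evaluation items."""
    if len(rankings) != len(golds):
        raise ValueError("rankings and golds must have the same length")
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds))
    return total / len(rankings)


# Toy data (hypothetical): three user responses, each with a ranked list of
# candidate answer IDs returned by some similarity detector.
rankings = [
    ["a3", "a1", "a7"],  # gold "a1" is at rank 2 -> 1/2
    ["a2", "a5"],        # gold "a2" is at rank 1 -> 1
    ["a9", "a4"],        # gold "a8" is missing   -> 0
]
golds = ["a1", "a2", "a8"]

mrr = mean_reciprocal_rank(rankings, golds)
print(f"MRR = {mrr:.2f}")  # MRR = 0.50
```

Under this definition, an MRR of 71% on 100 answers means the reciprocal ranks of the correct answers average 0.71, which by itself does not say how many answers were ranked first.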
Keywords
Question-Answer Database; Automatic Question Generation; Similar Answer Detection;