http://dx.doi.org/10.13088/jiis.2019.25.3.161

Query-based Answer Extraction using Korean Dependency Parsing  

Lee, Dokyoung (Department of Industrial Engineering, Yonsei University)
Kim, Mintae (Department of Industrial Engineering, Yonsei University)
Kim, Wooju (Department of Industrial Engineering, Yonsei University)
Publication Information
Journal of Intelligence and Information Systems / v.25, no.3, 2019, pp. 161-177
Abstract
In this paper, we study how to improve answer extraction in a Question-Answering (QA) system by using the results of sentence dependency parsing. A QA system consists of query analysis, which analyzes the user's query, and answer extraction, which extracts appropriate answers from a document, and various studies have been conducted on both. To improve the performance of answer extraction, the grammatical information of sentences must be reflected accurately. Because Korean has free word order and frequently omits sentence components, dependency parsing is well suited to analyzing Korean syntax. In this study, we therefore improved answer extraction performance by adding features generated from dependency parsing to the inputs of the answer extraction model (Bidirectional LSTM-CRF). We compared the performance of the model when given only basic word features generated without dependency parsing against its performance when the Eojeol tag feature and the dependency graph embedding feature were added. Since dependency parsing operates on the Eojeol, a sentence component delimited by spaces, as its basic unit, the tag of each Eojeol is obtained as a result of the parsing. The Eojeol tag feature is this tag information. The dependency graph embedding is generated in two steps: building a dependency graph from the dependency parsing result, and learning an embedding of that graph.
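As an illustration of how the three feature types might be combined into model inputs, the following is a minimal sketch only. The abstract does not state how the features are joined, so the concatenation, the one-hot tag encoding, the repetition of one sentence-level graph embedding for every Eojeol, and all names (`one_hot`, `build_inputs`, the tag labels) are assumptions made for this example.

```python
def one_hot(tag, tag_vocab):
    """Encode an Eojeol tag as a one-hot vector over tag_vocab (assumed encoding)."""
    return [1.0 if tag == t else 0.0 for t in tag_vocab]


def build_inputs(word_vecs, tags, tag_vocab, graph_vec):
    """Return one feature vector per Eojeol.

    Each vector is: word features + one-hot Eojeol tag + graph embedding.
    Assumption: the same sentence-level graph embedding is appended to
    every Eojeol; the source does not specify this.
    """
    return [w + one_hot(t, tag_vocab) + graph_vec
            for w, t in zip(word_vecs, tags)]


# Hypothetical two-Eojeol sentence with 2-dim word features,
# a 3-tag vocabulary, and a 3-dim graph embedding.
tag_vocab = ["NP_SBJ", "NP_OBJ", "VP"]
word_vecs = [[0.1, 0.2], [0.3, 0.4]]
tags = ["NP_SBJ", "VP"]
graph_vec = [0.5, 0.6, 0.7]

inputs = build_inputs(word_vecs, tags, tag_vocab, graph_vec)
print(len(inputs), len(inputs[0]))
```

The resulting per-Eojeol vectors would then be fed to the sequence tagger; dropping `graph_vec` or the one-hot part gives the reduced-feature settings the comparison refers to.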
From the dependency parsing result, a graph is generated with each Eojeol as a node, each dependency between Eojeol as an edge, and each Eojeol tag as a node label. The graph is either undirected or directed, depending on whether the direction of the dependency relation is considered. To obtain the embedding of the graph, we used Graph2Vec, which learns the embedding of a graph from the subgraphs that constitute it. The maximum path length between nodes can be specified when extracting the subgraphs. If the maximum path length is 1, the graph embedding reflects only direct dependencies between Eojeol; as the maximum path length grows, the embedding also includes indirect dependencies. In the experiments, we varied the maximum path length from 1 to 3, both with and without dependency direction, and measured answer extraction performance. The results show that both the Eojeol tag feature and the dependency graph embedding feature improve the performance of answer extraction. In particular, the highest answer extraction performance was obtained when the model input was the embedding of the directed dependency graph with a maximum path length of 1 in the Graph2Vec subgraph extraction process. From these experiments, we conclude that it is better to take the direction of dependency into account and to consider only direct dependencies rather than indirect dependencies between words. The significance of this study is as follows. First, we improved the performance of answer extraction by adding features derived from dependency parsing results, taking into account the characteristics of Korean: free word order and frequent omission of sentence components.
Second, we generated features from the dependency parsing result with a learning-based graph embedding method, without manually defining patterns of dependency between Eojeol. Future research directions are as follows. In this study, the features generated from dependency parsing were applied only to the answer extraction model in order to examine their usefulness. In the future, applying the features to other natural language processing models, such as sentiment analysis or named entity recognition, and confirming the resulting performance would verify the validity of the features more accurately.
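The graph construction and subgraph extraction described in the abstract can be sketched as follows. This is a hedged illustration, not the authors' implementation: the parse format `(index, tag, head_index)`, the tag labels, the head-to-dependent edge direction, and the function names are all assumptions. The relabeling loop follows the Weisfeiler-Lehman-style rooted subgraphs that Graph2Vec uses as its "vocabulary", and `max_depth` is assumed to correspond to the abstract's maximum path length between nodes. The embedding-learning step itself (Graph2Vec treats each graph as a document of such subgraphs) is omitted.

```python
def build_graph(parse, directed=True):
    """Build a labeled dependency graph from a parse.

    parse: list of (index, tag, head_index); head_index < 0 marks the root.
    Nodes are Eojeol, node labels are Eojeol tags, edges are dependencies.
    Assumption: directed edges point from head to dependent.
    """
    labels = {idx: tag for idx, tag, _ in parse}
    adj = {idx: set() for idx, _, _ in parse}
    for idx, _, head in parse:
        if head < 0:
            continue  # the root has no head
        adj[head].add(idx)
        if not directed:
            adj[idx].add(head)  # undirected: add the reverse edge too
    return labels, adj


def rooted_subgraphs(labels, adj, max_depth):
    """Collect rooted-subgraph labels for depths 0..max_depth.

    Depth 0 is the node's own tag; each further depth appends the sorted
    labels of its neighbors from the previous depth, so depth 1 covers only
    direct dependencies and larger depths pull in indirect ones.
    """
    current = dict(labels)
    features = list(current.values())
    for _ in range(max_depth):
        nxt = {}
        for node, label in current.items():
            neigh = sorted(current[n] for n in adj[node])
            nxt[node] = label + "(" + ",".join(neigh) + ")"
        current = nxt
        features.extend(current.values())
    return features


# Hypothetical three-Eojeol sentence: subject and object both depend on the verb.
parse = [(0, "NP_SBJ", 2), (1, "NP_OBJ", 2), (2, "VP", -1)]

for directed in (True, False):
    labels, adj = build_graph(parse, directed=directed)
    print(directed, rooted_subgraphs(labels, adj, max_depth=1))
```

In the directed variant only the head sees its dependents, while in the undirected variant each dependent also sees its head, which is the distinction between the two graph types compared in the experiments.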
Keywords
Question Answering System; Answer Extraction; Dependency Parsing; Graph Embedding; Bi-directional LSTM-CRF;