http://dx.doi.org/10.4218/etrij.2018-0467

VS3-NET: Neural variational inference model for machine-reading comprehension  

Park, Cheoneum (Computer Science Department, Kangwon National University)
Lee, Changki (Computer Science Department, Kangwon National University)
Song, Heejun (Artificial Intelligence Center, Samsung Electronics Co., Samsung Research)
Publication Information
ETRI Journal, v.41, no.6, 2019, pp. 771-781
Abstract
We propose VS3-NET, a model for machine-reading comprehension, the question-answering task of finding an appropriate answer within a given context. VS3-NET learns a latent variable for each question through variational inference, built on simple recurrent unit (SRU)-based sentence encoders and self-matching networks. The types of questions vary, and the answer depends on the type of question. To perform efficient inference and learning, we introduce neural question-type models that approximate the prior and posterior distributions of the latent variables, and we use these approximated distributions to optimize a reparameterized variational lower bound. The context given in machine-reading comprehension usually comprises several sentences, and performance tends to degrade as the context becomes longer; we therefore model the context with a hierarchical structure based on sentence encoding. Experimental results show that the proposed VS3-NET model achieves an exact-match score of 76.8% and an F1 score of 84.5% on the SQuAD test set.
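The reparameterized variational lower bound mentioned in the abstract can be sketched in a few lines of code. The PyTorch snippet below is a minimal, hypothetical illustration of a prior network conditioned on the question and a posterior network conditioned on the question and context, combined via the reparameterization trick; the class name, the linear Gaussian parameterization, and all dimensions are our assumptions for illustration, not the authors' published implementation.

```python
import torch
import torch.nn as nn

class LatentQuestionType(nn.Module):
    """Hypothetical sketch of a latent question-type variable:
    a prior network p(z | question) and a posterior network
    q(z | question, context), trained with the reparameterization trick."""

    def __init__(self, hidden_dim: int, latent_dim: int):
        super().__init__()
        self.prior = nn.Linear(hidden_dim, 2 * latent_dim)          # question encoding only
        self.posterior = nn.Linear(2 * hidden_dim, 2 * latent_dim)  # question + context encodings

    def forward(self, q_enc, c_enc):
        # Diagonal-Gaussian parameters for the prior and the posterior.
        p_mu, p_logvar = self.prior(q_enc).chunk(2, dim=-1)
        q_mu, q_logvar = self.posterior(torch.cat([q_enc, c_enc], dim=-1)).chunk(2, dim=-1)

        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I),
        # which keeps the sampling step differentiable.
        eps = torch.randn_like(q_mu)
        z = q_mu + (0.5 * q_logvar).exp() * eps

        # KL(q || p) between two diagonal Gaussians: the regularizer
        # in the variational lower bound.
        kl = 0.5 * (p_logvar - q_logvar
                    + (q_logvar.exp() + (q_mu - p_mu) ** 2) / p_logvar.exp()
                    - 1.0).sum(dim=-1)
        return z, kl

# Illustrative usage with random stand-ins for question/context encodings.
q_enc, c_enc = torch.randn(4, 128), torch.randn(4, 128)
z, kl = LatentQuestionType(hidden_dim=128, latent_dim=32)(q_enc, c_enc)
```

In a setup of this kind, the sampled z would feed the answer-span predictor and kl.mean() would be added to the span loss, giving the negative variational lower bound; at test time, z can instead be drawn from the prior network, which conditions only on the question.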
Keywords
machine reading comprehension; question answering; SQuAD; variational inference; VS3-NET;
References
1 P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang, SQuAD: 100,000+ questions for machine comprehension of text, 2016, arXiv preprint arXiv:1606.05250.
2 D. Chen, A. Fisch, J. Weston, and A. Bordes, Reading Wikipedia to answer open-domain questions, 2017, arXiv preprint arXiv:1704.00051.
3 D. Weissenborn, G. Wiese, and L. Seiffe, Making neural QA as simple as possible but not simpler, in Proc. Conf. Comput. Natural Lang. Learn. (CoNLL 2017), Vancouver, Canada, Aug. 2017, pp. 1-12.
4 W. Wang et al., Gated self-matching networks for reading comprehension and question answering, in Proc. Ann. Mtg. Assoc. Comput. Ling., Vancouver, Canada, July 2017, pp. 189-198.
5 M. Seo et al., Bidirectional attention flow for machine comprehension, 2016, arXiv preprint arXiv:1611.01603.
6 O. Vinyals, M. Fortunato, and N. Jaitly, Pointer networks, in Ann. Conf. Neural Inform. Process. Syst., Montreal, Canada, Dec. 7-12, 2015, pp. 2692-2700.
7 D. Bahdanau, K. Cho, and Y. Bengio, Neural machine translation by jointly learning to align and translate, in Proc. ICLR, San Diego, CA, USA, May 2015, arXiv preprint arXiv:1409.0473.
8 Y. Miao, L. Yu, and P. Blunsom, Neural variational inference for text processing, 2016, arXiv preprint arXiv:1511.06038.
9 B. Zhang et al., Variational neural discourse relation recognizer, 2016, arXiv preprint arXiv:1603.03876.
10 D. P. Kingma and M. Welling, Auto-encoding variational Bayes, 2013, arXiv preprint arXiv:1312.6114.
11 D. J. Rezende, S. Mohamed, and D. Wierstra, Stochastic backpropagation and approximate inference in deep generative models, in Proc. Int. Conf. Mach. Learning, Beijing, China, June 21-26, 2014, pp. 1278-1286.
12 T. Lei, Y. Zhang, and Y. Artzi, Training RNNs as fast as CNNs, 2017, arXiv preprint arXiv:1709.02755.
13 S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computat. 9 (1997), 1735-1780.
14 D. Ferrucci et al., Watson: Beyond jeopardy!, Artif. Intel. 199-200 (2013), 93-105.
15 Y. Yang, W.-T. Yih, and C. Meek, WikiQA: A challenge dataset for open-domain question answering, in Proc. Conf. Empir. Methods Natural Lang. Process., Lisbon, Portugal, Sept. 2015, pp. 2013-2018.
16 K. M. Hermann et al., Teaching machines to read and comprehend, in Proc. Int. Conf. Neural Inform. Process. Syst., Montreal, Canada, Dec. 7-12, 2015, pp. 1693-1701.
17 M. E. Peters et al., Deep contextualized word representations, 2018, arXiv preprint arXiv:1802.05365.
18 H. Lee, H. Kim, and Y. Lee, GF-Net: High-performance machine reading comprehension through feature selection, in Proc. KCC, 2018, pp. 598-600.
19 A. W. Yu et al., QANet: Combining local convolution with global self-attention for reading comprehension, 2018, arXiv preprint arXiv:1804.09541.
20 J. Chung et al., Recurrent latent variable model for sequential data, in Proc. Int. Conf. Neural Inform. Process. Syst., Montreal, Canada, Dec. 7-12, 2015, pp. 2980-2988.
21 B. Zhang et al., Variational neural machine translation, 2016, arXiv preprint arXiv:1605.07869.
22 C. Clark and M. Gardner, Simple and effective multi-paragraph reading comprehension, 2017, arXiv preprint arXiv:1710.10723.
23 K. Cho et al., Learning phrase representations using RNN encoder-decoder for statistical machine translation, 2014, arXiv preprint arXiv:1406.1078.
24 J. Pennington, R. Socher, and C. Manning, GloVe: Global vectors for word representation, in Proc. Conf. Empirical Methods Nat. Lang. Process., Doha, Qatar, 2014, pp. 1532-1543.
25 S. Wang and J. Jiang, Machine comprehension using match-LSTM and answer pointer, 2016, arXiv preprint arXiv:1608.07905.
26 Y. Kim, Convolutional neural networks for sentence classification, in Proc. Conf. Empirical Methods Nat. Lang. Process., Doha, Qatar, 2014, pp. 1746-1751.
27 C. Park and C. Lee, Coreference resolution using hierarchical pointer networks, KIISE Trans. Comput. Practices 23 (2017), 542-549.
28 D. Kingma and J. Ba, ADAM: A method for stochastic optimization, 2015, arXiv preprint arXiv:1412.6980.
29 R. Liu et al., Structural embedding of syntactic trees for machine comprehension, 2017, arXiv preprint arXiv:1703.00572.
30 Y. Shen et al., ReasoNet: Learning to stop reading in machine comprehension, 2017, arXiv preprint arXiv:1609.05284.
31 J. Zhang et al., Exploring question understanding and adaptation in neural-network-based question answering, 2017, arXiv preprint arXiv:1703.04617.
32 H.-Y. Huang et al., Fusionnet: Fusing via fully-aware attention with application to machine comprehension, 2017, arXiv preprint arXiv:1711.07341.
33 Z. Chen et al., Smarnet: Teaching machines to read and comprehend like human, 2017, arXiv preprint arXiv:1710.02772.
34 M. Hu, Y. Peng, and X. Qiu, Reinforced mnemonic reader for machine comprehension, 2017, arXiv preprint arXiv:1705.02798.
35 R. Liu et al., Phase conductor on multi-layered attentions for machine comprehension, 2017, arXiv preprint arXiv:1710.10504.
36 C. Park et al., S2-Net: Korean machine reading comprehension with SRU-based self-matching network, in Proc. KIISE for HCLT, 2017, pp. 35-40.
37 L. van der Maaten and G. Hinton, Visualizing data using t-SNE, J. Machine Learn. Res. 9 (2008), 2579-2605.