Korean TableQA: Structured data question answering based on span prediction style with S3-NET

  • Received : 2019.04.09
  • Accepted : 2020.03.09
  • Published : 2020.12.14

Abstract

Tabular data are accurate and information-rich, which makes them well suited to information extraction and question answering (QA). TableQA solves such problems by understanding the structure of a table and searching it for the answer to a question. In this paper, we introduce novice and intermediate Korean TableQA tasks, in which the answer to a question must be deduced from structured tabular data, and we build question-answer pairs for them. To solve the Korean TableQA tasks, we use S3-NET, which has shown good performance in machine reading comprehension (MRC), and we propose a method for converting structured tabular data into a record format suitable for MRC. Our experimental results show that the proposed method outperforms a baseline on both the novice task (exact match (EM) 96.48%, F1 97.06%) and the intermediate task (EM 99.30%, F1 99.55%).
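The key technical step described above is converting a table into record-style text that a span-prediction MRC model can read as a passage. The abstract does not spell out the exact record format, so the following Python sketch is only a hypothetical illustration of the general idea: each row is flattened into column-value pairs, and the rows are concatenated into a passage over which a model such as S3-NET could predict an answer span. The function name and delimiter scheme are assumptions, not the authors' specification.

```python
# Hedged sketch: flattening a table into a "record"-style passage for
# span-prediction MRC. The delimiters below are illustrative choices,
# not the format used in the paper.

def table_to_records(header, rows):
    """Pair every column name with its cell value per row, then join
    the rows into one passage suitable for span prediction."""
    records = []
    for row in rows:
        pairs = [f"{col} : {cell}" for col, cell in zip(header, row)]
        records.append(" | ".join(pairs))
    return "\n".join(records)

if __name__ == "__main__":
    header = ["Country", "Capital", "Population"]
    rows = [
        ["France", "Paris", "67 million"],
        ["Japan", "Tokyo", "125 million"],
    ]
    print(table_to_records(header, rows))
    # For a question like "What is the capital of Japan?", an MRC
    # model would predict the span "Tokyo" inside this passage.
```

Framing the table as a flat passage lets an off-the-shelf extractive MRC model answer table questions without any table-specific architecture, which is the appeal of the span-prediction approach summarized in the abstract.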

Acknowledgement

This research was supported by LG CNS under the Question Answering project for formatted documents and by Korea Electric Power Corporation (Grant number: R18XA05).
