http://dx.doi.org/10.9708/jksci.2020.25.11.027

The Ability of L2 LSTM Language Models to Learn the Filler-Gap Dependency  

Kim, Euhee (Dept. of Computer Science & Engineering, Shinhan University)
Abstract
In this paper, we investigate the correlation between the amount of English sentence input that Korean learners of English (L2ers) are exposed to and their sentence processing patterns by examining what Long Short-Term Memory (LSTM) language models (LMs) can learn about an implicit syntactic relationship, namely the filler-gap dependency. The filler-gap dependency is the relationship between a (wh-)filler, a wh-phrase such as 'what' or 'who' that appears overtly in clause-peripheral position, and its gap in clause-internal position, an invisible, empty syntactic position that must be filled by the (wh-)filler for proper interpretation. To model L2ers' English learning, we build LSTM LMs that learn a subset of the known restrictions on the filler-gap dependency from the English sentences in an L2 corpus that L2ers can potentially encounter in their English learning. Examining the LSTM LMs' behavior on controlled sentences constructed around the filler-gap dependency, we characterize L2ers' sentence processing using the information-theoretic metric of surprisal, which quantifies violations of the filler-gap dependency, that is, wh-licensing interaction effects. Furthermore, comparing the L2ers' LMs with a native speakers' LM with respect to processing the filler-gap dependency, we show not only that both LMs can track the abstract syntactic structure involved in this dependency, but also, using linear mixed-effects regression models, that there are significant differences between them in processing it.
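To make the two measures named above concrete, the following Python sketch (an assumed PyTorch-style implementation, not the authors' code) shows how per-word surprisal can be read off an LSTM LM and how a wh-licensing interaction can be computed as a difference-in-differences of region-summed surprisals. The model class, function names, hyperparameters, and sign convention are illustrative assumptions.

import math
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    """Toy word-level LSTM LM; the architecture and sizes are illustrative only."""
    def __init__(self, vocab_size: int, emb_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        hidden, _ = self.lstm(self.embed(token_ids))
        return self.out(hidden)  # next-word logits at every position

def surprisals(model: LSTMLanguageModel, token_ids: list[int]) -> list[float]:
    """Surprisal of each word given its left context: -log2 P(w_i | w_1..w_{i-1})."""
    ids = torch.tensor([token_ids])
    with torch.no_grad():
        log_probs = torch.log_softmax(model(ids), dim=-1)
    # The prediction made at position i-1 scores the word observed at position i.
    return [-log_probs[0, i - 1, token_ids[i]].item() / math.log(2)
            for i in range(1, len(token_ids))]

def wh_licensing_interaction(region_surprisal: dict[str, float]) -> float:
    """Difference-in-differences over a critical region for the 2x2 design
    crossing [+/-filler] and [+/-gap]; keys name the four conditions."""
    return ((region_surprisal["-filler,+gap"] - region_surprisal["+filler,+gap"])
            - (region_surprisal["-filler,-gap"] - region_surprisal["+filler,-gap"]))

In use, the surprisals of the words in the critical (gap) region are summed per condition for each test item and passed to wh_licensing_interaction; under this sign convention, a positive interaction means the filler lowers surprisal specifically where the gap occurs, the pattern expected of a model that has learned the dependency.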
Keywords
LSTM language model; English sentence processing; filler-gap dependency; surprisal; linear mixed-effects regression model; wh-licensing interaction effects
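The between-model comparison described in the abstract rests on linear mixed-effects regression. The sketch below shows one plausible way to fit such a model in Python with statsmodels; the data file, column names, and random-effects structure are assumptions for illustration rather than the paper's actual analysis, which could equally be carried out with lme4 in R.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format table: one row per item x condition x language model,
# with columns surprisal, group (L2 vs. native), filler, gap, and item.
data = pd.read_csv("surprisal_by_condition.csv")

model = smf.mixedlm(
    "surprisal ~ group * filler * gap",  # fixed effects and their interactions
    data,
    groups=data["item"],                 # random intercept for each test item
)
result = model.fit()
print(result.summary())  # interactions involving group index L2/native differences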