http://dx.doi.org/10.9708/jksci.2022.27.03.013

Probing Sentence Embeddings in L2 Learners' LSTM Neural Language Models Using Adaptation Learning  

Kim, Euhee (Dept. of Computer Science & Engineering, Shinhan University)
Abstract
In this study, we used a probing method to evaluate how a pre-trained L2 LSTM language model represents sentences containing relative and coordinate clauses. The probing experiment employed models adapted from the pre-trained L2 language models to trace the syntactic properties of sentence embedding vector representations. The probing dataset was generated automatically from templates covering the different sentence structures. To classify the syntactic properties of sentences in each probing task, we measured the adaptation effects of the language models using syntactic priming. We then ran linear mixed-effects model analyses of the adaptation effects to reveal how the L2 language models represent the syntactic features of English sentences. When the L2 language models were compared with the baseline L1 Gulordava language models, analogous results were obtained for each probing task. In addition, the results confirmed that the L2 language models encode the syntactic features of relative and coordinate clauses hierarchically in their sentence embedding representations.
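The adaptation-effect measure described in the abstract (in the spirit of van Schijndel et al. and Prasad et al. in the reference list) can be sketched roughly as the drop in a model's surprisal on a test sentence after the model has been adapted to structurally matched prime sentences. The toy unigram model, the `adapt` interpolation rule, and all word probabilities below are illustrative assumptions for exposition only, not the paper's actual LSTM setup:

```python
import math

def surprisal(lm_probs, sentence):
    """Mean per-word surprisal (bits) of `sentence` under unigram model `lm_probs`."""
    return sum(-math.log2(lm_probs.get(w, 1e-6)) for w in sentence) / len(sentence)

def adapt(lm_probs, primes, lr=0.5):
    """Toy stand-in for fine-tuning: interpolate the model's word probabilities
    toward the empirical distribution of the prime sentences, then renormalize."""
    counts, total = {}, 0
    for s in primes:
        for w in s:
            counts[w] = counts.get(w, 0) + 1
            total += 1
    adapted = {w: (1 - lr) * p + lr * counts.get(w, 0) / total
               for w, p in lm_probs.items()}
    z = sum(adapted.values())
    return {w: p / z for w, p in adapted.items()}

# Hypothetical unigram LM; primes and test share a relative-clause cue ("who").
lm = {"the": 0.3, "boy": 0.2, "who": 0.05, "ran": 0.2, "slept": 0.25}
primes = [["the", "boy", "who", "ran", "slept"]] * 3
test = ["the", "boy", "who", "slept"]

before = surprisal(lm, test)
after = surprisal(adapt(lm, primes), test)
adaptation_effect = before - after  # positive: the model adapted to the structure
```

In the paper's setting, the analogous quantity is computed per sentence with the LSTM language models and then entered into the linear mixed-effects analyses.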
Keywords
LSTM; (L2) language model; probing method; syntactic priming; adaptation effect;
Citations & Related Records
  • Reference
1 J. R. Bellegarda, "An Overview of Statistical Language Model Adaptation," ITRW on Adaptation Methods for Speech Recognition, pp. 29-30, Aug 2001.
2 E. Kim, "Sentence Comprehension with an LSTM Language Model," Journal of Digital Contents Society, Vol. 19(12), pp. 2393-2401, Dec 2018.
3 A. Vaswani et al., "Attention Is All You Need," June 2017, arXiv:1706.03762.
4 T. Linzen et al., "Assessing the ability of LSTMs to learn syntax-sensitive dependencies," pp. 521-535, Nov 2016, arXiv:1611.01368.
5 K. Gulordava et al., "Colorless green recurrent networks dream hierarchically," Mar 2018, arXiv:1803.11138.
6 R. T. McCoy et al., "Revisiting the poverty of the stimulus: Hierarchical generalization without a hierarchical bias in recurrent neural networks," pp. 2093-2098, June 2018, arXiv:1802.09091.
7 M. van Schijndel et al., "A neural model of adaptation in reading," pp. 4704-4710, Oct 2018, DOI: 10.18653/v1/D18-1499.
8 G. Prasad et al., "Using priming to uncover the organization of syntactic representations in neural language models," pp. 66-76, Nov 2019, arXiv:1909.10579.
9 H. P. Branigan et al., "Syntactic priming across highly similar languages is not affected by language proficiency," Oct 2021, DOI: 10.1080/23273798.2021.1994620.
10 S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, Vol. 9(8), pp. 1735-1780, Nov 1997.
11 A. Conneau et al., "What you can cram into a single \$&!#* vector: Probing sentence embeddings for linguistic properties," June 2018, arXiv:1805.01070.
12 E. Kim et al., "L2ers' predictions of syntactic structure and reaction times during sentence processing," Linguistic Research, Vol. 37, pp. 189-218, 2020.
13 Tianyu Gao et al., "Making Pre-trained Language Models Better Few-shot Learners," ACL, pp. 3816-3830, Aug 2021.
14 T. Schick and H. Schutze, "It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners," NAACL, Apr 2021, arXiv:2009.07118.
15 M. van Schijndel et al., "Quantity doesn't buy quality syntax with neural language models," Aug 2019, arXiv:1909.00111.
16 J. Hale et al., "Quantifying Structural and Non-structural Expectations in Relative Clause Processing," Jan 2021, DOI: 10.1111/cogs.12927.
17 J. Hewitt et al., "A Structural Probe for Finding Syntax in Word Representations," pp. 4129-4138, June 2019, DOI: 10.18653/v1/N19-1419.
18 J. B. Wells et al., "Experience and sentence processing: Statistical learning and relative clause comprehension," Cognitive Psychology, Vol. 58, pp. 250-271, Mar 2009.