Recent R&D Trends for Pretrained Language Model

  • Published: 2020.06.01

Abstract

Recently, the approach of pretraining a deep learning language model on a large corpus and then fine-tuning it for each application task has come into wide use in language processing. Pretrained language models achieve higher accuracy and better generalization than earlier methods. This paper surveys the major research trends in deep learning pretrained language models for language processing. We describe in detail the motivation, model architecture, training method, and results of BERT, which has had a significant influence on subsequent studies. We then review the language models proposed after BERT, focusing on SpanBERT, RoBERTa, ALBERT, BART, and ELECTRA. Finally, we introduce KorBERT, a pretrained language model that performs well on Korean, and describe techniques for applying pretrained language models to Korean, an agglutinative language in which words are formed by combining content and functional morphemes, unlike English, an inflectional language whose word endings change with usage.
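To make the pretrain-then-fine-tune workflow described above concrete, the following is a minimal illustrative sketch using the Hugging Face transformers library. It is not the authors' implementation: the model identifier bert-base-multilingual-cased is an assumed stand-in for a Korean-specific model such as KorBERT (which is distributed separately through ETRI AIOpen, reference [17]), and the two labeled sentences are toy data standing in for a real downstream dataset.

```python
# Illustrative sketch (assumptions noted above): load a pretrained BERT-style
# encoder and fine-tune it with a classification head on a downstream task.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder model; KorBERT itself is obtained from ETRI AIOpen, not from this identifier.
model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Toy labeled sentences standing in for a real task such as sentiment classification.
texts = ["이 영화 정말 재미있다", "서비스가 너무 실망스러웠다"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps on the toy batch
    outputs = model(**batch, labels=labels)  # forward pass also computes the classification loss
    outputs.loss.backward()                  # backpropagate through the whole pretrained encoder
    optimizer.step()
    optimizer.zero_grad()
```

The key design point of this paradigm is that only a small task-specific head is added on top of the pretrained encoder, and all parameters are updated with a small learning rate for a few epochs, which is what allows one pretrained model to transfer to many application tasks.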

Keywords

Acknowledgement

This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (Ministry of Science and ICT) in 2020 [No. 2013-0-00131, (Exobrain - Overall/Sub-project 1) Development of Intelligence-Evolving WiseQA Platform Technology for Human Knowledge Augmented Services].

References

  1. J. Devlin et al., "BERT: Pre-training of deep bidirectional transformers for language understanding," in Proc. NAACL-HLT, Minneapolis, MN, USA, June 2-7, 2019, pp. 4171-4186.
  2. T. Mikolov et al., "Distributed representations of words and phrases and their compositionality," in Proc. Int. Conf. Neural Inf. Process. Syst., 2013, pp. 3111-3119, doi: 10.5555/2999792.2999959.
  3. P. Bojanowski et al., "Enriching word vectors with subword information," Trans. Assoc. Comput. Linguistics, vol. 5, Dec. 2017, pp. 135-146. https://doi.org/10.1162/tacl_a_00051
  4. A. Vaswani et al., "Attention is all you need," in Proc. Adv. Neural Inf. Process. Syst., Long Beach, CA, USA, 2017, pp. 5998-6008.
  5. https://gluebenchmark.com/
  6. https://rajpurkar.github.io/SQuAD-explorer/
  7. Y. Sun et al., "ERNIE: Enhanced representation through knowledge integration," arXiv preprint arXiv:1904.09223, 2019.
  8. K. Song et al., "MASS: Masked sequence to sequence pre-training for language generation," in Proc. Int. Conf. Mach. Learning, Long Beach, CA, USA, 2019, pp. 5926-5936.
  9. L. Dong et al., "Unified language model pre-training for natural language understanding and generation," arXiv preprint arXiv:1905.03197, 2019.
  10. Z. Yang et al., "XLNet: Generalized autoregressive pretraining for language understanding," arXiv preprint arXiv:1906.08237, 2019.
  11. M. Joshi et al., "SpanBERT: Improving pre-training by representing and predicting spans," arXiv preprint arXiv:1907.10529, 2019.
  12. Y. Liu et al., "RoBERTa: A robustly optimized BERT pretraining approach," arXiv preprint arXiv:1907.11692, 2019.
  13. Z. Lan et al., "ALBERT: A lite BERT for self-supervised learning of language representations," in Proc. Int. Conf. Learning Representations, Addis Ababa, Ethiopia, May 2020.
  14. M. Lewis et al., "BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension," arXiv preprint arXiv:1910.13461, 2019.
  15. K. Clark et al., "ELECTRA: Pre-training text encoders as discriminators rather than generators," in Proc. Int. Conf. Learning Representations, Addis Ababa, Ethiopia, May 2020.
  16. H. Bao et al., "UniLMv2: Pseudo-masked language models for unified language model pre-training," arXiv preprint arXiv:2002.12804, 2020.
  17. http://aiopen.etri.re.kr/service_dataset.php