Funding
This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2021R1A6A1A03045425).
References
- R. Bommasani et al. (2021). On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258.
- T. Brown et al. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
- B. Lester, R. Al-Rfou & N. Constant. (2021). The power of scale for parameter-efficient prompt tuning. arXiv preprint arXiv:2104.08691.
- A. Vaswani et al. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
- J. Devlin, M. Chang, K. Lee & K. Toutanova. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
- Y. Liu et al. (2019). RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692.
- G. Lample & A. Conneau. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.
- M. Lewis et al. (2019). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461.
- K. Clark, M.-T. Luong, Q. V. Le & C. D. Manning. (2020). ELECTRA: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555.
- C. Lee, K. Yang, T. Whang, C. Park, A. Matteson & H. Lim. (2021). Exploring the Data Efficiency of Cross-Lingual Post-Training in Pretrained Language Models. Applied Sciences, 11(5), 1974. https://doi.org/10.3390/app11051974
- J. Park. (2020). KoELECTRA: Pretrained ELECTRA model for Korean. GitHub repository. Retrieved from https://github.com/monologg/KoELECTRA
- J. Lee. (2020). KcBERT: Korean comments BERT. In Annual Conference on Human and Language Technology (pp. 437-440).
- J. Lee. (2021). KcELECTRA: Korean comments ELECTRA. GitHub repository. Retrieved from https://github.com/Beomi/KcELECTRA
- J. Park. (2019). DistilKoBERT: Distillation of KoBERT. GitHub repository. Retrieved from https://github.com/monologg/DistilKoBERT
- J. Park & D. Kim. (2021). KoBigBird: Pretrained BigBird Model for Korean (Version 1.0.0). https://doi.org/10.5281/zenodo.5654154
- B. Kim et al. (2021). What changes can large-scale language models bring? Intensive study on HyperCLOVA: Billions-scale Korean generative pretrained transformers. arXiv preprint arXiv:2109.04650.
- I. Kim, G. Han, J. Ham & W. Baek. (2021). KoGPT: KakaoBrain Korean (Hangul) Generative Pre-trained Transformer. GitHub repository. Retrieved from https://github.com/kakaobrain/kogpt
- AIRC-KETI. (2021, March). KE-T5: Korean English T5. GitHub repository. Retrieved from https://github.com/AIRC-KETI/ke-t5
- Y. Wu et al. (2016). Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.
- T. Kudo & J. Richardson. (2018). SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. arXiv preprint arXiv:1808.06226.
- A. Radford et al. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.
- R. Sennrich, B. Haddow & A. Birch. (2015). Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909.
- H. Lee, J. Yoon, B. Hwang, S. Joe, S. Min & Y. Gwon. (2021). KoreALBERT: Pretraining a Lite BERT Model for Korean Language Understanding. In 2020 25th International Conference on Pattern Recognition (ICPR) (pp. 5551-5557). IEEE.
- Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma & R. Soricut. (2019). ALBERT: A lite BERT for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942.
- C. Raffel et al. (2019). Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683.
- S. Park et al. (2021). KLUE: Korean Language Understanding Evaluation. arXiv preprint arXiv:2105.09680.
- S. Lee, H. Jang, Y. Baik, S. Park & H. Shin. (2020). KR-BERT: A small-scale Korean-specific language model. arXiv preprint arXiv:2008.03979.
- M. Zaheer et al. (2020). Big Bird: Transformers for Longer Sequences. NeurIPS.
- I. Yamada, K. Washio, H. Shindo & Y. Matsumoto. (2019). Global entity disambiguation with pretrained contextualized embeddings of words and entities. arXiv preprint arXiv:1909.00426.
- R. Ri, I. Yamada & Y. Tsuruoka. (2021). mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models. arXiv preprint arXiv:2110.08151.
- J. Yang, S. Ma, D. Zhang, S. Wu, Z. Li & M. Zhou. (2020). Alternating language modeling for cross-lingual pre-training. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 9386-9393.
- Z. Chi et al. (2020). InfoXLM: An information-theoretic framework for cross-lingual language model pre-training. arXiv preprint arXiv:2007.07834.
- Z. Chi et al. (2021). XLM-E: Cross-lingual language model pre-training via ELECTRA. arXiv preprint arXiv:2106.16138.
- Z. Chi et al. (2021). Improving pretrained cross-lingual language models via self-labeled word alignment. arXiv preprint arXiv:2106.06381.
- H. Huang et al. (2019). Unicoder: A universal language encoder by pre-training with multiple cross-lingual tasks. arXiv preprint arXiv:1909.00964.
- B. A. Richards et al. (2019). A deep learning framework for neuroscience. Nature Neuroscience, 22(11), 1761-1770. https://doi.org/10.1038/s41593-019-0520-2
- Y. Liu et al. (2020). Multilingual denoising pre-training for neural machine translation. Transactions of the Association for Computational Linguistics, 8, 726-742. https://doi.org/10.1162/tacl_a_00343
- Y. Tang et al. (2020). Multilingual translation with extensible multilingual pretraining and finetuning. arXiv preprint arXiv:2008.00401.
- K. Song, X. Tan, T. Qin, J. Lu & T.-Y. Liu. (2019). MASS: Masked sequence to sequence pre-training for language generation. arXiv preprint arXiv:1905.02450.
- Z. Chi, L. Dong, F. Wei, W. Wang, X.-L. Mao & H. Huang. (2020). Cross-lingual natural language generation via pre-training. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 7570-7577.
- L. Xue et al. (2020). mT5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934.
- F. Luo et al. (2020). VECO: Variable encoder-decoder pre-training for cross-lingual understanding and generation. arXiv preprint arXiv:2010.16046.
- Z. Chi et al. (2021). mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs. arXiv preprint arXiv:2104.08692.
- G. Attardi. (2015). WikiExtractor. GitHub repository. Retrieved from https://github.com/attardi/wikiextractor
- V. Sanh, L. Debut, J. Chaumond & T. Wolf. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
- A. Conneau et al. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.
- J. Hu, M. Johnson, O. Firat, A. Siddhant & G. Neubig. (2020). Explicit alignment objectives for multilingual bidirectional encoders. arXiv preprint arXiv:2010.07972.
- W. Qi et al. (2021). ProphetNet-X: Large-scale pre-training models for English, Chinese, multi-lingual, dialog, and code generation. arXiv preprint arXiv:2104.08006.