Funding
This work was supported by a Kyonggi University Research Year (sabbatical) grant for the 2021 academic year.