Acknowledgments
Jung's work was partially supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (2022R1F1A1071126) and by a Korea University Grant (K2305251).
References
- Bayer M, Kaufhold M, and Reuter C (2022). A survey on data augmentation for text classification, ACM Computing Surveys, 55, 1-39.
- Chen T, Kornblith S, Norouzi M, and Hinton G (2020). A simple framework for contrastive learning of visual representations, Proceedings of the 37th International Conference on Machine Learning, 119, 1597-1607.
- Cho J, Jeong M, Lee J, and Cheong Y (2019). Transformational data augmentation techniques for Korean text data, Proceedings of the Korean Institute of Information Scientists and Engineers Conference, 47, 592-594.
- Choi M and On B (2019). A Comparative Study on the Accuracy of Sentiment Analysis of Bi-LSTM Model by Morpheme Feature, Proceedings of KIIT Conference, 307-309.
- Choi Y and Lee KJ (2020). Performance analysis of Korean morphological analyzer based on transformer and BERT, Journal of The Korean Institute of Information Scientists and Engineers, 47, 730-741.
- Clark K, Luong M, Le Q, and Manning C (2020). ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, Available from: arXiv, https://arxiv.org/abs/2003.10555
- Devlin J, Chang M, Lee K, and Toutanova K (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, 4171-4186.
- Dong L, Yang N, Wang W et al. (2019). Unified language model pre-training for natural language understanding and generation, Available from: arXiv, http://arxiv.org/abs/1905.03197
- Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, and Bengio Y (2014). Generative adversarial nets, Advances in Neural Information Processing Systems, 2672-2680.
- Han S (2015). py-hanspell, GitHub repository, Available from: https://github.com/ssut/py-hanspell
- Howard J and Ruder S (2018). Universal language model fine-tuning for text classification, Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 328-339.
- Kim J, Jang K, Lee Y, and Park W (2020). BERT-based classification model improvement through minority class data augmentation, Proceedings of the Korea Information Processing Society Conference, 27, 810-813.
- Kobayashi S (2018). Contextual augmentation: Data augmentation by words with paradigmatic relations, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), 452-457.
- Kumar V, Choudhary A, and Cho E (2020). Data augmentation using pre-trained transformer models, Proceedings of the 2nd Workshop on Life-long Learning for Spoken Language Systems, Suzhou, China, 18-26.
- Liu Y, Ott M, Goyal N et al. (2019). RoBERTa: A robustly optimized BERT pretraining approach, Available from: arXiv, http://arxiv.org/abs/1907.11692
- Mikolov T, Chen K, Corrado G, and Dean J (2013). Efficient estimation of word representations in vector space, Available from: arXiv, https://doi.org/10.48550/arXiv.1301.3781
- Min C (2019). Korean pronunciation teaching methods for learners from the isolating language circle, specifically in Vietnam and China, Journal of Koreanology, 23, 337-371.
- Moon J, Cho W, and Lee J (2020). BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, Proceedings of the Eighth International Workshop on Natural Language Processing for Social Media, Online, 25-31.
- Park E (2015). NSMC: Naver sentiment movie corpus v1.0, GitHub repository, Available from: https://github.com/e9t/nsmc
- Park J (2020). KoELECTRA: Pretrained ELECTRA Model for Korean, GitHub repository, Available from: https://github.com/monologg/KoELECTRA
- Pennington J, Socher R, and Manning C (2014). GloVe: Global vectors for word representation, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 1532-1543.
- Qu C, Yang L, Qiu M, Croft WB, Zhang Y, and Iyyer M (2019). BERT with history answer embedding for conversational question answering, Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, 1133-1136.
- Radford A, Narasimhan K, Salimans T, and Sutskever I (2018). Improving language understanding by generative pre-training, Available from: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
- Sennrich R, Haddow B, and Birch A (2016). Improving Neural Machine Translation Models with Monolingual Data, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 86-96.
- SKTBrain (2019). KoBERT, GitHub repository, Available from: https://github.com/SKTBrain/KoBERT
- Song Y, Wang J, Liang Z, Liu Z, and Jiang T (2020). Utilizing BERT intermediate layers for aspect based sentiment analysis and natural language inference, Available from: arXiv, https://arxiv.org/abs/2002.04815
- Vaswani A, Shazeer NM, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, and Polosukhin I (2017). Attention is All you Need, Advances in Neural Information Processing Systems, Long Beach, CA, 5998-6008.
- Wei J and Zou K (2019). EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 6382-6388.
- Wu X, Lv S, Zang L, Han J, and Hu S (2019). Conditional BERT Contextual Augmentation, Computational Science - ICCS 2019, 84-95.