Funding Information
This paper is based on the Master's thesis of the first author at Pusan National University [10].
References
- E. Alsentzer, J. R. Murphy, W. Boag, W.-H. Weng, D. Jin, T. Naumann, and M. McDermott, Publicly available clinical BERT embeddings, arXiv preprint arXiv:1904.03323 (2019).
- D. Araci, FinBERT: Financial sentiment analysis with pre-trained language models, arXiv preprint arXiv:1908.10063 (2019).
- I. Beltagy, K. Lo, and A. Cohan, SciBERT: A pretrained language model for scientific text, arXiv preprint arXiv:1903.10676 (2019).
- G. G. Chowdhury, Natural language processing, Annu. Rev. Inf. Sci. Technol. 37 (2003), no. 1, 51-89. https://doi.org/10.1002/aris.1440370103
- S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, Indexing by latent semantic analysis, J. Am. Soc. Inf. Sci. 41 (1990), no. 6, 391-407. https://doi.org/10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9
- J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv:1810.04805 (2018).
- S. Gururangan, A. Marasović, S. Swayamdipta, K. Lo, I. Beltagy, D. Downey, and N. A. Smith, Don't stop pretraining: Adapt language models to domains and tasks, arXiv preprint arXiv:2004.10964 (2020).
- KB Bank AI Team, KB-ALBERT-KO (2020), GitHub repository, https://github.com/KB-Bank-AI/KB-ALBERT-KO.
- J. Lee, W. Yoon, S. Kim, D. Kim, S. Kim, C. H. So, and J. Kang, BioBERT: a pre-trained biomedical language representation model for biomedical text mining, Bioinformatics 36 (2020), no. 4, 1234-1240. https://doi.org/10.1093/bioinformatics/btz682
- J. H. Lee, Korean document clustering by topic using matrix factorizations, Master's thesis, Pusan National University (2021).
- E. D. Liddy, Natural language processing (2001), 4-6.
- Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, RoBERTa: A robustly optimized BERT pretraining approach, arXiv preprint arXiv:1907.11692 (2019).
- N. Ljubešić, D. Boras, N. Bakarić, and J. Njavro, Comparing measures of semantic similarity, ITI 2008-30th Int. Conf. Inf. Technol. Interfaces (2008), 675-682.
- M. W. Mahoney and P. Drineas, CUR matrix decompositions for improved data analysis, Proc. Natl. Acad. Sci. 106 (2009), no. 3, 697-702. https://doi.org/10.1073/pnas.0803205106
- M. W. Mahoney, M. Maggioni, and P. Drineas, Tensor-CUR decompositions for tensor-based data, SIAM J. Matrix Anal. Appl. 30 (2008), no. 3, 957-987. https://doi.org/10.1137/060665336
- M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, Deep contextualized word representations, arXiv preprint arXiv:1802.05365 (2018).
- C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, arXiv preprint arXiv:1910.10683 (2019).
- SK T-Brain, KoBERT (2019), GitHub repository, https://github.com/SKTBrain/KoBERT.
- D. C. Sorensen and M. Embree, A DEIM induced CUR factorization, SIAM J. Sci. Comput. 38 (2016), no. 3, A1454-A1482. https://doi.org/10.1137/140978430
- L. N. Trefethen and D. Bau III, Numerical linear algebra, SIAM 50 (1997), 25-36.