On the Analysis of Natural Language Processing Morphology for the Specialized Corpus in the Railway Domain
Won, Jong Un (Artificial Intelligence Railroad Research Department, Korea Railroad Research Institute)
Jeon, Hong Kyu (Artificial Intelligence Railroad Research Department, Korea Railroad Research Institute)
Kim, Min Joong (Department of Systems Engineering, Ajou University)
Kim, Beak Hyun (Artificial Intelligence Railroad Research Department, Korea Railroad Research Institute)
Kim, Young Min (Department of Systems Engineering, Ajou University)