Acknowledgement
This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2023-00216011, Development of artificial complex intelligence for conceptually understanding and inferring like human). We would like to thank Editage (www.editage.co.kr) and Soomgo (soomgo.com) for English language editing.
References
- T. Mikolov, K. Chen, G. Corrado, and J. Dean, Efficient estimation of word representations in vector space, (1st International Conference on Learning Representations, ICLR 2013, Scottsdale, Arizona, USA), May 2-4, 2013.
- H.-J. Song, Subword tokenization and Korean morphological analysis, Commun. KIISE 39 (2021), no. 4, 15-20.
- Y. Choi and K. J. Lee, Performance analysis of Korean morphological analyzer based on transformer and BERT, J. KIISE 47 (2020), no. 8, 730-741. https://doi.org/10.5626/JOK.2020.47.8.730
- E. Chung and J.-G. Park, Word segmentation and POS tagging using Seq2seq attention model, (Proceedings of the 28th Annual Conference on Human and Cognitive Language Technology), 2016, pp. 217-219.
- H. Hwang and C. Lee, Korean morphological analysis using sequence-to-sequence learning with copying mechanism, (Proceedings of the 43rd Winter Congress of the KIISE), 2016, pp. 443-445.
- H. Hwang and C. Lee, Linear-time Korean morphological analysis using an action-based local monotonic attention mechanism, ETRI J. 42 (2020), no. 1, 101-107. https://doi.org/10.4218/etrij.2018-0456
- H. Kim, S. Park, and H. Kim, Joint model of morphological analysis and named entity recognition using shared layer, J. KIISE 48 (2021), no. 2, 167-173. https://doi.org/10.5626/JOK.2021.48.2.167
- H. Kim, S. Yang, and Y. Ko, How to utilize syllable distribution patterns as the input of LSTM for Korean morphological analysis, Pattern Recogn. Lett. 120 (2019), 39-45. https://doi.org/10.1016/j.patrec.2018.12.019
- H. Kim, J. Yoon, J. An, K. Bae, and Y. Ko, Syllable-based Korean POS tagging using POS distribution and bidirectional LSTM CRFs, (Proceedings of the 28th Annual Conference on Human and Cognitive Language Technology), 2016, pp. 3-8.
- J. Kim, S. Kang, and H. Kim, Korean head-tail tokenization and part-of-speech tagging by using deep learning, IEMEK J. Embedded Syst. Appl. 17 (2022), no. 4, 199-208.
- S.-W. Kim and S.-P. Choi, Research on joint models for Korean word spacing and POS (part-of-speech) tagging based on bidirectional LSTM-CRF, J. KIISE 45 (2018), no. 8, 792-800. https://doi.org/10.5626/JOK.2018.45.8.792
- H.-C. Kwon, A dictionary-based morphological analysis, (Proc. of NLPRS'91), 1991, pp. 178-185.
- C. Lee, Joint models for Korean word spacing and POS tagging using structural SVM, J. KISS: Softw. Appl. 40 (2013), no. 12, 826-832.
- C.-H. Lee, J.-H. Lim, S. Lim, and H.-K. Kim, Syllable-based Korean POS tagging based on combining a pre-analyzed dictionary with machine learning, J. KIISE 43 (2016), no. 3, 362-369. https://doi.org/10.5626/JOK.2016.43.3.362
- D.-G. Lee and H.-C. Rim, Probabilistic modeling of Korean morphology, IEEE Trans. Audio, Speech, Lang. Process. 17 (2009), no. 5, 945-955. https://doi.org/10.1109/TASL.2009.2019922
- J. S. Lee, Three-step probabilistic model for Korean morphological analysis, J. KISS: Softw. Appl. 38 (2011), no. 5, 257-268.
- J. Li, E. Lee, and J.-H. Lee, Sequence-to-sequence based morphological analysis and part-of-speech tagging for Korean language with convolutional features, J. KIISE 44 (2017), no. 1, 57-62. https://doi.org/10.5626/JOK.2017.44.1.57
- J.-W. Min, S.-H. Na, J.-H. Sin, and Y.-K. Kim, Dynamic oracle for neural transition-based morpheme segmentation and POS tagging of Korean, (Proceedings of the 30th Annual Conference on Human and Cognitive Language Technology), 2018, pp. 413-416.
- J. Min, S.-H. Na, J.-H. Shin, and Y.-K. Kim, End-to-end neural transition-based morpheme segmentation and POS tagging of Korean, (Proceedings of the Korea Computer Congress), 2019, pp. 566-568.
- J. Min, S.-H. Na, J.-H. Shin, and Y.-K. Kim, Stack pointer network for Korean morphological analysis, (Proceedings of the Korea Computer Congress), 2020, pp. 371-373.
- J. Min, S.-H. Na, J.-H. Shin, and Y.-K. Kim, Interleaved decoder in sequence-to-sequence model for morphological analysis and part-of-speech tagging of Korean, (Proceedings of the Korea Computer Congress), 2022, pp. 467-469.
- S.-H. Na, Conditional random fields for Korean morpheme segmentation and POS tagging, ACM Trans. Asian Low-Resource Lang. Inform. Process. 14 (2015), no. 3, 1-16. https://doi.org/10.1145/2700051
- S.-H. Na, C.-H. Kim, and Y.-K. Kim, Lattice-based discriminative approach for Korean morphological analysis, J. KISS: Softw. Appl. 41 (2014), no. 7, 523-532.
- S.-H. Na and Y.-K. Kim, Phrase-based statistical model for Korean morpheme segmentation and POS tagging, IEICE Trans. Inform. Syst. 101 (2018), no. 2, 512-522. https://doi.org/10.1587/transinf.2017EDP7085
- Y. Seok Choi and K. J. Lee, A reranking model for Korean morphological analysis based on sequence-to-sequence model, KIPS Trans. Softw. Data Eng. 7 (2018), no. 4, 121-128.
- K. Shim, Syllable-based POS tagging without Korean morphological analysis, Korean J. Cognit. Sci. 22 (2011), no. 3, 327-345. https://doi.org/10.19066/cogsci.2011.22.3.005
- H. J. Shin, J. Park, and J. S. Lee, Syllable-based multi-POSMORPH annotation for Korean morphological analysis and part-of-speech tagging, Appl. Sci. 13 (2023), no. 5, 2892.
- J.-C. Shin and C.-Y. Ock, A Korean morphological analyzer using a pre-analyzed partial word-phrase dictionary, J. KISS: Softw. Appl. 39 (2012), no. 5, 415-424.
- H.-J. Song and S.-B. Park, Korean morphological analysis with tied sequence-to-sequence multi-task model, (Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China), 2019, pp. 1436-1441.
- H.-J. Song and S.-B. Park, Korean part-of-speech tagging based on morpheme generation, ACM Trans. Asian Low-Resource Lang. Inform. Process. 19 (2020), no. 3, 1-10. https://doi.org/10.1145/3365679
- J. Y. Youn and J. S. Lee, A deep learning-based two-steps pipeline model for Korean morphological analysis and part-of-speech tagging, J. KIISE 48 (2021), no. 4, 444-452. https://doi.org/10.5626/JOK.2021.48.4.444
- T. Kudo, MeCab: yet another part-of-speech and morphological analyzer [Online]. Available: https://taku910.github.io/mecab/. (accessed 2023, Aug. 25).
- T. Kudo, K. Yamamoto, and Y. Matsumoto, Applying conditional random fields to Japanese morphological analysis, (Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain), 2004, pp. 230-237.
- Y. Bae, H. Kim, J.-H. Lim, H. Ki Kim, and K. J. Lee, 2-Phase passage re-ranking model based on neural-symbolic ranking models, J. KIISE 48 (2021), no. 5, 501-509. https://doi.org/10.5626/JOK.2021.48.5.501
- R. Nogueira, W. Yang, K. Cho, and J. Lin, Multi-stage document ranking with BERT, arXiv preprint, 2019. https://doi.org/10.48550/arXiv.1910.14424
- M. Choe and B. Kang, Practice in constructing Sejong morph (sense) analysis Corpora, Korean Cult. Stud. 48 (2008), 337-372.
- University of Ulsan, UCorpus-HG: Morph-sense tagged Corpus [Online]. Available: http://nlplab.ulsan.ac.kr/doku.php?id=ucorpus. (accessed 2023, Aug. 25).
- I. Kim, D.-G. Lee, and B. Kang, SJ-RIKS Corpus: beyond 21st Sejong morph-sense tagged corpus, Korean Cult. Stud. 52 (2010), 373-403.
- National Institute of Korean Language, Everyone's Corpus [Online]. Available: https://corpus.korean.go.kr. (accessed 2023, Aug. 25).
- I. Kim, Conducting Korean POS tagged corpus, Project Report 11-1371028-000776-01, National Institute of Korean Language, 2019 (in Korean).
- J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, BERT: pretraining of deep bidirectional transformers for language understanding, (Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies), 2019, pp. 4171-4186.
- Korea Press Foundation, KPF BERT [Online]. Available: https://github.com/KPFBERT/kpfbert. (accessed 2023, Dec. 4).
- Electronics and Telecommunications Research Institute, Kor-BERT [Online]. Available: https://aiopen.etri.re.kr/bertModel. (accessed 2023, Dec. 4).