http://dx.doi.org/10.15207/JKCS.2022.13.02.021

Deep Learning-based Korean Dialect Machine Translation Research Considering Linguistics Features and Service  

Lim, Sangbeom (Department of Software Application, Kangnam University)
Park, Chanjun (Department of Computer Science and Engineering, Korea University)
Yang, Yeongwook (Department of Computer Science and Engineering, Hanshin University)
Publication Information
Journal of the Korea Convergence Society / v.13, no.2, 2022, pp. 21-29
Abstract
Motivated by the importance of dialect research, preservation, and communication, this paper studies machine translation of Korean dialects for dialect speakers, who may otherwise be marginalized. The dialect data are the AIHUB dialect data, which are divided by highest-level administrative district. We propose a many-to-one dialect machine translation model, which makes model distribution more efficient, and we apply a copy mechanism to improve dialect translation performance. We evaluate the one-to-one and many-to-one models by BLEU score and analyze the performance of the many-to-one model on Korean dialects from a linguistic perspective. The proposed methodology improves the performance of one-to-one machine translation, and the many-to-one model also achieves notably high performance.
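Since the abstract reports model quality as a BLEU score, a minimal sketch of corpus-level BLEU may help make the metric concrete. It assumes whitespace tokenization, one reference per hypothesis, and no smoothing; the tokenization and BLEU tool actually used in the paper are not stated here, and the function names `ngrams` and `corpus_bleu` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (assumed setup: one reference
    per hypothesis, whitespace tokens, no smoothing)."""
    matched = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # hypothesis n-grams, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip: an n-gram is credited at most as often as it
            # occurs in the reference.
            matched[n - 1] += sum(
                min(count, ref_counts[gram])
                for gram, count in hyp_counts.items()
            )
            total[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives BLEU = 0.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions.
    log_precision = sum(
        math.log(m / t) for m, t in zip(matched, total)
    ) / max_n
    # Brevity penalty for hypotheses shorter than (or equal to) the reference.
    if hyp_len > ref_len:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Toy strings for illustration only, not data from the paper.
score = corpus_bleu(["a b c d e"], ["a b c d f"])
print(round(score, 2))
```

In this setup, a one-to-one model and a many-to-one model would each be scored by passing their translations of the same test sentences together with the standard-Korean references.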
Keywords
Korean Dialect Translation; Machine Translation; Transformer; Multilingual Translation; Language Convergence