http://dx.doi.org/10.15207/JKCS.2020.11.9.007

Neural Machine Translation Specialized for Coronavirus Disease-19 (COVID-19)

Park, Chan-Jun (Department of Computer Science and Engineering, Korea University)
Kim, Kyeong-Hee (Department of South East Asia, Busan University of Foreign Studies)
Park, Ki-Nam (Creative Information and Computer Institute, Korea University)
Lim, Heui-Seok (Department of Computer Science and Engineering, Korea University)
Publication Information
Journal of the Korea Convergence Society / v.11, no.9, 2020, pp. 7-13
Abstract
Since the World Health Organization (WHO) declared Coronavirus Disease-19 (COVID-19) a pandemic, the disease has become a global concern and many deaths continue to occur. Responding to it increasingly requires countries to share information and countermeasures related to COVID-19. Language barriers, however, have prevented the smooth exchange of this information. In this paper, we propose a Neural Machine Translation (NMT) model specialized for the COVID-19 domain. With English as the central language, we built Transformer-based bidirectional models between English and each of French, Spanish, German, Italian, Russian, and Chinese. In our experiments, the proposed models achieved significantly higher BLEU scores than a commercial system in all language pairs.
Keywords
Machine Translation; Artificial Intelligence; Coronavirus Disease-19; Transformer; Deep Learning
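
The abstract compares systems by BLEU score without describing how the score was computed. As a rough illustration only, the following Python sketch computes corpus-level BLEU from clipped n-gram precisions and a brevity penalty. It assumes whitespace tokenization, a single reference per sentence, and no smoothing; the function names and the example sentences are hypothetical and are not taken from the paper, whose actual scoring tool and settings are not stated here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis.

    Assumptions (not from the paper): whitespace tokens, no smoothing.
    """
    matches = [0] * max_n   # clipped n-gram matches, per n-gram order
    totals = [0] * max_n    # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection clips each count by the reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0

    # Geometric mean of the n-gram precisions (uniform weights).
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Penalize hypotheses that are shorter than the references overall.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Invented sentences, for illustration only.
    system_output = ["wash your hands often with soap"]
    reference = ["wash your hands often with water"]
    print(f"BLEU = {corpus_bleu(system_output, reference):.2f}")
```

Scores from a sketch like this are only comparable between systems when the tokenization and references are identical, which is why published comparisons usually rely on a standard scoring tool rather than a hand-written one.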