References
- C. Park, Y. Yang, K. Park & H. Lim. (2020). Decoding strategies for improving low-resource machine translation. Electronics, 9(10), 1562. DOI : 10.3390/electronics9101562
- R. Sennrich, B. Haddow & A. Birch. (2015). Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909. DOI : 10.18653/v1/P16-1162
- T. Kudo. (2018). Subword regularization: Improving neural network translation models with multiple subword candidates. arXiv preprint arXiv:1804.10959. DOI : 10.18653/v1/P18-1007
- I. Provilkov, D. Emelianenko & E. Voita. (2019). BPE-dropout: Simple and effective subword regularization. arXiv preprint arXiv:1910.13267. DOI : 10.18653/v1/2020.acl-main.170
- M. Schuster & K. Nakajima. (2012, March). Japanese and Korean voice search. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 5149-5152. DOI : 10.1109/ICASSP.2012.6289079
- Y. Wu et al. (2016). Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.
- T. Kudo & J. Richardson. (2018). SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. arXiv preprint arXiv:1808.06226. DOI : 10.18653/v1/D18-2012
- K. Stratos. (2017). A sub-character architecture for Korean language processing. arXiv preprint arXiv:1707.06341. DOI : 10.18653/v1/D17-1075
- S. Moon & N. Okazaki. (2020, May). Jamo Pair Encoding: Subcharacter Representation-based Extreme Korean Vocabulary Compression for Efficient Subword Tokenization. In Proceedings of The 12th Language Resources and Evaluation Conference, 3490-3497.
- C. Park, C. Lee, Y. Yang & H. Lim. (2020). Ancient Korean Neural Machine Translation. IEEE Access, 8, 116617-116625. DOI : 10.1109/ACCESS.2020.3004879
- K. Park, J. Lee, S. Jang & D. Jung. (2020). An Empirical Study of Tokenization Strategies for Various Korean NLP Tasks. arXiv preprint arXiv:2010.02534.
- P. Lison & J. Tiedemann. (2016). OpenSubtitles2016: Extracting large parallel corpora from movie and TV subtitles.
- C. Park & H. Lim. (2020). A Study on the Performance Improvement of Machine Translation Using Public Korean-English Parallel Corpus. Journal of Digital Convergence, 18(6), 271-277. DOI : 10.14400/JDC.2020.18.6.271
- A. Vaswani et al. (2017). Attention is all you need. In Advances in neural information processing systems, 5998-6008.
- K. Papineni, S. Roukos, T. Ward & W. J. Zhu. (2002, July). BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics, 311-318. DOI : 10.3115/1073083.1073135