http://dx.doi.org/10.9728/dcs.2018.19.6.1161

Symbolizing Numbers to Improve Neural Machine Translation  

Kang, Cheongwoong (School of Computer Science and Electrical Engineering, Handong Global University)
Ro, Youngheon (School of Computer Science and Electrical Engineering, Handong Global University)
Kim, Jisu (School of Computer Science and Electrical Engineering, Handong Global University)
Choi, Heeyoul (School of Computer Science and Electrical Engineering, Handong Global University)
Publication Information
Journal of Digital Contents Society / v.19, no.6, 2018, pp. 1161-1167
Abstract
Machine learning has advanced to the point where machines can perform delicate tasks that previously only humans could do, and many companies have introduced machine-learning-based translators. Existing translators perform well overall but have problems translating numbers. They often mistranslate numbers when the input sentence contains a large number. Furthermore, the structure of the output sentence changes completely even when only one number in the input sentence changes. In this paper, we first optimize a neural machine translation model that uses a bidirectional RNN, LSTM, and the attention mechanism, through data cleansing and by adjusting the dictionary size. We then implement a number-processing algorithm specialized for number translation and apply it to the neural machine translation model to address these problems. The paper presents the data cleansing method, the optimal dictionary size, and the number-processing algorithm, along with experimental results on translation performance measured by BLEU score.
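The abstract does not spell out the number-processing algorithm, so the following is only a minimal sketch of the general idea of symbolizing numbers, under stated assumptions: each number in the input is swapped for an indexed placeholder before translation and restored afterwards. The placeholder format `__NUMk__`, the regular expression, and the function names `symbolize` and `desymbolize` are hypothetical and are not taken from the paper.

```python
import re

# Assumption: a "number" is either a comma-grouped figure such as 1,234,567.89
# or a plain integer/decimal such as 42 or 3.14. The paper may define it differently.
NUMBER_RE = re.compile(r"\d{1,3}(?:,\d{3})+(?!\d)(?:\.\d+)?|\d+(?:\.\d+)?")


def symbolize(sentence):
    """Replace each number with an indexed placeholder.

    Returns the rewritten sentence and a mapping from placeholder to the
    original number string, so the numbers can be restored later.
    """
    mapping = {}

    def to_symbol(match):
        symbol = f"__NUM{len(mapping)}__"  # hypothetical placeholder format
        mapping[symbol] = match.group(0)
        return symbol

    return NUMBER_RE.sub(to_symbol, sentence), mapping


def desymbolize(sentence, mapping):
    """Put the original numbers back in place of their placeholders."""
    for symbol, number in mapping.items():
        sentence = sentence.replace(symbol, number)
    return sentence


source = "The price rose from 1,200 to 3,450.75 won."
symbolized, numbers = symbolize(source)
# symbolized would be passed to the translation model in place of source;
# the model output is then passed through desymbolize with the same mapping.
restored = desymbolize(symbolized, numbers)
```

Because the model only ever sees the placeholders, changing a number in the input leaves the symbolized sentence unchanged, which is one plausible way to address the two problems the abstract describes. Restoring the digits verbatim is a simplification; a real system would also need to handle number formats that differ between languages.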
Keywords
Neural Machine Translation; Number Translation; Mistranslation; Symbolization; Model Optimization
Citations & Related Records
Times Cited By KSCI: 1