Automatic Post Editing Research

  • Park, Chan-Jun (Department of Computer Science and Engineering, Korea University)
  • Lim, Heui-Seok (Department of Computer Science and Engineering, Korea University)
  • Received : 2020.03.06
  • Accepted : 2020.05.20
  • Published : 2020.05.28

Abstract

Machine translation refers to a system in which a computer translates a source sentence into a target sentence. It has various subfields, and APE (Automatic Post Editing) is the one that produces better translations by editing the output of machine translation systems. In other words, APE is the process of correcting the errors contained in translations generated by a machine translation system to produce a corrected translation. Rather than changing the machine translation model itself, this line of research improves translation quality by correcting the system's output sentences. APE has been a WMT Shared Task since 2015, and performance is evaluated using TER (Translation Edit Rate). As a result, various studies on APE models have been published recently, and this paper surveys the latest research trends in the field of APE.
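As a rough illustration of the evaluation metric mentioned above, the following is a minimal sketch of a simplified TER, assuming whitespace tokenization and counting only insertions, deletions, and substitutions. The full TER metric also counts block shifts as edits, which this sketch omits, so its scores can be higher than those of the official tool. The function name `simplified_ter` and the example sentences are hypothetical and do not come from the paper.

```python
def simplified_ter(hypothesis: str, reference: str) -> float:
    """Word-level edit distance divided by reference length.

    Assumption: this is a simplified TER without the shift operation,
    so it is an upper bound on the shift-aware score, not the official metric.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the hypothesis tokens
    # processed so far and the first j reference tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis token
                            curr[j - 1] + 1,      # insert a reference token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: raw MT output versus a human post-edited reference.
mt_output = "the cat sat on mat"
post_edit = "the cat sat on the mat"
score = simplified_ter(mt_output, post_edit)  # one insertion over six reference tokens
```

Lower scores are better: an APE system is useful when the score of its corrected output against the human post-edit is lower than the score of the raw machine translation.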
