• Title/Summary/Keyword: English-Korean Machine Translation

A Research on Test Suites for Machine Translation Systems. (기계번역 시스템 측정 장치 연구)

  • Lee, Min-Haeng;Jee, Kwang-Sin;Chung, So-Woo
    • Language and Information / v.2 no.2 / pp.185-220 / 1998
  • The purpose of this research is to propose basic guidelines for constructing English and Korean test suites that objectively evaluate the performance of machine translation systems. To this end, we constructed 650 English test sentences and 650 Korean test sentences, and developed statistical methods and tools for the comparative evaluation of English-Korean machine translation systems. We also evaluate existing commercial English-Korean machine translation systems. The importance of this research lies in promoting awareness, within the natural language community, of the need to test machine translation systems. It also contributes to the development of evaluation methods and techniques for test suites suited to Korean information processing systems. The natural language community can use the results to test the performance and development of its information processing or machine translation systems.

Environment for Translation Domain Adaptation and Continuous Improvement of English-Korean Machine Translation System

  • Kim, Sung-Dong;Kim, Namyun
    • International Journal of Internet, Broadcasting and Communication / v.12 no.2 / pp.127-136 / 2020
  • This paper presents an environment for a rule-based English-Korean machine translation system that supports translation domain adaptation and continuous improvement of translation quality. For both purposes a corpus is essential, because the information needed for translation is acquired from it. The environment consists of a corpus construction part and a translation knowledge extraction part. The corpus construction part crawls news articles from newspaper sites. The extraction part builds translation knowledge such as newly created words, compound words, collocation information, and distributional word representations. For translation domain adaptation, a corpus for the domain should be built and the translation knowledge should be constructed from it. For continuous improvement, the corpus needs to be continuously expanded and the translation knowledge enhanced from the expanded corpus. The proposed web-based environment is expected to facilitate the tasks of domain adaptation and translation system improvement.
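
One kind of translation knowledge named in the abstract above is collocation information. The abstract does not say how it is extracted, so the following is only a minimal sketch of one common approach, scoring adjacent word pairs by pointwise mutual information (PMI). The function name `collocation_pmi`, the `min_count` threshold, and the three-sentence corpus are hypothetical stand-ins, not the authors' method or data.

```python
import math
from collections import Counter


def collocation_pmi(sentences, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI)."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / total_bigrams
        p_w1 = unigrams[w1] / total_unigrams
        p_w2 = unigrams[w2] / total_unigrams
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    # Highest-scoring pairs first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical mini-corpus standing in for crawled news text.
corpus = [
    "the central bank raised interest rates",
    "interest rates rose again this week",
    "the bank said interest rates may fall",
]
ranked = collocation_pmi(corpus)
```

With this toy corpus only the pair ("interest", "rates") occurs often enough to pass the threshold. A real corpus would need proper tokenization and a much larger `min_count`.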

Customizing an English-Korean Machine Translation System for Patent Translation

  • Choi, Sung-Kwon;Kim, Young-Gil
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.105-114 / 2007
  • This paper addresses a method for customizing an English-to-Korean machine translation system from the general domain to the patent domain. The customizing method consists of the following steps: 1) linguistically studying the characteristics of patent documents, 2) extracting unknown words from a large collection of patent documents and constructing a large bilingual terminology, 3) extracting and constructing patent-specific translation patterns, 4) customizing the translation engine modules of the existing general MT system according to the linguistic study of patent documents, and 5) evaluating the accuracy of the translation modules and the translation quality. This research was performed under the auspices of the MIC (Ministry of Information and Communication) of the Korean government during 2005-2006. The translation accuracy of the customized English-Korean patent translation system is 82.43% on average across 5 patent fields (machinery, electronics, chemistry, medicine, and computer), according to the evaluation of 7 professional human translators. In 2006, the patent MT system started an online patent MT service at IPAC (International Patent Assistance Center) under MOCIE (Ministry of Commerce, Industry and Energy) in Korea. In 2007, KIPO (Korean Intellectual Property Office) is attempting to launch an English-Korean patent MT service.

A Quality Comparison of English Translations of Korean Literature between Human Translation and Post-Editing

  • LEE, IL-JAE
    • International Journal of Advanced Culture Technology / v.6 no.4 / pp.165-171 / 2018
  • As artificial intelligence (AI) plays a crucial role in machine translation (MT), which has loomed large as a new translation paradigm, concerns have arisen about whether MT can produce a product of the same quality as human translation (HT). In fact, several experimental MT studies report cases in which the MT-based product, called post-editing (PE), is equal or often superior to HT ([1],[2],[6]). Motivated by those studies on translation quality in HT and PE, this study set up an experiment in which Korean literature was translated into English by 3 translators and, for comparison, by 3 post-editors. Afterwards, a group of 3 other Koreans checked the accuracy of HT and PE, and a group of 3 native English speakers scored the fluency of HT and PE. The findings are: (1) HT took at least twice as long as PE. (2) HT and PE produced similar error types; Mistranslation and Omission were the major errors for accuracy, and Grammar for fluency. (3) HT turned out to be inferior to PE in both accuracy and fluency.

English-to-Korean Machine Translation and the Problem of Anaphora Resolution (영한기계번역과 대용어 조응문제에 대한 고찰)

  • Ruslan Mitkov
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.351-357 / 1994
  • At least two English-to-Korean translation projects have been under way for the last few years, but so far no attention has been paid to the problem of resolving pronominal reference, and a default pronoun translation has been used instead. In this paper we argue that pronouns cannot be handled trivially in English-to-Korean translation and that one cannot bypass the task of resolving anaphoric reference when aiming at good, natural translation. In addition, we propose lexical transfer rules for English-to-Korean anaphor translation and outline an anaphora resolution model for an English-to-Korean MT system in operation.

Probabilistic Part-Of-Speech Determination for Efficient English-Korean Machine Translation (효율적 영한기계번역을 위한 확률적 품사결정)

  • Kim, Sung-Dong;Kim, Il-Min
    • The KIPS Transactions:PartB / v.17B no.6 / pp.459-466 / 2010
  • Natural language processing involves several ambiguity problems, and English-Korean machine translation in particular must solve such problems at each translation step. This paper focuses on resolving the part-of-speech ambiguity of English words in order to improve the efficiency of English analysis, as part of an effort to develop a practical English-Korean machine translation system. To be integrated with a machine translation system, part-of-speech determination must be fast and accurate. This paper proposes probabilistic models for part-of-speech determination, built from the Penn Treebank corpus. In the experiments, we present the performance of the part-of-speech determination models and the efficiency improvement that the proposed method brings to the machine translation system.
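
The abstract above does not specify the form of its probabilistic models. As an illustration only, the sketch below shows one standard way to resolve part-of-speech ambiguity probabilistically: a bigram hidden Markov model decoded with the Viterbi algorithm, using add-one smoothing. The function names and the three-sentence tagged corpus are hypothetical. The tags follow Penn Treebank conventions, but no Penn Treebank data is used.

```python
import math
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Collect tag-transition and word-emission counts from tagged sentences."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag_name in sentence:
            transitions[prev][tag_name] += 1
            emissions[tag_name][word] += 1
            vocab.add(word)
            prev = tag_name
    return transitions, emissions, vocab


def log_prob(counter, key, num_outcomes):
    """Add-one smoothed log probability of `key` under `counter`."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + num_outcomes))


def tag(words, model):
    """Return the most probable tag sequence using the Viterbi algorithm."""
    transitions, emissions, vocab = model
    tags = list(emissions)
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unseen words
    # best[t] = (log probability, tag path) of the best path ending in tag t
    best = {"<s>": (0.0, [])}
    for word in words:
        step = {}
        for t in tags:
            emit = log_prob(emissions[t], word, n_words)
            step[t] = max(
                (score + log_prob(transitions[p], t, n_tags) + emit, path + [t])
                for p, (score, path) in best.items()
            )
        best = step
    return max(best.values())[1]


# Hypothetical toy corpus; "book" is ambiguous between noun (NN) and verb (VBP).
toy_corpus = [
    [("the", "DT"), ("book", "NN"), ("is", "VBZ"), ("new", "JJ")],
    [("they", "PRP"), ("book", "VBP"), ("the", "DT"), ("flight", "NN")],
    [("the", "DT"), ("flight", "NN"), ("is", "VBZ"), ("late", "JJ")],
]
model = train(toy_corpus)
verb_reading = tag(["they", "book", "the", "flight"], model)
noun_reading = tag(["the", "book", "is", "late"], model)
```

In this toy setting the tagger picks the verb reading of "book" after a pronoun and the noun reading after a determiner, which is the kind of disambiguation the abstract describes.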

A Satisfaction Survey on the Human Translation Outcomes and Machine Translation Post-Editing Outcomes

  • Hong, Junghee;Lee, Il Jae
    • International journal of advanced smart convergence / v.10 no.2 / pp.86-96 / 2021
  • This cross-sectional survey investigated satisfaction with translation outcomes produced by human translation and by (machine translation) post-editing. The survey group consisted of 166 Korean translators working primarily with English, Chinese, and Japanese. They were asked to rate their level of satisfaction with accuracy, fluency, idiomatic expression, and terminology on a four-point Likert scale. The results reveal that human translation is more satisfactory than post-editing with respect to accuracy, but it is difficult to assert that accuracy is unsatisfactory in post-editing. On the other hand, the Korean translators are less satisfied with fluency, idiomatic expression, and terminology than with accuracy. It can be assumed that, although human translation is more satisfactory than post-editing, the accuracy of post-editing is better acknowledged than its fluency, idiomatic expression, and terminology, which leads translators to accept the accuracy of raw machine-translation output and go on to improve the fluency, idiomatic expression, and terminology. Nevertheless, Korean translators believe that Korean idiomatic expressions cannot be produced satisfactorily in post-editing, while fluency and terminology can be improved.

A Study on the Performance Improvement of Machine Translation Using Public Korean-English Parallel Corpus (공공 한영 병렬 말뭉치를 이용한 기계번역 성능 향상 연구)

  • Park, Chanjun;Lim, Heuiseok
    • Journal of Digital Convergence / v.18 no.6 / pp.271-277 / 2020
  • Machine translation refers to software that translates a source language into a target language. After rule-based and statistical machine translation, Neural Machine Translation is now being actively researched. One of the important factors in Neural Machine Translation is obtaining a high-quality parallel corpus, but high-quality parallel corpora for Korean language pairs have not been easy to find. Recently, AI Hub of the National Information Society Agency (NIA) released a high-quality Korean-English parallel corpus of 1.6 million sentences. This paper attempts to verify the quality of each dataset by comparing the translation performance obtained with the AI Hub data and with OpenSubtitles, the most popular Korean-English parallel corpus. For objectivity, the test data is the test set published by IWSLT, the official test set for Korean-English machine translation. The experimental results show better performance than existing papers evaluated on the same test set, which demonstrates the importance of high-quality data.

Machine Translation of Korean-to-English spoken language Based on Semantic Patterns (의미패턴에 기반한 대화체 한영 기계 번역)

  • Jung, Cheon-Young;Seo, Young-Hoon
    • The Transactions of the Korea Information Processing Society / v.5 no.9 / pp.2361-2368 / 1998
  • This paper analyzes Korean spoken language and describes machine translation of Korean-to-English spoken language based on semantic patterns. In Korean-to-English machine translation, ambiguity that arises when Korean sentences are analyzed using syntactic information can be resolved by semantic patterns. Therefore, for machine translation of spoken language, we establish a system based on semantic patterns extracted from a Korean scheduling domain. The system obtains robustness through the ability to skip syllables during the analysis of Korean sentences, and we add options to the semantic patterns in order to reduce the number of patterns. The data used for the experiment are from the scheduling domain, and the performance of Korean-to-English translation is 88%.

English-Korean Transfer Dictionary Extension Tool in English-Korean Machine Translation System (영한 기계번역 시스템의 영한 변환사전 확장 도구)

  • Kim, Sung-Dong
    • KIPS Transactions on Software and Data Engineering / v.2 no.1 / pp.35-42 / 2013
  • Developing an English-Korean machine translation system requires constructing information about both languages, and the amount of information in the English-Korean transfer dictionary is especially critical to translation quality. Newly created words are out-of-vocabulary words, so they appear untranslated in the output sentence, which lowers translation quality. Compound nouns also complicate lexical and syntactic analysis, and they are difficult to translate accurately when the transfer dictionary lacks information about them. To improve the quality of English-Korean machine translation, the transfer dictionary must be continuously expanded by collecting out-of-vocabulary words and frequently used compound nouns. This paper proposes a method for expanding the transfer dictionary, which consists of constructing a corpus from internet newspapers, extracting words that are not in the existing dictionary along with frequently used compound nouns, attaching meanings to the extracted words, and integrating them into the transfer dictionary. We also develop a tool that supports this expansion. Expanding the dictionary is critical to improving the machine translation system but requires much human effort. The developed tool can be useful for continuously expanding the transfer dictionary, and so it is expected to contribute to enhancing translation quality.
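
The extraction step described in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' tool: the dictionary is modeled as a hypothetical mapping from English words to part-of-speech labels, compound nouns are approximated as adjacent noun-noun pairs, and the two "articles" are invented. Attaching Korean meanings and merging into the transfer dictionary are not covered.

```python
import re
from collections import Counter


def find_dictionary_candidates(texts, dictionary, min_count=2):
    """Return (out-of-vocabulary words, noun-noun pairs) seen at least min_count times."""
    oov = Counter()
    compounds = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        # Words missing from the dictionary are out-of-vocabulary candidates.
        oov.update(t for t in tokens if t not in dictionary)
        # Adjacent known nouns are treated as compound-noun candidates.
        for first, second in zip(tokens, tokens[1:]):
            pair = f"{first} {second}"
            both_nouns = (
                dictionary.get(first) == "noun" and dictionary.get(second) == "noun"
            )
            if both_nouns and pair not in dictionary:
                compounds[pair] += 1

    def frequent(counts):
        return sorted(item for item, n in counts.items() if n >= min_count)

    return frequent(oov), frequent(compounds)


# Hypothetical dictionary (word -> part of speech) and news snippets.
dictionary = {
    "the": "det", "new": "adj", "climate": "noun", "policy": "noun",
    "was": "verb", "announced": "verb", "in": "prep", "is": "verb",
    "popular": "adj",
}
articles = [
    "The new climate policy was announced in the metaverse.",
    "The metaverse climate policy is popular.",
]
new_words, compound_nouns = find_dictionary_candidates(articles, dictionary)
```

Here "metaverse" is reported as a new word and "climate policy" as a frequent compound. A production tool would need real part-of-speech analysis rather than a lookup table.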