• Title/Summary/Keyword: 대역어 부분일치 (translation partial matching)

Automatic Recognition of Translation Phrases Enclosed with Parenthesis in Korean-English Mixed Documents (한영 혼용문에서 괄호 안 대역어구의 자동 인식)

  • Lee, Jae-Sung; Seo, Young-Hoon
    • The KIPS Transactions: Part B / v.9B no.4 / pp.445-452 / 2002
  • In Korean-English mixed documents, translated technical terms are usually accompanied by their full forms or original words enclosed in parentheses. This paper presents a collective method for recognizing and extracting such translation phrases using a base translation dictionary. To handle title words and translation words that are not registered in the dictionary, three methods are newly proposed: phonetic similarity matching, translation partial matching, and compound word matching. Each method was evaluated by F-measure (with alpha set to 0.4): exact matching of dictionary terms, used as the baseline, scored 23.8%; the hybrid of translation partial matching and phonetic similarity matching scored 75.9%; and compound word matching combined with the hybrid method scored 77.3%, which is 3.25 times the baseline score.
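The abstract reports F-measure with alpha set to 0.4 but does not give the formula. The sketch below assumes the common van Rijsbergen form, F = 1 / (alpha/P + (1 - alpha)/R), which may differ from the exact definition used in the paper. The function name `f_measure` and the example precision and recall values are hypothetical; only the three reported scores come from the abstract.

```python
def f_measure(precision: float, recall: float, alpha: float = 0.4) -> float:
    """Weighted harmonic mean of precision and recall.

    Assumed form (van Rijsbergen): F = 1 / (alpha / P + (1 - alpha) / R).
    With alpha = 0.5 this reduces to the usual F1 = 2PR / (P + R).
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


# Hypothetical precision and recall, used only to exercise the function.
example_score = f_measure(precision=0.8, recall=0.7)

# F-measure scores reported in the abstract, in percent.
baseline, hybrid, compound = 23.8, 75.9, 77.3

# Ratio of the best reported score to the baseline score.
ratio = compound / baseline
print(f"{ratio:.2f}")
```

Under this assumed form, an alpha below 0.5 puts slightly more weight on recall than on precision. The printed ratio corresponds to the "3.25 times" figure quoted in the abstract.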

The Problem of word-for-word Translating English Thesaurus Terms into Korean (영.한 대역 시소러스의 문제점에 관한 연구)

  • Oh, Jae-Ik
    • Journal of Information Management / v.28 no.3 / pp.46-73 / 1997
  • Many recent studies aim to improve the effectiveness of retrieval systems, and one approach is to use a thesaurus in such systems. However, building a thesaurus independently is very difficult and requires a great deal of time and labor. The goal of this research is to examine the problems that arise when a thesaurus is instead established by translation. The texts examined were the "Thesaurus of ERIC Descriptors (7th ed., 1977)" and the "KEID Thesaurus (1981)", which was established by translation. The results were as follows. First, there were many differences between English and Korean in the equivalence classes of terms. Second, translation errors made some words inadequate for use as descriptors. Third, because of differences in tradition, thought, and culture, associative and hierarchical relationships did not coincide between terms. Fourth, because the terms in the Thesaurus of ERIC Descriptors serve Western usage, terms used in Korea were missing or insufficient, so it was necessary to add our own terms. As these results show, translating a thesaurus for use as a documentation language for indexing and retrieval is not the optimum way of building a thesaurus.