• Title/Summary/Keyword: Translation Domain Adaptation

Search Result 7

Environment for Translation Domain Adaptation and Continuous Improvement of English-Korean Machine Translation System

  • Kim, Sung-Dong;Kim, Namyun
    • International Journal of Internet, Broadcasting and Communication / v.12 no.2 / pp.127-136 / 2020
  • This paper presents an environment for a rule-based English-Korean machine translation system that supports translation domain adaptation and continuous improvement of translation quality. Both purposes require a corpus, from which the information needed for translation is acquired. The environment consists of a corpus construction part and a translation knowledge extraction part. The corpus construction part crawls news articles from newspaper sites. The extraction part builds translation knowledge such as newly created words, compound words, collocation information, and distributional word representations. For translation domain adaptation, a corpus for the domain should be built and the translation knowledge constructed from it. For continuous improvement, the corpus needs to be continuously expanded and the translation knowledge enhanced from the expanded corpus. The proposed web-based environment is expected to facilitate domain adaptation and translation system improvement.

Domain Adaptation Method for LHMM-based English Part-of-Speech Tagger (LHMM기반 영어 형태소 품사 태거의 도메인 적응 방법)

  • Kwon, Oh-Woog;Kim, Young-Gil
    • Journal of KIISE:Computing Practices and Letters / v.16 no.10 / pp.1000-1004 / 2010
  • Many current language processing systems use a part-of-speech tagger for preprocessing, and most require a tagger with the highest possible accuracy. In particular, exploiting domain-specific advantages to improve translation quality has become a hot issue in the machine translation community. This paper addresses a method for customizing an HMM- or LHMM-based English tagger from the general domain to a specific domain. The proposed method semi-automatically customizes the output and transition probabilities of the HMM or LHMM using a domain-specific raw corpus. In experiments customizing to the patent domain, our LHMM tagger adapted by the proposed method achieves a word tagging accuracy of 98.87% and a sentence tagging accuracy of 78.5%. Compared with the general tagger, it improves word tagging accuracy by 2.24% (ERR: 66.4%) and sentence tagging accuracy by 41.0% (ERR: 65.6%).
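
  The ERR figures in this abstract can be checked with a little arithmetic. The sketch below assumes (the abstract does not state this) that the reported gains are absolute percentage points and that ERR means the relative reduction in error rate; the function name `error_reduction_rate` is hypothetical. Under those assumptions the results land within about 0.1 point of the reported values, which is consistent with rounding.

```python
def error_reduction_rate(baseline_acc: float, adapted_acc: float) -> float:
    """Relative reduction in error rate (percent), given accuracies in percent."""
    baseline_err = 100.0 - baseline_acc
    adapted_err = 100.0 - adapted_acc
    return 100.0 * (baseline_err - adapted_err) / baseline_err


# Figures taken from the abstract: adapted accuracy and gain over the general tagger.
# Assumption: each gain is an absolute difference in percentage points.
word_adapted, word_gain = 98.87, 2.24
sent_adapted, sent_gain = 78.5, 41.0

# The general tagger's accuracy is recovered by subtracting the gain.
word_err = error_reduction_rate(word_adapted - word_gain, word_adapted)
sent_err = error_reduction_rate(sent_adapted - sent_gain, sent_adapted)

print(f"word ERR: {word_err:.2f}%")
print(f"sentence ERR: {sent_err:.2f}%")
```

  Recovering the baseline from the gain, rather than hard-coding it, keeps every input traceable to a number that appears in the abstract.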

Comparison of structure, function and regulation of plant cold shock domain proteins to bacterial and animal cold shock domain proteins

  • Chaikam, Vijay;Karlson, Dale T.
    • BMB Reports / v.43 no.1 / pp.1-8 / 2010
  • The cold shock domain (CSD) is among the most ancient and well-conserved nucleic acid binding domains, found from bacteria to higher animals and plants. The CSD facilitates binding to RNA, ssDNA and dsDNA, and most functions attributed to cold shock domain proteins are mediated by this nucleic acid binding activity. In prokaryotes, cold shock domain proteins contain only a single CSD and are termed cold shock proteins (Csps). In animal model systems, various auxiliary domains are present in addition to the CSD, and the proteins are commonly named Y-box proteins. Similar to animal CSPs, plant CSPs contain auxiliary C-terminal domains in addition to their N-terminal CSD. Cold shock domain proteins have been shown to play important roles in development and stress adaptation in a wide variety of organisms. In this review, the structure, function and regulation of plant CSPs are compared and contrasted with the characteristics of bacterial and animal CSPs.

Korean Semantic Role Labeling Using Domain Adaptation Technique (도메인 적응 기술을 이용한 한국어 의미역 인식)

  • Lim, Soojong;Bae, Yongjin;Kim, Hyunki
    • Annual Conference on Human and Language Technology / 2014.10a / pp.56-60 / 2014
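  • Natural language analysis based on machine learning requires training data. When applied to a domain other than the source domain for which the training data was built, Korean semantic role labeling suffers a performance drop of about 15%. To overcome this drop, this paper proposes a prior model for Korean semantic role labeling that leverages the existing source-domain training data so that the drop is minimized with only a small amount of target-domain training data, and compares it experimentally with existing domain adaptation algorithms. In addition, among the features used in the training data, the feature values of the morpheme tags and syntactic tags were applied in a simpler form than before, and the resulting change in performance was examined.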

The Korean language version of Stroke Impact Scale 3.0: Cross-cultural adaptation and translation

  • Lee, Hae-jung;Song, Ju-min
    • Journal of the Korean Society of Physical Medicine / v.10 no.3 / pp.47-55 / 2015
  • PURPOSE: Stroke is one of the most common disabling conditions, yet tools for measuring patients' functioning level are still lacking. The aim of the study was to develop a Korean language version of the Stroke Impact Scale 3.0. METHODS: The Korean version of the Stroke Impact Scale 3.0 was developed in idiomatic modern Korean using a standard protocol of multiple forward and backward translations and expert review to achieve equivalence with the original English version. Interviews with clinicians currently managing patients with stroke were also conducted for language evaluation. A reliability test was performed on a pre-final version to make the final adaptation. To assess the reliability of the translated questionnaire, the intraclass correlation coefficient (ICC) was calculated for each domain of the scale. RESULTS: Thirty subjects (16 male, 14 female) aged 20 to 75 years participated in reviewing the translated questionnaire. Reliability was good for the domains of strength (ICC=0.74), ADL (ICC=0.81), mobility (ICC=0.90), hand function (ICC=0.80), social participation (ICC=0.79), and communication (ICC=0.77), and for the total (ICC=0.76). However, the domains of memory and thinking (ICC=0.66) and emotion (ICC=0.27) showed poor reliability. CONCLUSION: This study indicates that the Korean version of SIS 3.0 was successfully developed. Further study is needed to establish the validity of the Korean version of SIS 3.0.

A Study of Semantic Role Labeling using Domain Adaptation Technique for Question (도메인 적응 기술 기반 질문 문장에 대한 의미역 인식 연구)

  • Lim, Soojong;Kim, Hyunki
    • Annual Conference on Human and Language Technology / 2015.10a / pp.246-249 / 2015
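  • Natural language analysis based on machine learning requires training data. When applied to a domain other than the source domain for which the training data was built, Korean semantic role labeling suffers a performance drop of about 10%. This paper applies existing domain adaptation techniques to the case where both the domain and the sentence form differ, and proposes a method for improving performance on Korean question sentences, as needed for semantic role labeling of questions in question answering systems, with only a small amount of training data for question sentences. A prior model is proposed for Korean semantic role labeling. In experiments, the proposed method improved performance by 9.42 over using only source-domain data, and by 2.64 over training on the simple union of source- and target-domain data.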

Mechanism of the natural product moracin-O derived MO-460 and its targeting protein hnRNPA2B1 on HIF-1α inhibition

  • Soung, Nak-Kyun;Kim, Hye-Min;Asami, Yukihiro;Kim, Dong Hyun;Cho, Yangrae;Naik, Ravi;Jang, Yerin;Jang, Kusic;Han, Ho Jin;Ganipisetti, Srinivas Rao;Cha-Molstad, Hyunjoo;Hwang, Joonsung;Lee, Kyung Ho;Ko, Sung-Kyun;Jang, Jae-Hyuk;Ryoo, In-Ja;Kwon, Yong Tae;Lee, Kyung Sang;Osada, Hiroyuki;Lee, Kyeong;Kim, Bo Yeon;Ahn, Jong Seog
    • Experimental and Molecular Medicine / v.51 no.2 / pp.1.1-1.14 / 2019
  • Hypoxia-inducible factor-1α (HIF-1α) mediates tumor cell adaptation to hypoxic conditions and is a potentially important anticancer therapeutic target. We previously developed a method for synthesizing a benzofuran-based natural product, (R)-(-)-moracin-O, and obtained a novel potent analog, MO-460, that suppresses the accumulation of HIF-1α in Hep3B cells. However, the molecular target and underlying mechanism of action of MO-460 remained unclear. In the current study, we identified heterogeneous nuclear ribonucleoprotein A2B1 (hnRNPA2B1) as a molecular target of MO-460. MO-460 inhibits the initiation of HIF-1α translation by binding to the C-terminal glycine-rich domain of hnRNPA2B1 and inhibiting its subsequent binding to the 3'-untranslated region of HIF-1α mRNA. Moreover, MO-460 suppresses HIF-1α protein synthesis under hypoxic conditions and induces the accumulation of stress granules. The data provided here suggest that hnRNPA2B1 serves as a crucial molecular target in hypoxia-induced tumor survival and thus offer an avenue for the development of novel anticancer therapies.