• Title/Abstract/Keywords: Translation error

139 search results (processing time: 0.021 s)

Explaining the Translation Error Factors of Machine Translation Services Using Self-Attention Visualization

  • 장청롱;안현철
    • 한국IT서비스학회지 / Vol. 21, No. 2 / pp. 85-95 / 2022
  • This study analyzed the translation error factors of machine translation services such as Naver Papago and Google Translate through Self-Attention path visualization. Self-Attention is a key mechanism of the Transformer and BERT NLP models and has recently been widely used in machine translation. We propose a method to explain the translation error factors of machine translation algorithms by comparing the Self-Attention paths of ST (source text) and ST' (a transformed ST whose meaning is unchanged but whose translation output is more accurate). This method provides explainability for analyzing the internal process of a machine translation algorithm, which is otherwise invisible like a black box. In our experiment, it was possible to explore the factors that caused translation errors by analyzing the differences in the attention paths of keywords. The study used the XLM-RoBERTa multilingual NLP model provided by exBERT for Self-Attention visualization and applied it to two examples, one of Korean-Chinese translation and one of Korean-English translation.

Automatic Post Editing Research

  • 박찬준;임희석
    • 한국융합학회논문지 / Vol. 11, No. 5 / pp. 1-8 / 2020
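  • Machine translation refers to a system in which a computer translates a source sentence into a target sentence. Machine translation has various subfields, and APE (Automatic Post Editing) is the subfield that corrects the output of a machine translation system to produce a better translation. In other words, it is the process of producing a corrected sentence by fixing the errors contained in the translation generated by a machine translation system. It is a research area that raises translation quality by correcting the output sentences of the machine translation system rather than changing the machine translation model. It has been a WMT shared task since 2015, and performance is evaluated using TER (Translation Error Rate). As a result, various studies on APE models have recently been published, and this paper covers the latest trends in the APE field.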
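
The abstract above names TER (Translation Error Rate) as the APE evaluation metric. As a rough illustration only, the sketch below computes a simplified TER: word-level edit distance (insertions, deletions, substitutions) divided by the reference length. It is an assumption-laden approximation, since full TER also counts block shifts, which are omitted here; the function names and the example sentences are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance over word tokens (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis, reference):
    """Edits needed to turn the hypothesis into the reference, per reference word.

    Simplification (assumption): block shifts, which full TER counts as
    single edits, are not modelled here.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


# Hypothetical example: raw MT output versus its post-edited version.
mt_output = "the cat sat on mat"
post_edit = "the cat sat on the mat"
print(simplified_ter(mt_output, post_edit))
```

In this example one word is missing from the MT output and the reference has six words, so the score is 1/6; a lower score means less post-editing effort.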

A Quality Comparison of English Translations of Korean Literature between Human Translation and Post-Editing

  • LEE, IL-JAE
    • International Journal of Advanced Culture Technology / Vol. 6, No. 4 / pp. 165-171 / 2018
  • As artificial intelligence (AI) plays a crucial role in machine translation (MT), which has loomed large as a new translation paradigm, concerns have arisen about whether MT can produce a product of the same quality as human translation (HT). In fact, several experimental MT studies report cases in which the MT product, called post-editing (PE), is equal or often superior to HT ([1],[2],[6]). Motivated by those studies on translation quality between HT and PE, this study set up an experiment in which Korean literature was translated into English by 3 translators and 3 post-editors for comparison. Afterwards, a group of 3 other Koreans checked the accuracy of HT and PE, and a group of 3 native English speakers scored the fluency of HT and PE. The findings are: (1) HT took at least twice as long as PE. (2) Both HT and PE produced similar error types; Mistranslation and Omission were the major errors for accuracy, and Grammar for fluency. (3) HT turned out to be inferior to PE in both accuracy and fluency.

Classification-Based Approach for Hybridizing Statistical and Rule-Based Machine Translation

  • Park, Eun-Jin;Kwon, Oh-Woog;Kim, Kangil;Kim, Young-Kil
    • ETRI Journal / Vol. 37, No. 3 / pp. 541-550 / 2015
  • In this paper, we propose a classification-based approach for hybridizing statistical machine translation and rule-based machine translation. Both the training dataset used to train our proposed classifier and our feature extraction method affect the hybridization quality. To create such a training dataset, a previous approach used auto-evaluation metrics to determine, by comparison, which of a set of component machine translation (MT) systems gave the more accurate translation. The most accurate translation was then labelled to indicate the MT system from which it came. In this previous approach, when the metric evaluation scores were low, there was a high level of uncertainty as to which of the component MT systems was actually producing the better translation. To reduce such uncertainty or error in classification, we propose an alternative approach to such labeling, namely a cut-off method. In our experiments, using this cut-off method in our proposed classifier, we achieved a translation accuracy of 81.5%, a 5.0% improvement over existing methods.
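
The abstract above does not spell out how the cut-off is applied, so the following is only one plausible reading, sketched under stated assumptions: each source sentence is labelled with the component MT system whose output scored highest on an automatic metric, and sentences whose best score falls below a cut-off are dropped from the classifier's training data as too uncertain. The function name, the system names `smt` and `rbmt`, the scores, and the threshold are all hypothetical.

```python
def label_training_examples(scored_outputs, cutoff):
    """Label each sentence with its best-scoring MT system, skipping uncertain ones.

    scored_outputs: list of dicts mapping system name -> metric score,
                    one dict per source sentence (hypothetical format).
    cutoff:         minimum best score for a sentence to be kept (assumption).
    Returns a list of (sentence_index, best_system) pairs.
    """
    labelled = []
    for index, scores in enumerate(scored_outputs):
        best_system = max(scores, key=scores.get)
        if scores[best_system] < cutoff:
            # All systems scored low, so the comparison is unreliable: skip it.
            continue
        labelled.append((index, best_system))
    return labelled


# Hypothetical metric scores for three source sentences.
scores_per_sentence = [
    {"smt": 0.62, "rbmt": 0.48},
    {"smt": 0.12, "rbmt": 0.15},  # both low: dropped by the cut-off
    {"smt": 0.30, "rbmt": 0.71},
]
print(label_training_examples(scores_per_sentence, cutoff=0.25))
```

The design choice illustrated here is that low-scoring sentence pairs contribute noisy labels, so filtering them trades training-set size for label reliability.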

Classification and analysis of error types for deep learning-based Korean spelling correction

  • 구선민;박찬준;소아람;임희석
    • 한국융합학회논문지 / Vol. 12, No. 12 / pp. 65-74 / 2021
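  • Recently, research on Korean spelling correction has been actively conducted based on machine translation technology and automatic noise generation methodologies. These methodologies generate noise and use it as both the training set and the data set. This has the limitation that noise other than the kind used in training is unlikely to appear in the test set, which makes accurate performance measurement difficult. In addition, because there is no practical standard for classifying error types, the error types used differ from study to study, which makes qualitative analysis difficult. To address this, this paper proposes a new 'error type classification scheme' for deep learning-based Korean spelling correction research and, on that basis, performs an error analysis of existing commercial Korean spelling correctors (System A, System B, System C). The analysis shows that the three correction systems did not correct the error types presented in this paper well, apart from spacing errors, and that they hardly recognized word-order errors or tense errors at all.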

Development of Stereoscopic PIV Measurement Technique and Its Application to Wake behind an Axial Fan

  • 윤정환;이상준
    • 대한기계학회논문집B / Vol. 26, No. 2 / pp. 362-373 / 2002
  • A stereoscopic PIV (SPIV) measurement system based on the translation configuration was developed and applied to the flow behind a forward-swept axial fan. Measurement of three orthogonal velocity components is essential for the analysis of three-dimensional flows such as the flow around a fan or propeller. In this study, the translation configuration was adopted to calculate the out-of-plane velocity component from 2-D PIV data obtained from two CCD cameras. The error caused by the out-of-plane motion was estimated by direct comparison of the 2-D PIV and 3-D SPIV results measured from particle images captured simultaneously. The comparison shows that the error ratio is relatively high in the region of higher out-of-plane motion near the axial fan blade. The turbulence intensity measured by the 2-D PIV method is larger than that of the 3-D SPIV method by up to about 5.8%. The phase-averaged velocity field results show that the wake behind an axial fan has a periodic flow structure with respect to the blade phase and that the characteristic flow structure is shifted downstream in the next phase.

A Study on PC-NC Based Aspherical Lens Polishing System with Minimum Translation Mechanism

  • 양민양;이호철
    • 한국정밀공학회지 / Vol. 18, No. 7 / pp. 65-71 / 2001
  • The development process of a polishing system for aspherical lens molds for the opto-electronics industry is described. The system scans a polishing tool over the surface of the aspherical lens mold under PC-NC control. Based on motion analysis, two-axis interpolation of the minimum translation mechanism is applied to give uniform working conditions. The aspherical surface is divided into multiple sections, and the dwell time for each section is calculated from a polishing rate model based on the Preston equation. In the form error compensation experiment, the initial form error decreased by about 25%, while the average surface roughness was reduced from 180 nm to 19 nm.
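
The abstract above mentions per-section dwell times derived from a Preston-based polishing rate model but gives no formula. As a minimal sketch under assumptions, the Preston equation is commonly written as removal rate = k_p · P · V (Preston coefficient times contact pressure times relative velocity), so the dwell time for a section is the depth to be removed divided by that rate. The function name, units, and numbers below are hypothetical and are not taken from the paper.

```python
def dwell_times(removal_depths_nm, preston_coeff, pressure_kpa, velocity_m_s):
    """Dwell time per surface section from the Preston removal-rate model.

    removal rate [nm/s] = preston_coeff * pressure_kpa * velocity_m_s
    dwell time   [s]    = depth to remove [nm] / removal rate

    Units and the constant-rate assumption are illustrative only.
    """
    rate = preston_coeff * pressure_kpa * velocity_m_s
    if rate <= 0:
        raise ValueError("removal rate must be positive")
    # Sections that are already at or below target need no polishing time.
    return [max(depth, 0.0) / rate for depth in removal_depths_nm]


# Hypothetical form error (material to remove) for three sections, in nm.
form_error_nm = [100.0, 50.0, 0.0]
times = dwell_times(form_error_nm, preston_coeff=0.5, pressure_kpa=20.0, velocity_m_s=0.2)
print(times)
```

With these made-up values the removal rate is 2 nm/s, so the sections need about 50 s, 25 s, and 0 s of dwell respectively; sections with larger form error simply receive proportionally longer dwell.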

Registration Error Compensation for Face Recognition Using Eigenface

  • 문지혜;이병욱
    • 한국통신학회논문지 / Vol. 30, No. 5C / pp. 364-370 / 2005
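  • Face recognition requires a registration step in which, after a face is detected in the input image, its position and size are matched to those of the images in the database. This paper proposes an algorithm that compensates in eigenspace for the translation, rotation, or scale differences of the face image that arise during image registration. To this end, registration errors such as vertical and horizontal translation, rotation, and scaling of the face image are approximated by linear interpolation matrices. By using each transformation matrix to obtain the derivative with respect to the registration error in eigenspace, registration errors can be compensated at the subpixel level. The proposed method requires far less computation than compensating for the error in the spatial domain. Experiments confirmed that the face recognition rate improves greatly after error compensation.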

A study on pattern recognition using DCT and neural network

  • 이명길;이주신
    • 한국통신학회논문지 / Vol. 22, No. 3 / pp. 481-492 / 1997
  • This paper presents an algorithm for recognizing surface mount device (SMD) IC patterns based on an error back-propagation (EBP) neural network and the discrete cosine transform (DCT). In this approach, we chose parameters such as frequency, angle, translation, and amplitude as the shape information of the SMD IC; these are calculated from the DCT coefficient matrix. The feature parameters are normalized and then used as the input vector of the neural network, which can adapt to changes in the surroundings such as variation of illumination, arrangement of objects, and translation. Learning of the EBP neural network is carried out until the maximum error of the output layer is less than 0.020; the maximum error reached this value after forty thousand learning iterations. Experimental results show that the recognition rate is 100% for random patterns taken under similar circumstances as well as for the normalized training patterns. They also show that the proposed method is not only relatively simple compared with the traditional space-domain method in extracting the feature parameters but is also able to recognize the pattern's class, position, and existence.

Determination of Camera System Orientation and Translation in Cartesian Coordinate

  • 이용중
    • 한국공작기계학회:학술대회논문집 / 한국공작기계학회 2000년도 춘계학술대회논문집 - 한국공작기계학회 / pp. 109-114 / 2000
  • This paper presents a new method for determining camera system rotation and translation in 3-D space using the recursive least squares method. With this method, the equations are solved by a linear algorithm; the equations are either given or obtained from five or more point correspondences. Good results can be obtained when more than eight points are available. A main advantage of this new method is that it decouples rotation and translation and thereby reduces computation. With respect to solution error as a function of the number of points in the input image data, adding one more feature correspondence to the required minimum number improves the solution accuracy drastically. However, further increases in the number of feature correspondences improve the solution accuracy only slowly. The algorithm proposed in this paper makes camera system rotation and translation easy to recognize even when a camera system attached to the end effector of a six-degree-of-freedom industrial robot manipulator is applied in the industrial field.
