• Title/Summary/Keyword: Machine Translation System


Generation of Korean Predicates for Japanese-Korean Machine Translation System and its Evaluation (일-한 기계 번역에 있어서 한국어 술부의 생성과 평가)

  • Kim, Jung-In;Moon, Kyong-Hi;Lee, Jong-Hyeok;Lee, Geun-Bae
    • Annual Conference on Human and Language Technology / 1996.10a / pp.329-337 / 1996
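  • Many researchers working on Japanese-Korean machine translation adopt the direct translation approach in order to make the most of the structural similarities between the two languages, such as the matching word order at the bunsetsu/eojeol level. However, mismatches in corresponding parts of speech and local word-order mismatches between Japanese and Korean predicates remain difficult problems. In this paper, we carry out a more systematic evaluation of our previously proposed "method for generating Korean predicates based on a modality table," which was designed to resolve these mismatches in predicate expressions. The method tabulates abstract, semantic-symbolic modality features that apply only to predicates (the modality table) and uses the table as a pivot between the predicate expressions of the two languages, which enables effective translation of predicate modality expressions. When predicate translation was carried out on 499 Japanese sentences, about 97.7% were confirmed to be translated naturally. In particular, because the predicate generation component relies on a modality table that does not depend on Japanese, it should be applicable to generating Korean predicates not only from Japanese but also from other languages.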


Protocol Conformance Testing of INAP Protocol in SDL (SDL을 사용한 INAP 프로토콜 시험)

  • 도현숙;조준모;김성운
    • Journal of Korea Multimedia Society / v.1 no.1 / pp.109-119 / 1998
  • This paper describes research on the automatic generation of an Abstract Test Suite from a formal specification of the INAP protocol by applying existing algorithms and concepts such as the Rural Chinese Postman Tour and UIO sequences. We use the I/O FSM generated from the SDL specification, and a characterizing sequence, called a UIO sequence, is defined for the I/O FSM. The UIO sequence is combined with the concept of the Rural Chinese Postman Tour to obtain an optimal test sequence. The paper also proposes a methodology for estimating the fault coverage of the Test Suite obtained by our method, and describes its translation into the standardized test notation TTCN.
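
The UIO idea named in this abstract can be illustrated on a toy machine. The sketch below is a brute-force search and not the authors' algorithm; the three-state FSM, its inputs, and its outputs are invented for illustration, and the Rural Chinese Postman step is not covered. A UIO sequence for a state is an input sequence whose output from that state differs from the output produced from every other state.

```python
from itertools import product

# Hypothetical I/O FSM: (state, input) -> (next_state, output).
FSM = {
    ("s0", "a"): ("s1", "0"),
    ("s0", "b"): ("s0", "1"),
    ("s1", "a"): ("s2", "0"),
    ("s1", "b"): ("s0", "0"),
    ("s2", "a"): ("s0", "1"),
    ("s2", "b"): ("s2", "1"),
}
STATES = ["s0", "s1", "s2"]
INPUTS = ["a", "b"]


def run(state, inputs):
    """Return the output string produced when `inputs` is applied from `state`."""
    outputs = []
    for symbol in inputs:
        state, out = FSM[(state, symbol)]
        outputs.append(out)
    return "".join(outputs)


def find_uio(state, max_len=4):
    """Shortest input sequence whose output from `state` is unique among all states."""
    for length in range(1, max_len + 1):
        for seq in product(INPUTS, repeat=length):
            expected = run(state, seq)
            others = (s for s in STATES if s != state)
            if all(run(other, seq) != expected for other in others):
                return "".join(seq)
    return None  # no UIO sequence of length <= max_len


uio = {s: find_uio(s) for s in STATES}
print(uio)
```

In this toy machine, states `s1` and `s2` are identified by a single input, while `s0` needs two, which is why test-sequence length depends on how the UIO sequences are chained together.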


Korean-to-English Machine Translation System based on Verb-Phrase : 'CaptionEye/KE' (용언구에 기반한 한영 기계번역 시스템 : 'CaptionEye/KE')

  • Seo, Young-Ae;Kim, Young-Kil;Seo, Kwang-Jun;Choi, Sung-Kwon
    • Annual Conference of KIPS / 2000.10a / pp.269-272 / 2000
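  • This paper describes CaptionEye/KE, a verb-phrase-based Korean-to-English machine translation system under development at ETRI. CaptionEye/KE translates a whole sentence by combining translations of Korean verb-phrase units, drawing on linguistic knowledge such as a case-frame dictionary, translation patterns, and target-sentence linking patterns extracted from a large, high-quality Korean-English bidirectional corpus. CaptionEye/KE is a transfer-based machine translation system consisting of a Korean morphological analyzer, a Korean syntactic analyzer, a partial-translation linker, a partial-translation generator, a translation selector/refiner, and an English morphological generator. After morphological analysis and tagging of the input Korean sentence, the system analyzes the syntactic structure using the case-frame dictionary and produces a dependency tree. From this dependency tree, it translates the connections between verb phrases using the linking patterns, then translates each verb phrase using the translation patterns, and finally generates the English sentence after a sentence refinement step.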


Spam-mail Filtering based on Lexical Information and Thesaurus (어휘정보와 시소러스에 기반한 스팸메일 필터링)

  • Kang Shin-Jae;Kim Jong-Wan
    • Journal of Korea Society of Industrial Information Systems / v.11 no.1 / pp.13-20 / 2006
  • In this paper, we construct a spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information is the mail sender's information, URLs, and a certain spam keyword list; the less definite information is the word lists and concept codes extracted from the mail body. We first classified spam mail using the definite information and then used the less definite information. We used the lexical information and concept codes contained in the email body for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when concept codes were also included as features.
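
The two-stage flow and the precision/recall measures mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the sender blocklist, keyword lists, and sample mails are invented, and the second stage is a crude word-count stand-in rather than the SVM the paper trains.

```python
# Hypothetical "definite" information: blocked senders and spam keywords.
BLOCKED_SENDERS = {"promo@example.com"}
SPAM_KEYWORDS = {"lottery", "viagra"}


def definite_stage(mail):
    """Stage 1: True when definite information marks the mail as spam."""
    if mail["sender"] in BLOCKED_SENDERS:
        return True
    words = set(mail["body"].lower().split())
    return bool(words & SPAM_KEYWORDS)


def body_stage(mail):
    """Stage 2 stand-in (the paper uses an SVM here): count suspicious body words."""
    suspicious = {"free", "winner", "click"}
    words = mail["body"].lower().split()
    return sum(w in suspicious for w in words) >= 2


def is_spam(mail):
    # Definite information is checked first; body features decide the rest.
    return definite_stage(mail) or body_stage(mail)


def precision_recall(predicted, actual):
    """Spam precision = TP / (TP + FP); spam recall = TP / (TP + FN)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


mails = [
    {"sender": "promo@example.com", "body": "Hello there"},
    {"sender": "a@example.org", "body": "You are a winner click now"},
    {"sender": "b@example.org", "body": "Meeting notes attached"},
    {"sender": "c@example.org", "body": "Free lunch on Friday"},
    {"sender": "d@example.org", "body": "Cheap watches for you"},
]
labels = [True, True, False, False, True]  # True means spam

predictions = [is_spam(m) for m in mails]
precision, recall = precision_recall(predictions, labels)
print(predictions, precision, recall)
```

The last sample is spam that neither stage catches, so recall falls below 1 while precision stays at 1; adding features that catch such cases is the kind of change the abstract associates with higher recall.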


Phrase-Pattern-based Korean-to-English Machine Translation System using Two Level Word Selection (두단계 대역어선택 방식을 이용한 구단위 패턴기반 한영 기계번역 시스템)

  • Kim, Jung-Jae;Park, Jun-Sik;Choi, Key-Sun
    • Annual Conference on Human and Language Technology / 1999.10e / pp.209-214 / 1999
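  • Pattern-based machine translation performs syntactic analysis and transfer using pairs of source-language patterns and their corresponding target-language patterns. Because pattern-based machine translation lengthens patterns up to the sentence level in order to resolve the ambiguity that arises during translation, it suffers from a rapid increase in the number of patterns. In this paper, we propose a thesaurus-based two-level word selection method to resolve the ambiguity that arises when patterns are restricted to the phrase level, and thereby present a model that shortens patterns while effectively reducing ambiguity. In two-level word selection, when one source-language pattern has several possible target-language patterns, the first level selects the target patterns that satisfy constraints within the source language, and the second level selects the single target pattern that best fits the constraints within the target language. We also discuss whether the number of patterns in this model can converge as the corpus grows.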
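
The two-level selection described in this abstract can be sketched on a toy example. Everything below is hypothetical: the thesaurus entries, the candidate patterns for the source pattern "N을 먹다", and the target-side fit scores are invented for illustration and are not taken from the paper.

```python
# Toy thesaurus: source noun -> concept codes (hypothetical entries).
THESAURUS = {
    "약": {"medicine", "substance"},
    "밥": {"food", "substance"},
    "나이": {"age"},
}
NOUN_TRANSLATION = {"약": "medicine", "밥": "rice", "나이": "age"}

# Candidate target patterns for the source pattern "N을 먹다".
CANDIDATES = [
    {"target": "eat {n}", "verb": "eat", "source_concepts": {"food", "substance"}},
    {"target": "take {n}", "verb": "take", "source_concepts": {"medicine", "substance"}},
    {"target": "get older", "verb": "get", "source_concepts": {"age"}},
]

# Hypothetical target-side fit scores for (verb, noun) pairs.
TARGET_FIT = {
    ("eat", "rice"): 0.9,
    ("take", "rice"): 0.2,
    ("eat", "medicine"): 0.1,
    ("take", "medicine"): 0.9,
}


def level_one(noun):
    """Level 1: keep candidates whose source-side constraint matches the noun's concepts."""
    concepts = THESAURUS[noun]
    return [c for c in CANDIDATES if c["source_concepts"] & concepts]


def level_two(noun, candidates):
    """Level 2: pick the single candidate that best fits on the target side."""
    target_noun = NOUN_TRANSLATION[noun]
    best = max(candidates, key=lambda c: TARGET_FIT.get((c["verb"], target_noun), 0.0))
    return best["target"].format(n=target_noun)


def translate(noun):
    return level_two(noun, level_one(noun))


for noun in ("약", "밥", "나이"):
    print(noun, "->", translate(noun))
```

For "약", the first level leaves two candidates because both accept a noun with the `substance` concept, and the second level resolves the choice using target-language fit alone.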


Building an Annotated English-Vietnamese Parallel Corpus for Training Vietnamese-related NLPs

  • Dien Dinh;Kiem Hoang
    • Proceedings of the IEEK Conference / summer / pp.103-109 / 2004
  • In NLP (Natural Language Processing) tasks, the greatest difficulty computers face is the built-in ambiguity of natural languages. Formerly, disambiguation relied on human-devised rules. Building a complete rule set is a time-consuming and labor-intensive task, yet it still does not cover all cases. Moreover, as the scale of a system increases, the rule set becomes very difficult to control. Recently, therefore, many NLP tasks have shifted from rule-based approaches to corpus-based approaches using large annotated corpora. Corpus-based NLP tasks for widely studied languages such as English and French have been well studied, with satisfactory results. In contrast, corpus-based NLP tasks for Vietnamese are at a deadlock due to the absence of annotated training data. Furthermore, hand annotation of even reasonably well-determined features such as part-of-speech (POS) tags has proved to be labor-intensive and costly. In this paper, we present the construction of an annotated, aligned English-Vietnamese parallel corpus named EVC for training Vietnamese-related NLP tasks such as word segmentation, POS tagging, word-order transfer, word sense disambiguation, and English-to-Vietnamese machine translation.


Deep Learning in Radiation Oncology

  • Cheon, Wonjoong;Kim, Haksoo;Kim, Jinsung
    • Progress in Medical Physics / v.31 no.3 / pp.111-123 / 2020
  • Deep learning (DL) is a subset of machine learning and artificial intelligence that uses deep neural networks, whose structure resembles the human neural system, trained on big data. DL narrows the gap between data acquisition and meaningful interpretation without explicit programming. It has so far outperformed most classification and regression methods and can automatically learn data representations for specific tasks. The application areas of DL in radiation oncology include classification, semantic segmentation, object detection, image translation and generation, and image captioning. This article examines the potential role of DL in radiation oncology and what more can be achieved by using it. With the advances in DL, various studies contributing to the development of radiation oncology were investigated comprehensively. In this article, the radiation treatment process was divided, in terms of workflow, into six consecutive stages: patient assessment, simulation, target and organs-at-risk segmentation, treatment planning, quality assurance, and beam delivery. Studies using DL were classified and organized according to each stage of the radiation treatment process. State-of-the-art studies were identified, and their clinical utility was examined. DL models could provide faster and more accurate solutions to problems faced by oncologists. While the effect of a data-driven approach on improving the quality of care for cancer patients is clear, implementing these methods will require cultural changes at both the professional and institutional levels. We believe this paper will serve as a guide for both clinicians and medical physicists on issues that need to be addressed in time.

An Automatic Extraction of English-Korean Bilingual Terms by Using Word-level Presumptive Alignment (단어 단위의 추정 정렬을 통한 영-한 대역어의 자동 추출)

  • Lee, Kong Joo
    • KIPS Transactions on Software and Data Engineering / v.2 no.6 / pp.433-442 / 2013
  • A set of bilingual terms is one of the most important resources for building language-related applications such as a machine translation system and a cross-lingual information system. In this paper, we introduce a new approach that automatically extracts candidate English-Korean bilingual terms by using a bilingual parallel corpus and a basic English-Korean lexicon. This approach can be useful even when the parallel corpus is small. Sentence alignment is first performed on the document-level parallel corpus. Words in a pair of aligned sentences are then aligned by referencing the basic bilingual lexicon. For words that remain unaligned in a pair of aligned sentences, several assumptions are applied in order to align bilingual term candidates of the two languages. Examples of the assumptions include the location of a sentence, the relation between words, and linguistic information relating the two languages. An experimental result shows approximately 71.7% accuracy for the English-Korean bilingual term candidates automatically extracted from a bilingual parallel corpus of size 1,000.
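
The lexicon-based alignment step in this abstract can be sketched as follows. The lexicon entries and the sample sentence pair are hypothetical, and the single assumption used for leftover words (pair them when exactly one word remains on each side) is a simplification introduced here; the paper's own assumptions involve sentence location, word relations, and linguistic information.

```python
# Hypothetical basic English-Korean lexicon.
LEXICON = {
    "machine": {"기계"},
    "translation": {"번역"},
    "system": {"시스템"},
}


def align(en_words, ko_words):
    """Align words by lexicon lookup; return (aligned pairs, unaligned en, unaligned ko)."""
    aligned, used = [], set()
    for en in en_words:
        for j, ko in enumerate(ko_words):
            if j not in used and ko in LEXICON.get(en, set()):
                aligned.append((en, ko))
                used.add(j)
                break
    aligned_en = {en for en, _ in aligned}
    rest_en = [w for w in en_words if w not in aligned_en]
    rest_ko = [w for j, w in enumerate(ko_words) if j not in used]
    return aligned, rest_en, rest_ko


def term_candidates(en_words, ko_words):
    """Simplifying assumption: one leftover word per side forms a candidate pair."""
    _, rest_en, rest_ko = align(en_words, ko_words)
    if len(rest_en) == 1 and len(rest_ko) == 1:
        return [(rest_en[0], rest_ko[0])]
    return []


pairs = term_candidates(
    ["statistical", "machine", "translation"],
    ["통계", "기계", "번역"],
)
print(pairs)
```

Words already covered by the lexicon anchor the sentence pair, so the candidate pair comes only from what the lexicon could not explain.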

Optical Error Analysis and Compensation of Six Degrees of Freedom Measurement System Using a Diffraction Grating Target (회절 격자 표식을 이용한 6자유도 측정 시스템의 광학적 오차 해석 및 보상)

  • Kim, Jong-Ahn;Bae, Eui-Won;Kim, Soo-Hyun;Kwak, Yoon-Keun
    • Journal of the Korean Society for Precision Engineering / v.18 no.2 / pp.152-160 / 2001
  • Six degrees of freedom measurement systems are required in many fields: precision machine control, precision assembly, vibration analysis, and so on. This paper presents a new six degrees of freedom measurement system that exploits typical features of a diffraction grating. It is composed of a laser source, three position-sensitive detectors, a diffraction grating target, and several optical components. Six degrees of freedom displacement is calculated kinematically from the coordinates of the diffracted rays on the detectors. An optical measurement error arose because the laser source had a Gaussian intensity distribution. This error was analyzed and compensated using simple equations. The performance of the compensation equation was verified experimentally. The experimental results showed that the compensation equation reduced the optical measurement error remarkably, bringing the error in six degrees of freedom measurement to less than ±10 μm for translation and ±0.012° for rotation.


Gaze Detection by Computing Facial and Eye Movement (얼굴 및 눈동자 움직임에 의한 시선 위치 추적)

  • 박강령
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.2 / pp.79-88 / 2004
  • Gaze detection uses computer vision to locate the position on a monitor screen at which a user is looking. Gaze detection systems have numerous fields of application: they are applicable to man-machine interfaces that help people with disabilities use computers, and to view control in three-dimensional simulation programs. In our work, we implement gaze detection with a computer vision system that uses a single IR-LED-based camera. To detect the gaze position, we locate facial features, which is performed effectively with the IR-LED-based camera and an SVM (Support Vector Machine). When a user gazes at a position on the monitor, we compute the 3D positions of those features based on 3D rotation and translation estimation and an affine transform. The gaze position due to facial movement is then computed from the normal vector of the plane determined by the computed 3D feature positions. In addition, we use a trained neural network to detect the gaze position due to eye movement. In experiments, we obtain the facial and eye gaze position on a monitor, and the RMS error between the computed gaze positions and the real ones is about 4.8 cm.
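
The geometric step in this abstract, computing a gaze position from the normal vector of the plane through 3D facial features, can be sketched as follows. The coordinate frame is an assumption made here: the monitor is taken to be the plane z = 0, units are centimeters, and the gaze ray is taken to start at the centroid of three feature points. The sample coordinates are invented, and the RMS helper only illustrates how an error figure of that kind is computed.

```python
import math


def sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(u, v):
    """Cross product of two 3D vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def gaze_point(p1, p2, p3):
    """Intersect the facial-plane normal through the centroid with the plane z = 0."""
    normal = cross(sub(p2, p1), sub(p3, p1))
    centroid = tuple((p1[i] + p2[i] + p3[i]) / 3 for i in range(3))
    if normal[2] == 0:
        return None  # the normal is parallel to the monitor plane
    t = -centroid[2] / normal[2]
    return (centroid[0] + t * normal[0], centroid[1] + t * normal[1])


def rms_error(estimated, actual):
    """Root-mean-square distance between estimated and actual 2D gaze positions."""
    squared = [
        (ex - ax) ** 2 + (ey - ay) ** 2
        for (ex, ey), (ax, ay) in zip(estimated, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical feature points on a face that is parallel to the monitor, 50 cm away.
point = gaze_point((-3.0, 0.0, 50.0), (3.0, 0.0, 50.0), (0.0, 4.0, 50.0))
print(point)
print(rms_error([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)]))
```

When the face is parallel to the monitor, the normal points straight at the screen and the gaze point is the projection of the centroid; rotating the face tilts the normal and moves the intersection point accordingly.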