• Title/Summary/Keyword: target language

Resolving Multi-Translatable Verbs in Japanese-to-Korean Machine Translation

  • Kim Jung-In;Lee Kang-Hyuk
    • Journal of Korea Multimedia Society / v.8 no.6 / pp.790-797 / 2005
  • It is well known that Japanese and Korean have many similarities. For example, word order and the nature of grammatical conjugation are almost the same in both languages. Another similarity is the frequent omission of the subject from a sentence. Moreover, both languages have honorific expressions and share the same approach to expressing nouns in Chinese characters. Using these similarities, we have developed a word-to-word translation system that does away with any deep analysis of the syntactic and semantic structures of the two languages. Given these similarities, the direct translation method is superior to the interlingua-based or transfer-based translation methods. Although an MT system based on the direct translation method is easier to develop than systems based on other methods, it can have considerable difficulty selecting the appropriate target word for an ambiguous source verb. In this paper, we propose a new algorithm that extracts the meanings of substantives and makes use of the order of the extracted meanings. We selected the appropriate verb in $86.5\%$ of the sample sentences from the IPAL verb dictionary. The remaining $13.5\%$ are cases in which we could not distinguish the meaning of the substantives. We are convinced, however, that the success rate can be increased by removing verb senses that are rarely used.

A Study on the Process of the Development of Le Corbusier's Villas - Focused on the Comparison between the Villas in 1920s - (르 꼬르뷔제의 주거건축 발전과정에 대한 연구 - 1920년대 주택작품을 중심으로 -)

  • Do, Hyun-Hak
    • Journal of architectural history / v.19 no.2 / pp.133-152 / 2010
  • Analyzing the architectural language of Le Corbusier's early works, which rest on the rational thinking and firmly stated principles of 20th-century architectural theory, clarifies his architectural concepts and aids the understanding of modern architecture. Among the works of this master of modern architecture, this study targets the houses of the 1920s, Le Corbusier's 'White Period'; clarifying how his initial concepts were formed is also a key to understanding the architecture of his later golden age. I selected houses from Maison Citrohan to Villa Savoye and, by considering the development of each category of architectural elements, examined how his concept of modern architectural space developed. The analysis focuses first on the development of each element and then reviews and explains how the elements were integrated. He celebrated the advent of the machine age in which he lived for its incredible potential to reorganize human society and for the start of a new lifestyle. Through machinism, the architectural language of his time could achieve progress and a modern art that offers a new interpretation of the natural world.

An Acoustical Analysis of English Stops at the Initial and After-initial-/s/ Positions by Korean and American Speakers (한국인과 미국인의 초성 및 초성 /s/ 다음에 오는 영어 파열음 음향 분석)

  • Yang, Byunggon
    • Phonetics and Speech Sciences / v.5 no.3 / pp.11-20 / 2013
  • The purpose of this study is to compare the acoustic parameters of English stop consonants at the initial and after-initial-/s/ positions in a message produced by 47 Korean and American speakers, in order to help Korean learners improve their pronunciation of English stops. A Praat script was developed to obtain voice onset time (VOT), maximum consonant intensity (maxCi), and rate of rise (ROR) from six target words with stops at these positions in the message. Results show that VOT and maxCi differed significantly between the two language groups, whereas ROR did not. The Korean speakers generally produced the stop consonants with longer VOTs and higher consonant intensity. A comparison of consonant groups at the two positions showed that the Korean participants did not distinguish them as clearly as the American participants did at the after-initial-/s/ position. Finally, a comparison of each language and sex group revealed that the major difference was attributable to stop consonants in the after-/s/ position. The author concluded that Korean speakers should be careful not to produce all the stops with longer VOTs and higher intensity. Further studies would be desirable to examine how Americans evaluate Korean speakers' English proficiency when the acoustic values of English stops are modified.

Hybrid Approach to Sentiment Analysis based on Syntactic Analysis and Machine Learning (구문분석과 기계학습 기반 하이브리드 텍스트 논조 자동분석)

  • Hong, Mun-Pyo;Shin, Mi-Young;Park, Shin-Hye;Lee, Hyung-Min
    • Language and Information / v.14 no.2 / pp.159-181 / 2010
  • This paper presents a hybrid approach to the sentiment analysis of online texts. The sentiment of a text refers to the feelings that its author has towards a certain topic. Many existing approaches employ either a pattern-based approach or a machine learning based approach. The former shows relatively high precision in classifying sentiments but suffers from the data sparseness problem, i.e. the lack of patterns. The latter shows relatively lower precision but 100% recall. The approach presented in the current work adopts the merits of both. It combines the pattern-based approach with the machine learning based approach so that relatively high precision and high recall can both be maintained. Our experiment shows that the hybrid approach improves the F-measure score by more than 50% compared with the pattern-based approach and by around 1% compared with the machine learning based approach. The numerical improvement over the machine learning based approach may not seem very encouraging, but the current approach is promising because it can classify not only the sentiment or polarity of sentences but also additional information such as the target of the sentiment.
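The abstract does not say how the two classifiers are combined. One plausible arrangement, sketched below in Python, is to trust a pattern match when one exists and fall back to the learned classifier otherwise. The patterns, the word-weight table standing in for a trained model, and all function names are hypothetical illustrations, not the authors' resources.

```python
import re

# Hypothetical high-precision patterns; the paper's real patterns are not given.
PATTERNS = [
    (re.compile(r"\bnot (good|great|worth)\b"), "negative"),
    (re.compile(r"\b(love|excellent|highly recommend)\b"), "positive"),
    (re.compile(r"\b(terrible|awful|waste of)\b"), "negative"),
]

# Toy stand-in for a trained classifier: a table of word weights (assumed).
WEIGHTS = {"good": 1.0, "nice": 0.8, "bad": -1.0, "slow": -0.6, "fast": 0.5}


def pattern_classify(text):
    """Return a label if some pattern matches, else None (no coverage)."""
    lowered = text.lower()
    for pattern, label in PATTERNS:
        if pattern.search(lowered):
            return label
    return None


def ml_classify(text):
    """Always return a label, so every input is covered."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(WEIGHTS.get(tok, 0.0) for tok in tokens)
    return "positive" if score >= 0 else "negative"


def hybrid_classify(text):
    """Prefer the pattern-based label; fall back to the learned classifier."""
    label = pattern_classify(text)
    if label is not None:
        return label, "pattern"
    return ml_classify(text), "ml"


print(hybrid_classify("This phone is not good"))
print(hybrid_classify("The screen is nice but slow"))
```

In the first call the negation pattern overrides the bag-of-words score, which by itself would count "good" as positive; in the second no pattern applies, so the fallback supplies the label.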

Understanding recurrent neural network for texts using English-Korean corpora

  • Lee, Hagyeong;Song, Jongwoo
    • Communications for Statistical Applications and Methods / v.27 no.3 / pp.313-326 / 2020
  • Deep Learning is the most important key to the development of Artificial Intelligence (AI). There are several distinguishable architectures of neural networks, such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in handling sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize the fundamental structures of recurrent networks and some topics in representing natural-language words as reasonable numeric vectors. We organize the topics needed to understand the estimation procedure, from representing input source sequences to predicting target translated sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras to English-Korean sentences comprising about 26,000 sequence pairs in total from two different corpora, colloquialism and news. We verified some crucial factors that influence the quality of training. We found that loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, which are the main measures of translation performance, and compared them with the score from Google Translate on the same test sentences. We summarize some difficulties in training a proper translation model and in dealing with the Korean language. The use of Keras in Python for all tasks, from processing raw texts to evaluating the translation model, also allows us to include some useful functions and vocabulary libraries.

Implementation of Nondeterministic Compiler Using Monad (모나드를 이용한 비결정적 컴파일러 구현)

  • Byun, Sugwoo
    • Journal of the Korea Society of Computer and Information / v.19 no.2 / pp.151-159 / 2014
  • We discuss the implementation of a compiler for an imperative programming language using monads in Haskell. The compiler includes a recursive-descent parser that performs nondeterministic parsing, in which backtracking tries other rules when the application of a production rule fails to parse an input string. Haskell has strong facilities for parsing. Its algebraic types represent abstract syntax trees naturally, and program code written with monadic parsing is so concise that it is highly readable and code size is reduced significantly compared with other languages. We also deal with the runtime environment of the assembler and with code generation, whose target is the Stack-Assembly language based on a stack machine.
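The paper's compiler is written in Haskell, and its code is not reproduced here. As a rough illustration of nondeterministic, backtracking parsing in the monadic style, the Python sketch below models a parser as a function from an input string to a list of (value, remaining input) pairs; an empty list means failure. The names `unit`, `bind`, `alt`, and the toy grammar are assumptions chosen for this sketch.

```python
def unit(value):
    """Parser that succeeds once without consuming input (monadic return)."""
    return lambda s: [(value, s)]


def bind(parser, func):
    """Run parser, then feed each result to func to obtain the next parser."""
    return lambda s: [pair for (v, rest) in parser(s) for pair in func(v)(rest)]


def alt(p, q):
    """Nondeterministic choice: keep the results of both alternatives."""
    return lambda s: p(s) + q(s)


def char(c):
    """Parser that consumes exactly the character c."""
    return lambda s: [(c, s[1:])] if s[:1] == c else []


def literal(text):
    """Parser that consumes the given string, built from char and bind."""
    if not text:
        return unit("")
    return bind(char(text[0]),
                lambda c: bind(literal(text[1:]), lambda cs: unit(c + cs)))


# Toy grammar: stmt ::= keyword "x", keyword ::= "if" | "iff"
keyword = alt(literal("if"), literal("iff"))
stmt = bind(keyword, lambda k: bind(char("x"), lambda x: unit((k, x))))


def parse(parser, s):
    """Return only the results that consumed the whole input."""
    return [v for (v, rest) in parser(s) if rest == ""]


print(parse(stmt, "iffx"))
```

On the input `iffx`, the alternative `if` matches first but leaves `fx`, where `x` cannot be parsed; that branch yields no results, and the `iff` alternative supplies the only complete parse. Backtracking needs no explicit control flow because failed branches simply contribute empty lists.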

Extension and Management of Verb Phrase Patterns based on Lexicon Reconstruction and Target Word Information (사전 재구성과 대역어 정보를 통한 동사구 패턴의 확장 및 관리)

  • Hong, Mun-Pyo;Kim, Young-Kil;Ryu, Chul;Choi, Sung-Kwon;Park, Sang-Kyu
    • Annual Conference on Human and Language Technology / 2002.10e / pp.103-107 / 2002
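  • The success of data-driven machine translation can be said to depend on methods for building large amounts of data in a short time and on methods for managing the resulting data effectively. In example-based and pattern-based machine translation, the representative data-driven methodologies, research has concentrated on building data with minimal or no training, whereas the problem of data management has received little attention. However, efficient data management is just as important as data extension in developing data-driven machine translation systems. This paper shows that reconstructing the lexicon using causative/passive links and similar devices improves the consistency and manageability of the data and, from a theoretical standpoint, reduces redundancy in the description of information. Based on this information, we also present a methodology for generating new patterns from existing verb phrase patterns using target word information.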

Character-Level Neural Machine Translation (문자 단위의 Neural Machine Translation)

  • Lee, Changki;Kim, Junseok;Lee, Hyoung-Gyu;Lee, Jaesong
    • Annual Conference on Human and Language Technology / 2015.10a / pp.115-118 / 2015
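  • The Neural Machine Translation (NMT) model is an end-to-end machine translation model that uses only a single neural network. Compared with conventional Statistical Machine Translation (SMT) models, it achieves higher performance, requires no feature engineering, and has a simple decoder structure because a single neural network performs the roles of both the translation model and the language model. However, NMT training and decoding slow down in proportion to the size of the target vocabulary, so the size of the target vocabulary is limited. To address this limitation, this paper proposes reading the source language in word units (encoding) and generating the target language in character units (decoding). Generating the target language character by character allows the target vocabulary to contain all characters, which eliminates the out-of-vocabulary (OOV) problem on the target side and reduces the target vocabulary size, speeding up training and decoding. In experiments, the proposed method outperformed conventional word-level NMT models on English-Japanese and Korean-Japanese machine translation.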
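The OOV argument can be made concrete with a small Python sketch that builds a word-level and a character-level vocabulary from the same target-side text. The three training sentences and the helper names are made up for illustration; the corpus is far too small to show the reduction in vocabulary size that the abstract reports for real data.

```python
def build_vocab(sequences):
    """Map every symbol seen in the training sequences to an integer id."""
    symbols = sorted({sym for seq in sequences for sym in seq})
    return {sym: i for i, sym in enumerate(symbols)}


def oov(symbols, vocab):
    """Return the symbols that the vocabulary cannot represent."""
    return [sym for sym in symbols if sym not in vocab]


# Hypothetical target-side training sentences.
train = ["the cat sat", "the dog sat", "a cat ran"]

word_vocab = build_vocab(s.split() for s in train)  # word-level output units
char_vocab = build_vocab(list(s) for s in train)    # character-level output units

# "rat" never occurs as a word in training, but its characters all do.
unseen = "the rat sat"
print(oov(unseen.split(), word_vocab))
print(oov(list(unseen), char_vocab))
```

A word-level decoder has no output unit for `rat`, while a character-level decoder can spell it from units it already has.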

A Study on the IDL Compiler using the Marshal Buffer Management

  • Kim, Dong-Hyun
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.1 / pp.843-847 / 2005
  • Developing distributed applications in the standardized CORBA (Common Object Request Broker Architecture) environment reduces the development time and maintenance cost of systems. Because of these advantages, applications are being developed with CORBA in several fields. Programmers in CORBA environments usually develop application programs using the CORBA IDL (Interface Definition Language). IDL files are compiled by an IDL compiler and translated into stub and skeleton code mapped onto a particular target language. The stubs produced by IDL compilers marshal data into a message buffer. Before a stub can marshal a datum into its message buffer, it must ensure that the buffer has enough free space to hold the encoded representation of the datum. However, the stubs produced by typical IDL compilers check the amount of free buffer space before every atomic datum is marshaled and, if necessary, expand the message buffer. These repeated tests are wasteful and incur overhead, especially if the marshal buffer must be expanded continually. Thus, the performance of the application program may be poor. In this paper, we suggest a method by which the stub code secures enough free space before marshaling data into the message buffer. The method analyzes the overall storage requirements of every message that will be exchanged between client and server. For this analysis, the front end of the compiler maintains information on the storage requirements and alignment constraints of the data types. As a result, the stub code is optimized and the performance of the application program improves.
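The difference between per-datum checks and a single up-front check can be sketched in Python with the standard `struct` module. This is only an illustration of the idea, not the paper's generated stub code: the message layout, the `Buffer` class, and the growth policy are assumptions, and the `<` format prefix disables padding, so the alignment constraints mentioned in the abstract are ignored here.

```python
import struct

# Hypothetical message layout: (struct format, value) pairs for one request.
FIELDS = [("<i", 7), ("<d", 3.5), ("<h", 2), ("<i", 42)]


class Buffer:
    """Growable message buffer that counts its free-space tests."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.used = 0
        self.checks = 0

    def ensure(self, needed):
        """Test free space once and grow the buffer if it is too small."""
        self.checks += 1
        if len(self.data) - self.used < needed:
            self.data.extend(bytes(max(needed, len(self.data))))

    def put(self, fmt, value):
        """Write one value at the current offset without any space test."""
        struct.pack_into(fmt, self.data, self.used, value)
        self.used += struct.calcsize(fmt)


def marshal_naive(fields):
    """Check (and possibly grow) before every atomic datum."""
    buf = Buffer(4)
    for fmt, value in fields:
        buf.ensure(struct.calcsize(fmt))
        buf.put(fmt, value)
    return buf


def marshal_presized(fields):
    """Compute the total message size first, then check only once."""
    total = sum(struct.calcsize(fmt) for fmt, _ in fields)
    buf = Buffer(4)
    buf.ensure(total)
    for fmt, value in fields:
        buf.put(fmt, value)
    return buf


naive = marshal_naive(FIELDS)
presized = marshal_presized(FIELDS)
print(naive.checks, presized.checks)
```

Both functions produce the same bytes; the presized variant performs one space test instead of one per field. In a compiler, the total would be computed at stub-generation time from the IDL types rather than at run time as in this sketch.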

Automated Testing Techniques for Automotive Software Components with TTCN-3 (TTCN-3을 이용한 차량 소프트웨어 컴포넌트의 테스팅 자동화 방법)

  • Kum, Dae-Hyun;Lee, Seong-Hun;Park, Gwang-Min;Cho, Jeong-Hun
    • Journal of KIISE:Computing Practices and Letters / v.16 no.5 / pp.541-545 / 2010
  • AUTOSAR, a standard software platform for automotive systems, has been developed to manage software complexity and improve software reusability. However, reusing a test system is difficult because it depends on the implementation language and the test phase. In this paper, we propose a method for generating a test system for AUTOSAR software components using TTCN-3, a standardized testing language. The TTCN-3 test system is generated automatically from AUTOSAR XML containing software design information. The test system consists of a TTCN-3 tester and a target system, and it tests the functionality and worst-case response time of the software in a simulation environment. With the proposed testing techniques, we can reduce the time and effort needed to build the test system and reuse the testing environment.