• Title/Summary/Keyword: Model Translation

A Survey of Machine Translation and Parts of Speech Tagging for Indian Languages

  • Khedkar, Vijayshri;Shah, Pritesh
    • International Journal of Computer Science & Network Security / v.22 no.4 / pp.245-253 / 2022
  • Commenced in 1954 by IBM, machine translation has expanded immensely, particularly in recent years. Machine translation can be broken into seven main steps: token generation, morphological analysis, lexeme analysis, part-of-speech tagging, chunking, parsing, and word disambiguation. Morphological analysis plays a major role in translating Indian languages, because it is needed to develop accurate part-of-speech taggers and word sense handling. The paper presents the machine translation methods that different researchers have used for Indian languages, along with their performance and drawbacks. It then concentrates on part-of-speech (POS) tagging in Marathi using methods such as rule-based tagging, unigram tagging, bigram tagging, and more. The study concludes that part-of-speech tagging is a major step in machine translation. For the Marathi language, the Hidden Markov Model gives the best results for part-of-speech tagging, with an accuracy of 93%, which can be improved further depending on the dataset.
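
The Hidden Markov Model tagging mentioned in the abstract above can be illustrated with a minimal Viterbi decoder. This is only a sketch of the general technique, not the paper's implementation: the tag set, the English toy words, and every probability below are hypothetical values chosen for illustration, and unseen events are given a small floor probability instead of proper smoothing.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability; unseen events get a small floor instead of log(0).
        return math.log(p if p > 0 else floor)

    # best[i][t] = (log-prob of the best path ending in tag t at word i, previous tag)
    best = [{t: (lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)), None)
             for t in tags}]
    for w in words[1:]:
        column = {}
        for t in tags:
            score, prev = max(
                (best[-1][p][0] + lp(trans_p[p].get(t, 0)), p) for p in tags
            )
            column[t] = (score + lp(emit_p[t].get(w, 0)), prev)
        best.append(column)

    # Backtrack from the best final tag to recover the full path.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

# Hypothetical toy model (not taken from the paper).
TAGS = ["NOUN", "VERB"]
START_P = {"NOUN": 0.8, "VERB": 0.2}
TRANS_P = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.8, "VERB": 0.2}}
EMIT_P = {"NOUN": {"dogs": 0.6, "cats": 0.3, "bark": 0.1},
          "VERB": {"bark": 0.7, "chase": 0.3}}

print(viterbi(["dogs", "bark"], TAGS, START_P, TRANS_P, EMIT_P))
```

In practice the start, transition, and emission tables would be estimated from a tagged corpus; the decoder itself stays the same.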

A Bidirectional Korean-Japanese Statistical Machine Translation System by Using MOSES (MOSES를 이용한 한/일 양방향 통계기반 자동 번역 시스템)

  • Lee, Kong-Joo;Lee, Song-Wook;Kim, Jee-Eun
    • Journal of Advanced Marine Engineering and Technology / v.36 no.5 / pp.683-693 / 2012
  • Recently, statistical machine translation (SMT) has received much attention because it is easy to implement and maintain. The goal of our work is to build a bidirectional Korean-Japanese SMT system using the MOSES [1] system. We use a sentence-aligned Korean-Japanese bilingual corpus to train the translation model, and a large raw corpus in each language to train the corresponding language model. The proposed system shows results comparable to those of a rule-based machine translation system. Most errors are caused by noise introduced at each processing stage.
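
The division of labor between the two models in the abstract above (a translation model trained on the bilingual corpus, a language model trained on monolingual text) follows the classical noisy-channel view of SMT. The formula below is that textbook formulation, given here as background rather than as the paper's own notation; f denotes the source sentence and e a candidate target sentence. Moses itself generalizes this product to a log-linear combination of weighted feature functions.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f)
        \;=\; \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}} \;
                           \underbrace{P(e)}_{\text{language model}}
```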

Understanding recurrent neural network for texts using English-Korean corpora

  • Lee, Hagyeong;Song, Jongwoo
    • Communications for Statistical Applications and Methods / v.27 no.3 / pp.313-326 / 2020
  • Deep learning is the most important key to the development of artificial intelligence (AI). There are several distinct neural network architectures, such as MLP, CNN, and RNN. Among them, we try to understand one of the main architectures, the Recurrent Neural Network (RNN), which differs from other networks in how it handles sequential data, including time series and texts. As one of the main recent tasks in Natural Language Processing (NLP), we consider Neural Machine Translation (NMT) using RNNs. We also summarize the fundamental structures of recurrent networks and some topics in representing natural-language words as numeric vectors. We organize the topics to follow the estimation procedure from representing input source sequences to predicting translated target sequences. In addition, we apply multiple translation models with Gated Recurrent Units (GRUs) in Keras to English-Korean sentences comprising about 26,000 sentence pairs in total from two different corpora, colloquial text and news. We verified some crucial factors that influence the quality of training. We found that the loss decreases with more recurrent dimensions and with a bidirectional RNN in the encoder when dealing with short sequences. We also computed BLEU scores, the main measure of translation performance, and compared them with the scores of Google Translate on the same test sentences. We summarize some difficulties in training a proper translation model and in dealing with the Korean language. Using Keras in Python for all tasks, from processing raw texts to evaluating the translation model, also allows us to include some useful functions and vocabulary libraries.
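
The BLEU score used for evaluation in the abstract above can be sketched in a few lines. The version below is a simplified, assumption-laden illustration: one reference per sentence, uniform weights over 1- to 4-grams, whitespace tokenization, and no smoothing, so any missing n-gram order yields 0. It is not the scoring script used in the paper.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Single-reference sentence BLEU with uniform weights and no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)

candidate = "the cat sat on the mat".split()
reference = "the cat sat on a mat".split()
print(round(bleu(candidate, reference), 3))
```

Corpus-level BLEU, as normally reported, aggregates the clipped counts over all test sentences before taking the geometric mean, so sentence-level scores like this one are best treated as a rough diagnostic.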

A Defocus Technique based Depth from Lens Translation using Sequential SVD Factorization

  • Kim, Jong-Il;Ahn, Hyun-Sik;Jeong, Gu-Min;Kim, Do-Hyun
    • 제어로봇시스템학회:학술대회논문집 / 2005.06a / pp.383-388 / 2005
  • Depth recovery in robot vision is the essential problem of inferring the three-dimensional geometry of a scene from a sequence of two-dimensional images. Many approaches to depth estimation have been proposed, based on cues such as stereopsis, motion parallax, and blurring phenomena. Among these cues, depth from lens translation is based on shape from motion using feature points. The approach relies on the correspondence of feature points detected in the images and estimates depth from information on the motion of those feature points. Approaches that use motion vectors suffer from occlusion or missing-part problems, and they ignore image blur in feature point detection. This paper presents a novel approach: defocus-based depth from lens translation using sequential SVD factorization. Solving these problems requires modeling the relationship between light and optics up to the image plane. We therefore first discuss the optical properties of a camera system, because image blur varies with the camera parameter settings. The camera system is described by a model that integrates a thin-lens camera model, which explains the light and optical properties, with a perspective projection camera model, which explains depth from lens translation. Depth from lens translation is then formulated to use feature points detected at the edges of the image blur. These feature points carry depth information derived from the width of the blur. Shape and motion can be estimated from the motion of the feature points. The method uses sequential SVD factorization to obtain the orthogonal matrices of the singular value decomposition. Experiments were performed on sequences of real and synthetic images, comparing the presented method with conventional depth from lens translation. The experimental results demonstrate the validity of the proposed method and its applicability to depth estimation.

A Study on multi-translation system for e-business collaboration (e-비즈니스 협업에 적합한 다중변환 시스템 연구)

  • Ahn, Kyeong-Rim;Chung, Jin-Wook
    • Journal of Internet Computing and Services / v.7 no.6 / pp.123-130 / 2006
  • In the early stage of e-business, transactions took place within a single business entity or a single marketplace. They have since grown into more complex forms. In particular, the need for business collaboration between business entities or marketplaces has emerged as a core topic. Because the exchanged documents come in various formats, format translation between documents is a very important factor. In this paper, we define ebXML as the basic format of exchanged documents for object-oriented business transactions. We also design a multi-format translation system to support the translation of various document formats. The proposed system is designed with a model-driven method and can be built in various structures depending on the system environment. It can handle any document format by adding the corresponding parsing module. We also increase the reusability of data by using a common data set. Finally, we demonstrate the superiority of the proposed system by comparing its performance with that of the legacy system for various format translations.

The Applicability Assesment of the Short-term Rainfall Forecasting Using Translation Model (이류모델을 활용한 초단시간 강우예측의 적용성 평가)

  • Yoon, Seong-Sim;Bae, Deg-Hyo
    • Journal of Korea Water Resources Association / v.43 no.8 / pp.695-707 / 2010
  • The frequency and size of typhoons and local severe rainfall are increasing due to climate change, and the damage they cause is increasing as well. A flood forecasting and warning system that aims to reduce this damage needs rainfall forecasts based on radar data and a short-term rainfall forecasting model. For this reason, this study examined the applicability of short-term rainfall forecasting using a translation model with weather radar data, in order to assess its usefulness for flood forecasting in Korea. Radar rainfall was estimated with a least-squares fitting method, and the estimated rainfall was used as the initial field of the translation model. The accuracy of the translation model was verified by comparing forecasted radar rainfall with observed rainfall, both quantitatively and qualitatively. Most case studies showed an accuracy above 0.6 within a 4-hour lead time and a mean correlation coefficient above 0.5 within a 1-hour lead time at the Kwanak and Jindo radar sites. Forecast accuracy decreased as the lead time increased. The calculated Mean Area Precipitation (MAP) showed that forecast rainfall tends to be underestimated relative to observed rainfall, although the correlation coefficient is above 0.5. The translation model can therefore predict rainfall with reasonable accuracy. The results indicate that the translation model is applicable in Korea only for rainfall forecasts with lead times within 2 hours.
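
The core idea of a translation (advection) model, moving the current radar rainfall field along a motion vector to obtain a forecast and then checking it against observations, can be sketched as follows. This is a deliberately simplified illustration and not the model used in the study: it assumes a single constant displacement of whole grid cells per time step, no growth or decay of rainfall intensity, and a hypothetical 3×3 grid. The function names `advect` and `correlation` are made up for this sketch.

```python
import math

def advect(field, dx, dy, steps=1, fill=0.0):
    """Shift a 2-D rainfall grid by (dx, dy) cells per step.

    Cells that enter from outside the grid are set to `fill`.
    """
    rows, cols = len(field), len(field[0])
    shift_x, shift_y = dx * steps, dy * steps
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + shift_y, c + shift_x
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = field[r][c]
    return out

def correlation(forecast, observed):
    """Pearson correlation between two grids (undefined for constant fields)."""
    x = [v for row in forecast for v in row]
    y = [v for row in observed for v in row]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical radar field (mm/h) with one rain cell moving one grid cell east per step.
radar_now = [[0, 0, 0],
             [5, 0, 0],
             [0, 0, 0]]
forecast = advect(radar_now, dx=1, dy=0)
observed_later = [[0, 0, 0],
                  [0, 5, 0],
                  [0, 0, 0]]
print(forecast, correlation(forecast, observed_later))
```

Because the sketch carries intensities forward unchanged, it can only degrade as real rainfall grows, decays, or changes direction, which is consistent with the abstract's finding that accuracy falls as lead time increases.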

A Principle-based Korean / Japanese Machine Translation System : NARA (원리에 따른 한 / 일 기계번역 시스팀 : NARA)

  • Jeong, Hui-Seong
    • ETRI Journal / v.10 no.3 / pp.140-156 / 1988
  • This paper presents methodological and theoretical principles for constructing a machine translation system between Korean and Japanese. We focus our discussion on the real-time computing problem of the machine translation system. This problem is characterized by the time and space complexity of machine translation. The NARA system has a real-time computing algorithm based on a mathematical model that integrates the linguistic competence and the linguistic performance of both languages, with the consequence that NARA also has a two-way translation mechanism as a functional characteristic.

A Web-based Translation Service with Collective Intelligence (집단지성 웹기반 번역서비스)

  • Lee, Soong-Hee
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.12 / pp.2997-3004 / 2014
  • Legacy online translation services restrict participation to clients and translators, excluding general users, while automatic translation services do not guarantee accurate or complete results. This paper proposes a web-based translation service with a business model that lets general users, as well as translators and clients, modify and evaluate the interim content of a translation, ultimately leading to collective intelligence.

The Method of Color Image Processing Using Adaptive Saturation Enhancement Algorithm (적응형 채도 향상 알고리즘을 이용한 컬러 영상 처리 기법)

  • Yang, Kyoung-Ok;Yun, Jong-Ho;Cho, Hwa-Hyun;Choi, Myung-Ryul
    • The KIPS Transactions:PartB / v.14B no.3 s.113 / pp.145-152 / 2007
  • In this paper, we propose an automatic extraction model for unknown translations and implement an unknown translation extraction system using the proposed model. The proposed model is a phrase-alignment model that combines three models: a phrase-boundary model, a language model, and a translation model. The system built on it consists of three parts: construction of parallel corpora, alignment of Korean and English words, and extraction of unknown translations. To evaluate the performance of the proposed system, we established a reference corpus for extracting unknown translations, which comprises 2,220 parallel sentences including about 1,500 unknown translations. Through several experiments, we observed that the proposed model is very useful for extracting unknown translations. Future work should address objective evaluation, the construction of high-quality parallel corpora, and further improvement of unknown translation extraction performance.

Question Classification Based on Word Association for Question and Answer Archives (질문대답 아카이브에서 어휘 연관성을 이용한 질문 분류)

  • Jin, Xueying;Lee, Kyung-Soon
    • The KIPS Transactions:PartB / v.17B no.4 / pp.327-332 / 2010
  • Word mismatch is the most significant cause of low performance in question classification, where questions consist of only two or three words and are expressed in many different ways. It is therefore necessary to apply word association in question classification. In this paper, we propose a question classification method using a translation-based language model, which uses word translation probabilities learned from question-question pairs in the same category. In the experiments, we show that translation probabilities learned from question-question pairs in the same category are more effective than those learned from question-answer pairs in the whole collection.
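
A translation-based language model of the kind described in the abstract above scores a query word not only by exact matches but also by words that "translate" into it, which is how it bridges word mismatch. The sketch below shows one common form of this scoring, with a collection-level background term for smoothing. The function name, the interpolation weight `lam`, the toy questions, and all translation probabilities are hypothetical; the paper's exact formulation and training procedure may differ.

```python
from collections import Counter

def translation_lm_score(query, doc, trans_prob, collection, lam=0.2):
    """Score `query` against `doc` with a translation-based language model.

    trans_prob[(w, t)] is the assumed probability that document word t
    "translates" into query word w. `lam` mixes in a background model
    estimated from the whole collection.
    """
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    score = 1.0
    for w in query:
        # Sum over document words t of T(w | t) * P_ml(t | doc).
        translated = sum(
            trans_prob.get((w, t), 0.0) * n / len(doc)
            for t, n in doc_counts.items()
        )
        background = coll_counts[w] / len(collection)
        score *= (1 - lam) * translated + lam * background
    return score

# Hypothetical translation table and archive (not from the paper).
TRANS = {
    ("laptop", "notebook"): 0.6, ("notebook", "notebook"): 0.4,
    ("price", "cheap"): 0.3, ("cheap", "cheap"): 0.7,
}
electronics_q = ["notebook", "cheap"]
weather_q = ["weather", "today"]
archive = electronics_q + weather_q + ["laptop", "price"]

query = ["laptop", "price"]
print(translation_lm_score(query, electronics_q, TRANS, archive))
print(translation_lm_score(query, weather_q, TRANS, archive))
```

Although the query shares no words with the electronics question, it scores higher there than against the weather question, because "notebook" and "cheap" carry translation probability toward "laptop" and "price". A classifier along these lines would assign the query to the category of its best-scoring archived questions.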