• Title/Abstract/Keywords: text translation

147 search results (processing time: 0.023 seconds)

Short Text Classification for Job Placement Chatbot by T-EBOW

  • 김정래;김한준;정경희
    • 인터넷정보학회논문지 / Vol. 20, No. 2 / pp.93-100 / 2019
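  • Recently, companies in many business fields have been focusing on chatbot services for diverse environments by adding artificial intelligence to existing messenger platforms. Organizations in the job placement field also need chatbot services to improve the quality of job counseling and to ease the shortage of counseling staff. A typical text-based chatbot classifies the user's input sentence into one of its learned sentences and returns a suitable answer. With the recent growth of social network services, user sentences entered into chatbots tend to be short texts. Improving short text classification can therefore contribute to improving chatbot services. To strengthen short text classification for a job placement chatbot, this study proposes T-EBOW (Translation-Extended Bag Of Words), a method that uses translated-text information in addition to the concept information used in existing research. Short text classification with T-EBOW applied to machine learning classification models performed better in evaluation than existing methods.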
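
The abstract above does not give implementation details, so the following is only a minimal sketch of the general idea of extending a bag of words with concept and translation features. The lookup tables `CONCEPTS` and `TRANSLATIONS`, the function name `extended_bow`, and the `concept:`/`trans:` prefixes are all hypothetical stand-ins chosen for illustration; they are not the authors' resources or method.

```python
from collections import Counter

# Hypothetical lookup tables standing in for a concept resource and a
# machine translator. Neither comes from the paper.
CONCEPTS = {"resume": ["document"], "salary": ["pay"]}
TRANSLATIONS = {"resume": ["이력서"], "salary": ["급여"]}


def extended_bow(sentence, concepts=CONCEPTS, translations=TRANSLATIONS):
    """Return a bag of words extended with concept and translation tokens."""
    tokens = sentence.lower().split()
    bag = Counter(tokens)  # plain bag of words for the short text
    for tok in tokens:
        # Prefixes keep added features distinct from surface tokens.
        bag.update(f"concept:{c}" for c in concepts.get(tok, []))
        bag.update(f"trans:{t}" for t in translations.get(tok, []))
    return bag


if __name__ == "__main__":
    print(extended_bow("resume help"))
```

The resulting counter could then be vectorized and passed to any standard classifier; which classifier and which resources the paper actually used is not stated in the abstract.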

Current Status of Translation Research on Korean Medical Classics - Focusing on Analysis by Era and Field -

  • 김상현
    • 대한한의학원전학회지 / Vol. 35, No. 3 / pp.1-20 / 2022
  • Objectives: Translations of Korean Medical Classical texts were analyzed quantitatively to verify their trend. Based on the findings, accumulated problems and their solutions were discussed. Methods: A list of translated Classical texts in the field of Korean Medicine from the National Central Library collection was organized. Afterwards, the publication date, field, author information and content of each translated version were analyzed. Results: Of Chinese Medical texts, those from the Ming and Qing periods were most translated, while major texts pre-dating the Song period were left out. In addition, while texts in the fields of Shanghan-Jingui, comprehensive medical texts, scriptures, and medical theories, which were in high demand in the educational and clinical sectors, were actively translated, those in secondary fields were insufficiently translated. Of the medical texts of Korea, those from the Joseon period were mostly translated, including major texts such as the Donguibogam and various kinds of texts reflecting research demands. Conclusions: In the future, texts that have not been translated need to be prioritized, and basic elements need to be identified for better-quality translation. To enable quantitative and qualitative expansion of the translation of Korean Medical Classical texts, institutional and academic support is crucial.

E-book to sign-language translation program based on morpheme analysis

  • 한솔이;김세아;황경호
    • 한국정보통신학회논문지 / Vol. 21, No. 2 / pp.461-467 / 2017
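  • With the development of the Internet and the spread of smart devices, demand for e-books is growing. However, hearing-impaired people, who have difficulty learning language accurately because of hearing loss, find it hard to use e-book services that consist only of text. In this paper, we design and implement an Android-based application that reads the sentences of an e-book and presents them as sign-language video. To translate the Korean sentences of an e-book into sign language, we use an algorithm based on morpheme analysis. The proposed algorithm consists of three steps: (1) removing sentence elements for sign-language expression, (2) converting sign-language expressions and expressing tense, and (3) changing honorific terms for sign language and moving their positions. We also present a method for evaluating sign-language translation quality and confirm the quality of the translations produced by the proposed algorithm on 100 reference sentences.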

Assessment of performance of machine learning based similarities calculated for different English translations of Holy Quran

  • Al Ghamdi, Norah Mohammad;Khan, Muhammad Badruddin
    • International Journal of Computer Science & Network Security / Vol. 22, No. 4 / pp.111-118 / 2022
  • This research article presents work on applying different machine learning based similarity techniques to religious text in order to identify similarities and differences among its various translations. The dataset includes 10 different English translations of the verses (Arabic: Ayah) of two Surahs (chapters), namely Al-Humazah and An-Nasr. Quantitative similarity values for different translations of the same verse were calculated using cosine similarity and semantic similarity. The corpus went through two series of experiments: before pre-processing and after pre-processing. To determine the performance of the machine learning based similarities, human-annotated similarities between translations of the two Surahs were recorded to construct the ground truth. The average difference between the human-annotated similarity and the cosine similarity for Surah Al-Humazah was found to be 1.38 per verse (Ayah) per pair of translations. After pre-processing, the average difference increased to 2.24. Moreover, the average difference between the human-annotated similarity and the semantic similarity for Surah Al-Humazah was found to be 0.09 per verse (Ayah) per pair of translations. After pre-processing, it increased to 0.78. For Surah An-Nasr, before pre-processing, the average difference between the human-annotated similarity and the cosine similarity was found to be 1.93 per verse (Ayah) per pair of translations. After pre-processing, the average difference further increased to 2.47. The average difference between the human-annotated similarity and the semantic similarity for Surah An-Nasr was found to be 0.93 before pre-processing and was reduced to 0.87 per verse (Ayah) per pair of translations after pre-processing. The results showed that, as expected, semantic similarity proved to be a better measurement indicator for calculating word meaning.
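
As a rough illustration of the cosine similarity measure mentioned in the abstract above, the following sketch compares two sentences using raw term-count vectors. The tokenizer, the function names, and the sample sentences are assumptions made for this example; the abstract does not say how the authors vectorized the text or scaled their scores, and the sample strings are placeholders rather than actual translations from the dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two strings."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Placeholder sentences, not taken from any real translation.
    first = "when the help arrives and the victory"
    second = "when help comes and victory"
    print(round(cosine_similarity(first, second), 3))
```

Because this variant only counts shared surface words, two translations that use different wording for the same meaning score low, which is consistent with the abstract's observation that semantic similarity tracked human judgments more closely.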

Comparative Reading of Korean and Chinese Novels as Cultural Education for Chinese Learners - Focusing on 4.19 and the Cultural Revolution -

  • 전영의;엄영욱
    • 중국학논총 / No. 62 / pp.85-100 / 2019
  • The purpose of this article is to examine reading translated Chinese novels as integrated Korean language education for Chinese students. The number of foreign students at Korean universities has increased rapidly owing to Korea's economic growth, the influence of the Korean Wave, and the popularity of Korean popular culture such as K-pop, but problems with their curricula have been found in many places. Korean literary education through novels holds an important place in Korean studies, yet literary education is often excluded from Korean language education as a foreign language. Chinese students already have background knowledge of novels translated into Korean because they know the Chinese originals. First, they can therefore gain a language-learning effect. Second, they can compare state violence in Korea and in China through the 'Red Revolution' and come to understand how each country perceives its era and its social and cultural phenomena. Third, they can study literary theory itself, which is also an educational goal pursued by the humanities. Chinese students develop their Korean language skills by studying Brothers in Korean translation, and the similarities and differences in state violence can be seen by comparing Korea's '4.19' with China's 'Cultural Revolution'. Comparing the characters, the background, and the dynamics of the spaces in which they are located can raise awareness of the historical and social problems of both countries. It is also possible to study, within the specific spaces where state violence occurs, the subjects' memories of space, changes in local meaning, and the formation of urban or individual space in the text. In this way, integrated Korean language education through the translated Chinese novel Brothers offers an opportunity to look at state violence in the Korean and Chinese spaces of the 1960s and 1970s. It offers a subject-centered perspective that moves away from subordination to the nationality of the modern nation-state. This is the educational effect that can be obtained by reading a translated Chinese novel as integrated Korean language education.

Text Extraction In WWW Images

  • 김상현;심재창;김중수
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2000년도 하계종합학술대회 논문집(4) / pp.15-18 / 2000
  • In this paper, we propose a method for extracting text from Web images. Our approach is based on contrast detection and pixel component ratio analysis at the mouse position. The extracted data, combined with OCR, can be used for real-time dictionary lookup or language translation applications in a Web browser.

Textual Context and Chinese-Korean Translation

  • 박은숙
    • 중국학논총 / No. 70 / pp.61-86 / 2021
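  • This paper discusses the role of context in Chinese-Korean text translation. The author applies Halliday's three-way classification of context to the practice of Chinese-Korean text translation. In summary, co-text (also called linguistic context) refers to the content before and after a word, a phrase, or even a longer text. Context of situation is the register variable, which can be divided into three types: field, tenor, and mode. Finally, context of culture refers to the social, cultural, economic, religious, and political background involved in a text. Dividing context into co-text, context of situation, and context of culture, the author explores in depth the problem of context in Chinese-Korean translation. The author further divides context of culture into two parts: the influence and constraints of culture-specific words and words with cultural connotations, and translation strategies for culture-specific words. On this basis, the paper discusses the difficulties of translating culture-specific words and the techniques for doing so. The contextual analysis shows that, in Chinese-Korean translation practice, using contextual factors can eliminate ambiguity, and that the specific situational meaning in the context helps reconstruct in the translation the meanings expressed in the original through grammar, pragmatics, and style. Finally, relying on the context of culture during translation makes it possible to determine the meaning carried by culture-specific words in the original.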

Establishment of database on 「Euibangyoochui 醫方類聚」

  • 안상우;신순식;이재원
    • 한국한의학연구원논문집 / Vol. 4, No. 1 (Serial No. 4) / pp.27-45 / 1998
  • 「Euibangyoochui 醫方類聚」 (1445) is regarded as a treasure-house of the knowledge of traditional oriental medicine, containing over 50,000 prescriptions and an enormous amount of medical information. Despite its importance and the information it contains, the book has rarely been used because it was not convenient to use. In this study, therefore, a database of 「Euibangyoochui」 was established. Before establishing the database, basic work such as correction, interpretation, proofreading and translation of the original text had to be done. The results obtained in this study are summarized as follows: 1) In the course of studying the original text of 「Euibangyoochui」, the editing process and transmission of medical books in the early Chosun dynasty were figured out. 2) For better correction, interpretation, proofreading and translation of 「Euibangyoochui」, microfilms of 「Euibangyoochui」 held in the collection of the Japanese Royal Library (宮內廳 圖書寮) were obtained in this study. Through this process, the errors in the republication could be corrected. 3) Analyzing the organization and compilation method of 「Euibangyoochui」 is one of the basic requirements for understanding the scale of the whole book and, consequently, for establishing the database. The analysis results were therefore used for the basic structuring of the database. 4) The 「Euibangyoochui」 CD-ROM was designed so that the microfilm images, the original text and the Korean translation can be compared by a 3-D device. In addition, the convenience and proficiency of imaging the information and prescriptions of the text is one of the remarkable features of this CD-ROM.

An Automatic Tagging System and Environments for Construction of Korean Text Database

  • Lee, Woon-Jae;Choi, Key-Sun;Lim, Yun-Ja;Lee, Yong-Ju;Kwon, Oh-Woog;Kim, Hiong-Geun;Park, Young-Chan
    • 한국음향학회:학술대회논문집 / 한국음향학회 1994년도 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA / pp.1082-1087 / 1994
  • A set of text databases is indispensable to probabilistic models for speech recognition, linguistic modeling, and machine translation. We introduce an environment for constructing text databases: an automatic tagging system and a set of tools for lexical knowledge acquisition, which provide facilities for automatic part-of-speech recognition and guessing.

Deborah Smith's Rewriting of Ch'aesikjuuija: Thoughts from a Translation Perspective

  • 신혜정
    • 한국콘텐츠학회논문지 / Vol. 17, No. 10 / pp.657-666 / 2017
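  • This paper examines the divided assessments by the press and academia, and the mistranslation controversy, surrounding The Vegetarian, which won the 2016 Man Booker International Prize. An analysis of Han Kang's Ch'aesikjuuija and Deborah Smith's The Vegetarian identified numerous cases of mistranslation in The Vegetarian. Going beyond simply pointing out mistranslations, this paper traces back how Smith came to 'mistranslate' and thereby shows that The Vegetarian is an English localized text oriented toward localization rather than translation. The conclusion discusses the achievements of The Vegetarian, Smith's rewriting of Ch'aesikjuuija, and proposes a direction for future English translation of Korean literature by presenting translation methods that help readers choose.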