• Title/Summary/Keyword: text translation

The Character Recognition System of Mobile Camera Based Image (모바일 이미지 기반의 문자인식 시스템)

  • Park, Young-Hyun;Lee, Hyung-Jin;Baek, Joong-Hwan
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.11 no.5
    • /
    • pp.1677-1684
    • /
    • 2010
  • With the spread of mobile phones and smartphones, a wide range of content has been developed. Because mobile devices are now equipped with small cameras, interest in image-based content has grown, and such content has become an important part of practical use. Character recognition in particular can be applied to guidance systems for blind people, automatic robot navigation, automatic video retrieval and indexing, and automatic text translation. This paper therefore proposes a system that extracts text areas from natural images captured by a smartphone camera, recognizes the individual characters, and outputs the result as speech. Text areas are extracted with the AdaBoost algorithm, and individual characters are recognized with an error back-propagation neural network.

Deep Neural Architecture for Recovering Dropped Pronouns in Korean

  • Jung, Sangkeun;Lee, Changki
    • ETRI Journal
    • /
    • v.40 no.2
    • /
    • pp.257-265
    • /
    • 2018
  • Pronouns are frequently dropped in Korean sentences, especially in text messages in the mobile phone environment. Restoring dropped pronouns can be a beneficial preprocessing task for machine translation, information extraction, spoken dialog systems, and many other applications. In this work, we address the problem of dropped pronoun recovery by resolving two simultaneous subtasks: detecting zero-pronoun sentences and determining the type of dropped pronouns. The problems are statistically modeled by encoding the sentence and classifying types of dropped pronouns using a recurrent neural network (RNN) architecture. Various RNN-based encoding architectures were investigated, and the stacked RNN was shown to be the best model for Korean zero-pronoun recovery. The proposed method does not require any manual features to be implemented; nevertheless, it shows good performance.

Word Sense Disambiguation in Query Translation of CLTR (교차 언어 문서 검색에서 질의어의 중의성 해소 방법)

  • Kang, In-Su;Lee, Jong-Hyeok;Lee, Geun-Bae
    • Annual Conference on Human and Language Technology
    • /
    • 1997.10a
    • /
    • pp.52-58
    • /
    • 1997
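  • In information retrieval, queries and documents are converted into a common representation so that their relevance can be compared. In cross-language text retrieval (CLTR), where the query and the documents are written in different languages, this conversion involves language translation. Existing CLTR research includes methods based on dictionaries, corpora, and machine translation. Translation between languages inevitably introduces semantic ambiguity, yet existing dictionary-based studies do not address sense disambiguation of polysemous words. This study proposes a query-term disambiguation method for query translation based on a Korean-Japanese bilingual dictionary and the Kadokawa thesaurus (角川 thesaurus), together with a method that assigns extra weight to documents containing co-occurring translation equivalents. The proposed methods were tested on Japanese patent documents and yielded a 5% improvement in accuracy.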
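
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the two ideas it names: choosing translation equivalents whose thesaurus categories are close to each other, and giving extra weight to documents in which several equivalents co-occur. The toy dictionary, the category codes (made up here, not real Kadokawa codes), the prefix-based closeness measure, and the `bonus` parameter are all assumptions for illustration.

```python
from itertools import product

# Toy Korean-to-Japanese dictionary. Each candidate carries a made-up
# hierarchical category code (NOT real Kadokawa thesaurus codes).
BILINGUAL = {
    "배": [("船", "941"), ("腹", "602"), ("梨", "051")],
    "항구": [("港", "947")],
}

def shared_prefix(a: str, b: str) -> int:
    """Length of the common leading part of two category codes."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def disambiguate(query_terms):
    """Pick one translation per term so that the chosen category codes
    are as close to each other as possible in the hierarchy."""
    options = [BILINGUAL[t] for t in query_terms]

    def closeness(combo):
        return sum(shared_prefix(a[1], b[1])
                   for i, a in enumerate(combo) for b in combo[i + 1:])

    best = max(product(*options), key=closeness)
    return [word for word, _ in best]

def score(document: str, translations, bonus: float = 0.5) -> float:
    """Count matched translations; add a bonus when several co-occur."""
    hits = sum(1 for w in translations if w in document)
    return hits + (bonus * (hits - 1) if hits > 1 else 0.0)

chosen = disambiguate(["배", "항구"])
```

In this toy setup the ambiguous term "배" resolves to "船" (ship) because its code shares a prefix with that of "港" (port), and a document containing both equivalents scores higher than one containing only one of them.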

Study on HuatuoXuanmenNeizhaotu in Processing of Medicinal ("화타현문내조도(華陀玄門內照圖)"의 약물포제(藥物炮製)에 대한 고찰(考察))

  • Sim, Hyun-A;Hwang, Seong-Yeon;Eom, Dong-Myung
    • Journal of Korean Medical classics
    • /
    • v.25 no.2
    • /
    • pp.75-88
    • /
    • 2012
  • Objective: Huatuoxuanmenneizhaotu (華陀玄門內照圖) is a book by Huatuo in two volumes. The second volume classifies medicines as poisonous or nonpoisonous and explains the processing of medicinals. We focus on the processing of medicinals in Huatuoxuanmenneizhaotu. Methods: Through translation of the Huatuoxuanmenneizhaotu text, we organize the material into four categories: 1) classification of poisonous and nonpoisonous medicines, 2) methods of making medicines, 3) processing of medicinals using water and fire, and 4) the use of supplements in the processing of medicinals. Results: The classification of poisonous and nonpoisonous medicines in Huatuoxuanmenneizhaotu does not always match that in Bencaogangmu. Several methods are described for making medicines, processing medicinals, and using supplements in processing. Conclusion: These results show that the processing of medicinals in Huatuoxuanmenneizhaotu was highly diverse.

Translation study on the Gageum's Sanghanbuik (가금(柯琴) "상한부익(傷寒附翼)" 번역(飜譯) 연구(硏究))

  • Jeong, Chang-Hyun;Jang, Woo-Chang
    • Journal of Korean Medical classics
    • /
    • v.18 no.3 s.30
    • /
    • pp.183-206
    • /
    • 2005
  • 'Sanghallonju' (傷寒論注) reorganized the arrangement of the Sanghallon according to the method of 'the classification of similar symptoms' and annotated its text, introducing Gageum's new methodology, and 'Sanghallonik' (傷寒論翼) set out his new findings on the science of the Sanghan. Meanwhile, 'Sanghanbuik' (傷寒附翼) explains various prescriptions in the 'Sanghallon'. It categorizes prescriptions according to the six Meridians and sums up Gageum's research by commenting on the target symptoms and the use of medicine for each prescription. Gageum's study is consistent in its aim of embodying, in medical practice, the universality of the differentiation of syndromes in accordance with the theory of the six Meridians (六經辨證). From his work, the substantiality of the six ... 'Sanghanbuik' is a publication that shows the essence of Gageum's medical science through his inclination, conclusions, and concrete methodology.

Translation of RDF to VRML (RDF - VRML 변환)

  • Kim, Hye-Yeon;Park, Kin;Cho, Dong-Sub
    • Proceedings of the KIEE Conference
    • /
    • 2000.11d
    • /
    • pp.830-832
    • /
    • 2000
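  • This study investigates a method for visually presenting RDF data expressed in XML format using VRML. The current Web environment is moving toward dynamically generated, visually presented documents, and XML is widely used in this environment because it makes it easy to generate data in real time. However, because XML is text-based, it is hard to visualize data for users, and it offers too much flexibility in how data can be expressed. RDF, which constrains XML so that data is expressed in a standard way, is therefore useful. This paper studies how to combine VRML with RDF so that data changing in real time can be presented through a visualization tool. To this end, we used a Java Servlet and implemented a system that extracts data from an RDF document, generates VRML code, and delivers that code to the client so that the data can be viewed visually.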
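
The extract-then-generate step described in this abstract can be sketched roughly as follows. This is not the authors' Java Servlet implementation: it is a small Python illustration, and the `ex:value` property, the example.org namespace, and the choice to render each value as a box in a bar chart are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/stats#"  # hypothetical vocabulary

SAMPLE_RDF = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <rdf:Description rdf:about="http://example.org/item/a">
    <ex:value>2.0</ex:value>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/item/b">
    <ex:value>3.5</ex:value>
  </rdf:Description>
</rdf:RDF>"""

def rdf_to_vrml(rdf_xml: str) -> str:
    """Turn each ex:value property into one VRML box (a simple 3D bar chart)."""
    root = ET.fromstring(rdf_xml)
    lines = ["#VRML V2.0 utf8"]
    for i, desc in enumerate(root.findall(f"{{{RDF_NS}}}Description")):
        height = float(desc.findtext(f"{{{EX_NS}}}value"))
        # Shift each bar along x and lift it so it stands on the ground plane.
        lines.append(
            f"Transform {{ translation {i * 2.0} {height / 2} 0 "
            f"children [ Shape {{ geometry Box {{ size 1 {height} 1 }} }} ] }}"
        )
    return "\n".join(lines) + "\n"

vrml = rdf_to_vrml(SAMPLE_RDF)
```

In a server setting, the returned string would be sent to the client with a VRML content type so that a VRML viewer can render it. Regenerating it on each request is one way to reflect data that changes over time.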

Construction of Korean FrameNet through Manual Translation of English FrameNet (영어 FrameNet의 수동번역을 통한 한국어 FrameNet 구축 개발)

  • Nam, Sejin;Kim, Youngsik;Park, Jungyeul;Hahm, Younggyun;Hwang, Dosam;Choi, Key-Sun
    • Annual Conference on Human and Language Technology
    • /
    • 2014.10a
    • /
    • pp.38-43
    • /
    • 2014
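  • This paper presents a manual construction process for a Korean FrameNet, based on existing English FrameNet data, that can be carried out by translators with no expert knowledge of FrameNet. From the 5,945 sentences that make up the Full Text of English FrameNet version 1.5 as provided by NLTK, our team extracted the 4,025 sentences that carry Frame data and had translators manually translate them into Korean. This laid a meaningful foundation for building a Korean FrameNet, and we have also published on the web the results that demonstrate the effectiveness of the proposed method.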

Discrete Wavelet Transform for Watermarking Three-Dimensional Triangular Meshes from a Kinect Sensor

  • Wibowo, Suryo Adhi;Kim, Eun Kyeong;Kim, Sungshin
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.14 no.4
    • /
    • pp.249-255
    • /
    • 2014
  • We present a simple method to watermark three-dimensional (3D) triangular meshes generated from the depth data of the Kinect sensor. Previous methods maintain the shape of 3D triangular meshes and decide the embedding place through calculations on vertices and their neighbors; our method instead selects one of the coordinate axes. To maintain shape, we use the discrete wavelet transform and constant regularization. The information to be embedded is provided as text. We tested the performance of the watermarking system under geometric attacks such as rotation, scaling, and translation. The performance parameters are the vertices error rate (VER) and the bit error rate (BER). The VER and BER results indicate that using a correction term before the extraction process makes our system robust to geometric attacks.
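
As a small illustration of the BER metric mentioned in this abstract, the sketch below converts a text payload to bits and measures the fraction of bits that differ after extraction. It assumes the usual definition of bit error rate (differing bits divided by total bits); the abstract does not give the paper's exact formula, and the payload string and the flipped bit positions are made up for the example.

```python
def text_to_bits(text: str) -> list[int]:
    """Encode text as UTF-8 and expand each byte into 8 bits, MSB first."""
    return [(byte >> shift) & 1
            for byte in text.encode("utf-8")
            for shift in range(7, -1, -1)]

def bit_error_rate(embedded: list[int], extracted: list[int]) -> float:
    """Fraction of bit positions where the extracted watermark differs."""
    if len(embedded) != len(extracted):
        raise ValueError("bit sequences must have the same length")
    if not embedded:
        raise ValueError("bit sequences must not be empty")
    errors = sum(a != b for a, b in zip(embedded, extracted))
    return errors / len(embedded)

# Hypothetical payload; two bits are flipped to imitate damage from an attack.
original = text_to_bits("KINECT")
damaged = original.copy()
damaged[0] ^= 1
damaged[9] ^= 1
ber = bit_error_rate(original, damaged)
```

A BER of 0 means the watermark text was recovered exactly; here 2 of the 48 payload bits differ.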

A Study on the Performance Analysis of Entity Name Recognition Techniques Using Korean Patent Literature

  • Gim, Jangwon
    • Journal of Advanced Information Technology and Convergence
    • /
    • v.10 no.2
    • /
    • pp.139-151
    • /
    • 2020
  • Entity name recognition is a part of information extraction that extracts entity names from documents and classifies their types. Entity name recognition technologies are widely used in natural language processing applications such as information retrieval, machine translation, and query response systems. Various deep learning-based models exist to improve entity name recognition performance, but few studies have compared and analyzed these models on Korean data. In this paper, we compare and analyze the performance of CRF, LSTM-CRF, BiLSTM-CRF, and BERT, which are actively used to identify entity names, on Korean data. We also evaluate whether the embedding models widely used in recent natural language processing tasks affect the performance of the entity name recognition model. Experiments on patent data and a Korean corpus confirmed that BiLSTM-CRF using the FastText method showed the highest performance.

Analyzing User Feedback on a Fan Community Platform 'Weverse': A Text Mining Approach

  • Thi Thao Van Ho;Mi Jin Noh;Yu Na Lee;Yang Sok Kim
    • Smart Media Journal
    • /
    • v.13 no.6
    • /
    • pp.62-71
    • /
    • 2024
  • This study applies topic modeling to uncover user experience and app issues expressed in users' online reviews of the fan community platform Weverse on the Google Play Store. This allows us to identify features that need to be improved to enhance user experience, or maintained and leveraged to attract more users. We collected 88,068 first-level English online reviews of Weverse on the Google Play Store with the Google-Play-Scraper tool. After an initial preprocessing step, a dataset of 31,861 online reviews was analyzed using Latent Dirichlet Allocation (LDA) topic modeling with the Gensim library in Python. The five topics explored in this study highlight significant issues such as network connection errors, delayed notifications, and incorrect translation. In addition, the results revealed the app's effectiveness in fostering not only interaction between fans and artists but also relationships among fans. Consequently, the business can strengthen user engagement and loyalty by addressing the identified drawbacks and leveraging the platform for user communication.