• Title/Summary/Keyword: Korean Text Input Systems

Automatic Drawing Input by Segmentation of Text Region and Recognition of Geometric Drawing Element (문자영역의 분리와 기하학적 도면요소의 인식에 의한 도면 자동입력)

  • 배창석;민병우
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.6 / pp.91-103 / 1994
  • As CAD systems are introduced in the field of engineering design, the need for automatic drawing input is increasing. In this paper, we propose a method for realizing automatic drawing input by separating text regions from graphic regions, extracting line vectors from the graphic regions, and recognizing circular arcs and circles from the line vectors. The sizes of isolated regions on a drawing are used to separate text regions from graphic regions. Thinning and the maximum allowable error method are used to extract line vectors, and the geometric structures of the line vectors are analyzed to recognize circular arcs and circles. By processing text regions and graphic regions separately, the amount of vector information can be reduced by 30 to 40%. Recognizing circular arcs and circles can increase the usefulness of the automatic drawing input function.


Development of an image processing algorithm for Korean document recognition (인식률을 향상한 한글문서 인식 알고리즘 개발)

  • 김희식;김영재;이평원
    • 제어로봇시스템학회:학술대회논문집 / 1997.10a / pp.1391-1394 / 1997
  • This paper proposes a new image processing algorithm for recognizing Korean documents. It extracts the text region from the input image, then segments the text into lines, words, and characters. Precise segmentation is very important for recognizing the input document. The input image is an 8-bit grayscale image. Both the histogram and a brightness dispersion graph are used for segmentation. The results show higher document recognition accuracy.
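The abstract does not say how the histogram is used. One common way to segment text lines is a horizontal projection profile, sketched below as an illustration and not as the authors' algorithm. The function name `segment_lines` and the parameters `ink_threshold` and `min_ink_pixels` are assumptions made for this example.

```python
from typing import List, Tuple


def segment_lines(image: List[List[int]],
                  ink_threshold: int = 128,
                  min_ink_pixels: int = 1) -> List[Tuple[int, int]]:
    """Split an 8-bit grayscale image into text lines.

    Returns (start_row, end_row) pairs with end_row exclusive.
    Pixels darker than ink_threshold are counted as ink (assumed convention).
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(1 for px in row if px < ink_threshold) for row in image]

    lines = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_ink_pixels and start is None:
            start = y                      # first inked row of a new line
        elif count < min_ink_pixels and start is not None:
            lines.append((start, y))       # blank row closes the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy 6x4 image: 255 is white paper, 0 is ink.
page = [
    [255, 255, 255, 255],
    [0, 255, 0, 255],
    [0, 0, 0, 0],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [255, 0, 255, 255],
]
text_lines = segment_lines(page)
```

The same profile computed over columns inside each detected line would give a first cut at word and character boundaries.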


A Real-Time Concept-Based Text Categorization System using the Thesaurus Tool (시소러스 도구를 이용한 실시간 개념 기반 문서 분류 시스템)

  • 강원석;강현규
    • Journal of KIISE:Software and Applications / v.26 no.1 / pp.167-167 / 1999
  • The majority of text categorization systems use a term-based classification method. However, because there are too many terms, this method is not effective for classifying documents in a real-time environment. This paper presents a real-time concept-based text categorization system, which classifies texts using a thesaurus. The system consists of a Korean morphological analyzer, a thesaurus tool, and a probability-vector similarity measurer. The thesaurus tool acquires the meanings of input terms and represents the text not as a term vector but as a concept vector. Because the concept vector consists of semantic units and is small in size, it enables the system to analyze the text in real time. Because it represents the meanings of the text, the vector supports concept-based classification. The probability-vector similarity measurer decides the subject of the text by calculating the vector similarity between the input text and each subject. The experimental results show that the proposed system can effectively analyze texts in real time and perform concept-based classification. Moreover, the experiments indicate that the thesaurus tool must be expanded to improve the system.
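The pipeline described above (terms mapped to concepts, then a vector similarity against each subject) can be sketched roughly as follows. This is a toy illustration under several assumptions: the thesaurus is a hypothetical dictionary, tokens are assumed to come from an upstream morphological analyzer, and cosine similarity stands in for the paper's probability-vector similarity measure, which the abstract does not define.

```python
import math
from collections import Counter
from typing import Dict, Iterable

# Hypothetical toy thesaurus: surface term -> concept label.
THESAURUS = {
    "car": "vehicle", "truck": "vehicle", "bus": "vehicle",
    "apple": "fruit", "pear": "fruit",
}


def concept_vector(terms: Iterable[str],
                   thesaurus: Dict[str, str]) -> Dict[str, float]:
    """Map terms to concepts and return normalized concept frequencies."""
    counts = Counter(thesaurus[t] for t in terms if t in thesaurus)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(terms: Iterable[str],
             subjects: Dict[str, Dict[str, float]],
             thesaurus: Dict[str, str]) -> str:
    """Return the subject whose concept vector is most similar to the text."""
    vec = concept_vector(terms, thesaurus)
    return max(subjects, key=lambda name: cosine(vec, subjects[name]))


# Hypothetical subject profiles expressed in concept space.
SUBJECTS = {"transport": {"vehicle": 1.0}, "food": {"fruit": 1.0}}
label = classify(["car", "bus", "apple"], SUBJECTS, THESAURUS)
```

Because many terms collapse onto a few concepts, the vectors being compared stay short, which is the property the abstract credits for real-time performance.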

User Authentication Based on Keystroke Dynamics of Free Text and One-Class Classifiers (자유로운 문자열의 키스트로크 다이나믹스와 일범주 분류기를 활용한 사용자 인증)

  • Seo, Dongmin;Kang, Pilsung
    • Journal of Korean Institute of Industrial Engineers / v.42 no.4 / pp.280-289 / 2016
  • User authentication is an important issue in computer network systems. Most current computer network systems use ID-password string matching as the primary user authentication method. However, in password-based authentication, whoever acquires the password of a valid user can access the system without any restrictions. In this paper, we present keystroke dynamics-based user authentication to address the limitations of password-based authentication. Since most previous studies employed fixed-length text as input data, we aim to enhance authentication performance by combining four different variable creation methods applied to variable-length free text as input data. Four one-class classifiers are employed as authentication algorithms. We verify the proposed approach through an experiment based on actual keystroke data collected from 100 participants, who provided more than 17,000 keystrokes for both Korean and English. The experimental results show that the proposed method significantly improves authentication performance compared to existing approaches.
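The abstract does not list the four variable creation methods or the four classifiers. As a generic illustration of the idea only, the sketch below derives two widely used keystroke timing features (mean hold time and mean flight time) from free text and scores them with a simple z-score detector trained on one user's data. All names here (`timing_features`, `ZScoreDetector`, the threshold value) are assumptions for this example, not the paper's method.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple

Keystroke = Tuple[str, float, float]  # (key, press time in ms, release time in ms)


def timing_features(events: Sequence[Keystroke]) -> List[float]:
    """Return [mean hold time, mean flight time]; needs at least two events."""
    holds = [release - press for _, press, release in events]
    # Flight time: gap between releasing one key and pressing the next.
    flights = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    return [mean(holds), mean(flights)]


class ZScoreDetector:
    """Toy one-class model: accept samples close to the enrolled user's mean."""

    def fit(self, samples: Sequence[Sequence[float]]) -> "ZScoreDetector":
        columns = list(zip(*samples))
        self.mu = [mean(col) for col in columns]
        # Fall back to 1.0 when a feature never varies, to avoid dividing by zero.
        self.sigma = [stdev(col) or 1.0 for col in columns]
        return self

    def score(self, x: Sequence[float]) -> float:
        """Largest absolute z-score over all features (lower is more typical)."""
        return max(abs(v - m) / s for v, m, s in zip(x, self.mu, self.sigma))

    def accept(self, x: Sequence[float], threshold: float = 3.0) -> bool:
        return self.score(x) <= threshold


typed = [("a", 0, 100), ("b", 150, 230), ("c", 300, 390)]
features = timing_features(typed)

# Enrollment uses only the genuine user's samples (one-class setting).
detector = ZScoreDetector().fit([[90, 60], [100, 70], [110, 80]])
```

Only genuine-user samples are needed for training, which is why one-class classifiers suit authentication: impostor data is usually unavailable at enrollment time.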

An Efficient Machine Learning-based Text Summarization in the Malayalam Language

  • P Haroon, Rosna;Gafur M, Abdul;Nisha U, Barakkath
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.6 / pp.1778-1799 / 2022
  • Automatic text summarization is a procedure that condenses a large amount of content into a shorter text that retains the significant information. Malayalam is one of the toughest languages and is used in certain areas of India, most commonly in Kerala and Lakshadweep. Natural language processing work on Malayalam is relatively limited due to the complexity of the language and the scarcity of available resources. In this paper, we propose an approach to text summarization of Malayalam documents that trains a model based on the Support Vector Machine classification algorithm. Different features of the text are taken into account in training so that the system can output the most important information from the input text. The classifier sorts sentences into separate classes (most important, important, average, and least significant), and based on this the system creates a summary of the input document. The user can select a compression ratio, and the system outputs a summary of that fraction of the document. The model's performance is measured using Malayalam documents from different genres as well as documents from the same domain. The model is evaluated with the content evaluation measures precision, recall, F-score, and relative utility. The obtained precision and recall values show that the model is trustworthy and more relevant than the other summarizers.

An end-to-end synthesis method for Korean text-to-speech systems (한국어 text-to-speech(TTS) 시스템을 위한 엔드투엔드 합성 방식 연구)

  • Choi, Yeunju;Jung, Youngmoon;Kim, Younggwan;Suh, Youngjoo;Kim, Hoirin
    • Phonetics and Speech Sciences / v.10 no.1 / pp.39-48 / 2018
  • A typical statistical parametric speech synthesis (text-to-speech, TTS) system consists of separate modules, such as a text analysis module, an acoustic modeling module, and a speech synthesis module. This causes two problems: 1) expert knowledge of each module is required, and 2) errors generated in each module accumulate as they pass through the subsequent modules. An end-to-end TTS system could avoid such problems by synthesizing voice signals directly from an input string. In this study, we implemented an end-to-end Korean TTS system using Google's Tacotron, which is an end-to-end TTS system based on a sequence-to-sequence model with an attention mechanism. We used 4392 utterances spoken by a Korean female speaker, an amount that corresponds to 37% of the dataset Google used for training Tacotron. Our system obtained a mean opinion score (MOS) of 2.98 and a degradation mean opinion score (DMOS) of 3.25. We discuss the factors that affected training of the system. Experiments demonstrate that the post-processing network needs to be designed with the output language and input characters in mind, and that, depending on the amount of training data, the maximum value of n for the n-grams modeled by the encoder should be small enough.

An Effective Hangul Modification System Using Jamo Modification Window (자모 수정 창을 활용한 효과적인 한글 수정 시스템)

  • Ceong, Hyi-Thaek
    • The Journal of the Korea institute of electronic communication sciences / v.8 no.10 / pp.1535-1544 / 2013
  • There are many Hangul input systems for entering Korean letters on a computer or smartphone. However, existing systems require considerable effort to modify letters that have already been entered. This research proposes a Hangul letter modification method that modifies letters effectively by reusing the jamo (alphabet letters) previously entered. The Hangul modification system using the "Jamo Modification Window" follows the composition principle of Hangul and makes use of the already entered jamo. It can be applied to an existing input system, without modifying that system, simply by adding the "Jamo Modification Window". This system is especially useful on smartphones with small screens.
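The composition principle mentioned above can be illustrated with the standard Unicode arithmetic for precomposed Hangul syllables. The sketch below splits a syllable into its lead consonant, vowel, and optional final consonant, swaps one jamo, and recomposes the syllable. It illustrates jamo reuse only; the "Jamo Modification Window" interface itself is not described in the abstract, and the function names are assumptions for this example.

```python
from typing import Optional, Tuple

HANGUL_BASE = 0xAC00  # code point of the first precomposed syllable
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 lead consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
          "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
          "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]  # 28, "" means no final


def decompose(syllable: str) -> Tuple[str, str, str]:
    """Split one precomposed Hangul syllable into (lead, vowel, final)."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < len(LEADS) * len(VOWELS) * len(FINALS):
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, len(VOWELS) * len(FINALS))
    vowel, final = divmod(rest, len(FINALS))
    return LEADS[lead], VOWELS[vowel], FINALS[final]


def compose(lead: str, vowel: str, final: str = "") -> str:
    """Build a precomposed syllable from its jamo."""
    index = ((LEADS.index(lead) * len(VOWELS) + VOWELS.index(vowel))
             * len(FINALS) + FINALS.index(final))
    return chr(HANGUL_BASE + index)


def replace_jamo(syllable: str,
                 lead: Optional[str] = None,
                 vowel: Optional[str] = None,
                 final: Optional[str] = None) -> str:
    """Change only the given jamo and reuse the rest of the syllable."""
    old_lead, old_vowel, old_final = decompose(syllable)
    return compose(lead if lead is not None else old_lead,
                   vowel if vowel is not None else old_vowel,
                   final if final is not None else old_final)


parts = decompose("한")
fixed = replace_jamo("한", final="ㄹ")
```

Correcting one jamo this way takes a single replacement, whereas deleting and retyping the syllable on a standard keyboard takes several keystrokes.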

Separation of Text and Non-text in Document Layout Analysis using a Recursive Filter

  • Tran, Tuan-Anh;Na, In-Seop;Kim, Soo-Hyung
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.10 / pp.4072-4091 / 2015
  • The separation of text and non-text elements plays an important role in document layout analysis. A number of approaches have been proposed, but the quality of the separation result is still limited due to the complexity of document layouts. In this paper, we present an efficient method for classifying text and non-text components in a document image. It combines whitespace analysis with multi-layer homogeneous regions, which we call a recursive filter. First, the input binary document is analyzed by connected component analysis and whitespace extraction. Second, a heuristic filter is applied to identify non-text components. After that, using a statistical method, we apply the recursive filter to multi-layer homogeneous regions to identify all text and non-text elements of the binary image. Finally, all regions are reshaped and noise is removed to obtain the text document and the non-text document. Experimental results on the ICDAR2009 page segmentation competition dataset and other datasets demonstrate the effectiveness and superiority of the proposed method.
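The first two steps above (connected component analysis followed by a heuristic filter) can be sketched in plain Python. The size rule used here is a deliberately simple stand-in chosen for illustration; the paper's actual heuristic and its recursive filter are not specified in the abstract. The names `split_by_size` and `max_text_side` are assumptions.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def connected_components(binary: List[List[int]]) -> List[List[Tuple[int, int]]]:
    """Return 4-connected foreground components as lists of (row, col)."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:                   # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels: List[Tuple[int, int]]) -> Box:
    rows = [p[0] for p in pixels]
    cols = [p[1] for p in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def split_by_size(binary: List[List[int]],
                  max_text_side: int = 3) -> Tuple[List[Box], List[Box]]:
    """Label components whose longer side exceeds max_text_side as non-text."""
    text, non_text = [], []
    for pixels in connected_components(binary):
        top, left, bottom, right = bounding_box(pixels)
        side = max(bottom - top + 1, right - left + 1)
        (text if side <= max_text_side else non_text).append(
            (top, left, bottom, right))
    return text, non_text


# Toy binary page: a small glyph-like blob and a long ruling line.
page = [
    [1, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
text_boxes, non_text_boxes = split_by_size(page)
```

A fixed size threshold fails on pages that mix font sizes, which is one motivation for statistics computed per homogeneous region as the abstract describes.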

Development of Universal Reduced Key Braille System (유니버설 단축키 점자시스템 개발)

  • Lee, Jung-Suk;Moon, Byung-Hyun
    • Journal of Korea Society of Industrial Information Systems / v.27 no.2 / pp.45-51 / 2022
  • In this paper, a universal reduced input system that can represent Korean text messages, English alphabet letters, special characters, and numbers is developed. The reduced keyboard input system has 5 number keys and 4 special function keys to reduce the complexity of entering characters for people with severe disabilities. A mobile application is also developed to make communication easier for people with disabilities.