• Title/Summary/Keyword: Korean Character Recognition

Recognition of Virtual Written Characters Based on Convolutional Neural Network

  • Leem, Seungmin;Kim, Sungyoung
    • Journal of Platform Technology / v.6 no.1 / pp.3-8 / 2018
  • This paper proposes a technique, based on a convolutional neural network (CNN), for recognizing online handwritten cursive data obtained by tracing the motion trajectory of a user writing in 3D space. Virtual characters written in 3D space are difficult to recognize because the input contains both character strokes and movement strokes. Building on a previous study that localized character strokes and movement strokes, we use a labeling technique to divide each syllable into consonant and vowel units. The coordinate information of the separated consonants and vowels is converted into image data, and Korean handwriting recognition is performed with a CNN. The network was trained on 1,680 syllables written by five writers, and accuracy was measured on new writers who did not contribute training data. Phoneme-based recognition achieved an accuracy of 98.9%. Compared with syllable-based learning, the proposed method has the advantage of drastically reducing the amount of training data required.
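
The step of converting coordinate information into image data can be illustrated with a minimal sketch. It assumes the trajectory has already been projected to 2D `(x, y)` points and that strokes are rasterized point by point onto a small binary grid; the function name `rasterize_strokes` and the grid size are hypothetical and are not taken from the paper.

```python
def rasterize_strokes(points, size=8):
    """Map (x, y) trajectory points onto a size x size binary image.

    Assumption: points are already 2D; the paper's own projection and
    rendering steps are not described in the abstract.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # Use one scale for both axes so the aspect ratio is preserved.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    image = [[0] * size for _ in range(size)]
    for x, y in points:
        col = min(size - 1, int((x - min_x) / span * (size - 1) + 0.5))
        row = min(size - 1, int((y - min_y) / span * (size - 1) + 0.5))
        image[row][col] = 1
    return image


# A vertical stroke sampled at five points, standing in for one vowel stroke.
stroke = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (0.0, 3.0), (0.0, 4.0)]
image = rasterize_strokes(stroke, size=5)
```

The resulting grid could then be fed to any image classifier; a real pipeline would also interpolate between samples so that fast strokes do not leave gaps.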

Recognition of the Printed English Sentence by Using Japanese Puzzle

  • Sohn, Young-Sun
    • International Journal of Fuzzy Logic and Intelligent Systems / v.8 no.3 / pp.225-230 / 2008
  • In this paper we implement a system that recognizes the printed letters, numerals, and symbols found on a keyboard, for the recognition of English sentences. The image of the printed sentences is input and binarized, and the characters are separated using the histogram method, as in existing character recognition methods. While extracting the individual characters, we identify one group that has no numerical information by projecting the vertical center of the character. Characters in a second group, whose width is greater than their height, are normalized by width. The remaining group is normalized by height. By applying the basic principle of the Japanese Puzzle in reverse to a normalized character image, the proposed system classifies and recognizes the printed numerals, symbols, and letters, with good results.
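
The two ideas in this abstract, histogram-based character separation and the reversed Japanese Puzzle, can be sketched roughly as follows. The sketch assumes that "Japanese Puzzle" refers to the nonogram, so that the reverse application means deriving run-length clues from an already known binary image; the function names and the toy image are hypothetical.

```python
def column_histogram(image):
    """Count foreground pixels in each column of a binary image."""
    return [sum(row[c] for row in image) for c in range(len(image[0]))]


def split_characters(image):
    """Return (start, end) column ranges separated by empty columns."""
    hist = column_histogram(image)
    ranges, start = [], None
    for c, count in enumerate(hist):
        if count and start is None:
            start = c
        elif not count and start is not None:
            ranges.append((start, c))
            start = None
    if start is not None:
        ranges.append((start, len(hist)))
    return ranges


def run_lengths(line):
    """Nonogram-style clue: lengths of consecutive foreground runs."""
    runs, current = [], 0
    for pixel in line:
        if pixel:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def puzzle_clues(image):
    """Row clues and column clues of one character image."""
    rows = [run_lengths(r) for r in image]
    cols = [run_lengths([row[c] for row in image]) for c in range(len(image[0]))]
    return rows, cols


# Toy binarized line containing an "I"-like and an "L"-like glyph.
line_image = [
    [1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 1, 1, 1],
]
ranges = split_characters(line_image)             # [(0, 1), (2, 5)]
glyph = [row[2:5] for row in line_image]
row_clues, col_clues = puzzle_clues(glyph)        # [[1], [1], [3]], [[3], [1], [1]]
```

The clue lists act as a compact signature of the glyph that can be compared against stored signatures; how the paper matches them is not stated in the abstract.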

The Centering of the Invariant Feature for the Unfocused Input Character using a Spherical Domain System

  • Seo, Choon-Weon
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.29 no.9 / pp.14-22 / 2015
  • In this paper, we propose a method that uses a spherical domain system to center an unfocused input character, so that the shift-invariant feature of the centered character can be used by a recognition system. The recognition system is implemented using the centroid method with coordinate average values, and an average differential ratio above 78.14% was obtained for the character features. The shift-invariant feature can be extracted using a spherical transformation similar to the human eyeball. The proposed method, which extracts features through a spherical coordinate transform and then transforms the extracted data, makes it possible to move the character to the center of the input plane. Digital and optical technologies are combined by applying a spherical coordinate system, modeled on the three-dimensional human eyeball, to the two-dimensional plane format. The paper also suggests the possibility of recognizing character shapes and of calculating the differential ratio of the centered character using the centroid method.
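
The centroid method with coordinate average values can be illustrated on a plain 2D binary image. This sketch deliberately leaves out the spherical coordinate transform, which the abstract does not specify; it only shows how averaging the foreground coordinates gives a shift that moves a character to the center of the input plane. The names `centroid` and `center_character` are hypothetical.

```python
def centroid(image):
    """Average (row, col) of all foreground pixels."""
    pts = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]
    n = len(pts)
    return sum(r for r, _ in pts) / n, sum(c for _, c in pts) / n


def center_character(image):
    """Shift the character so its centroid lands on the image center."""
    h, w = len(image), len(image[0])
    cr, cc = centroid(image)
    dr = round((h - 1) / 2 - cr)
    dc = round((w - 1) / 2 - cc)
    out = [[0] * w for _ in range(h)]
    for r, row in enumerate(image):
        for c, v in enumerate(row):
            nr, nc = r + dr, c + dc
            # Pixels shifted outside the plane are dropped.
            if v and 0 <= nr < h and 0 <= nc < w:
                out[nr][nc] = 1
    return out


# A short horizontal bar placed off-center in a 5 x 5 plane.
off_center = [[0] * 5 for _ in range(5)]
for col in (0, 1, 2):
    off_center[0][col] = 1

centered = center_character(off_center)
print(centroid(off_center))  # (0.0, 1.0)
print(centroid(centered))    # (2.0, 2.0)
```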

Handwritten Korean Amounts Recognition in Bank Slips using Rule Information (규칙 정보를 이용한 은행 전표 상의 필기 한글 금액 인식)

  • Jee, Tae-Chang;Lee, Hyun-Jin;Kim, Eun-Jin;Lee, Yill-Byung
    • The Transactions of the Korea Information Processing Society / v.7 no.8 / pp.2400-2410 / 2000
  • Much research has been done on the recognition of Korean characters. However, while most of it addresses individual character recognition, the development of document recognition systems has seldom been attempted. In this paper, we design a recognizer of Korean courtesy amounts that improves error correction in the recognized character string. From the very first step of Korean character recognition, we face an enormous scale of data: there are 2350 characters in Korean. Most previous studies tried to recognize about 1000 frequently used characters, but their recognition rates are below 80%. Because such recognizers are not efficient for this task, we designed a statistical multiple recognizer that recognizes the 16 Korean characters used in courtesy amounts. Using a multiple recognizer prevents an increase in errors. The postprocessor for Korean courtesy amounts exploits the properties of Korean character strings: the character strings of courtesy amounts follow syntactic rules, and these rules can be used to correct errors. This kind of error correction is restricted to the Korean characters that represent the units of the amounts. The first candidate of the Korean character recognizer shows a recognition rate of !!i.49%, and the candidates up to the fourth show 99.72%. For postprocessed Korean character strings, the recognizer of Korean courtesy amounts shows a reliability of 96.42%. In this paper, we suggest a method for improving the reliability of Korean courtesy amount recognition by combining a Korean character recognizer that handles a limited number of characters with a postprocessor that corrects errors in Korean character strings.
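
How syntactic rules can correct recognition errors in an amount string can be sketched as follows. Everything here is an assumption for illustration: the abstract does not list the 16 characters or the rules, so the sketch uses the standard Sino-Korean numerals, requires units to descend within each group, and picks the first combination of recognizer candidates that parses. The names `parse_amount` and `first_valid` are hypothetical.

```python
from itertools import product

# Assumed character set; the paper's actual 16 characters are not listed.
DIGITS = {"일": 1, "이": 2, "삼": 3, "사": 4, "오": 5,
          "육": 6, "칠": 7, "팔": 8, "구": 9}
SMALL = {"십": 10, "백": 100, "천": 1000}
BIG = {"만": 10**4, "억": 10**8}


def parse_amount(text):
    """Return the integer value, or None if a syntactic rule is violated."""
    total = group = digit = 0
    last_small, last_big = 10**4, 10**12
    for ch in text:
        if ch in DIGITS:
            if digit:                      # two digits in a row
                return None
            digit = DIGITS[ch]
        elif ch in SMALL:
            if SMALL[ch] >= last_small:    # units must descend within a group
                return None
            group += (digit or 1) * SMALL[ch]
            digit, last_small = 0, SMALL[ch]
        elif ch in BIG:
            group += digit
            if BIG[ch] >= last_big or group == 0:
                return None
            total += group * BIG[ch]
            group = digit = 0
            last_small, last_big = 10**4, BIG[ch]
        else:
            return None
    return total + group + digit


def first_valid(candidates):
    """Pick the first candidate combination that satisfies the rules."""
    for combo in product(*candidates):
        text = "".join(combo)
        if parse_amount(text) is not None:
            return text
    return None


# Per-position candidates from a recognizer, best guess first (made-up data).
candidates = [["이"], ["십", "백"], ["백", "만"]]
best = first_valid(candidates)   # "이십백" is rejected, "이십만" is accepted
```

Because `product` varies the last position fastest, this search is only roughly ordered by candidate rank; a real postprocessor would score combinations by recognizer confidence.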

Character recognition using Hough transform (Hough변환을 이용한 문자인식)

  • 강선미;김봉석;황승옥;양윤모;김덕진
    • Proceedings of the Korean Institute of Communication Sciences Conference / 1991.10a / pp.77-80 / 1991
  • This paper proposes a new feature extraction method for character recognition and validates its effectiveness using several methods of computing the degree of similarity. To obtain the feature vectors, the Hough transform, which is used for edge extraction in image processing, is applied to the character image. With this transform, strokes can be extracted and feature vectors constructed appropriately. The method resolves the difficulties that noise and blurring cause in stroke extraction by analyzing the transform space, and it achieves a high recognition rate of 99.3% within 10 candidates using relatively low-dimensional features.
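
A minimal sketch of how Hough-space votes can be turned into a stroke feature vector is given below. It uses the standard line parameterization rho = x cos(theta) + y sin(theta) with a very coarse angle grid; the feature definition (strongest peak per orientation) is an assumption for illustration and is not necessarily the one used in the paper.

```python
import math


def hough_accumulator(image, n_theta=4):
    """Vote in (theta index, rho) space for every foreground pixel."""
    votes = {}
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            if not v:
                continue
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = round(x * math.cos(theta) + y * math.sin(theta))
                votes[(t, rho)] = votes.get((t, rho), 0) + 1
    return votes


def orientation_features(votes, n_theta=4):
    """Hypothetical feature vector: strongest peak for each orientation."""
    return [max(v for (t, _), v in votes.items() if t == k) for k in range(n_theta)]


# A horizontal stroke on row 2 of a 5 x 5 image.
image = [[0] * 5 for _ in range(5)]
image[2] = [1, 1, 1, 1, 1]

votes = hough_accumulator(image)
peak = max(votes, key=votes.get)          # (2, 2): theta index 2 (90 degrees), rho 2
features = orientation_features(votes)    # [1, 2, 5, 2]
```

All five pixels of the stroke vote for the same cell, so the stroke appears as one sharp peak even if a few pixels are missing, which is why transform-space analysis tolerates noise and blurring.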

A Comprehensive Approach for Tamil Handwritten Character Recognition with Feature Selection and Ensemble Learning

  • Manoj K;Iyapparaja M
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.6 / pp.1540-1561 / 2024
  • This research proposes a novel approach for Tamil Handwritten Character Recognition (THCR) that combines feature selection and ensemble learning techniques. The Tamil script is complex and highly variable, requiring a robust and accurate recognition system. Feature selection is used to reduce dimensionality while preserving discriminative features, improving classification performance and reducing computational complexity. Several feature selection methods are compared, and individual classifiers (support vector machines, neural networks, and decision trees) are evaluated through extensive experiments. Ensemble learning techniques such as bagging and boosting are employed to leverage the strengths of multiple classifiers and enhance recognition accuracy. The proposed approach is evaluated on the HP Labs Dataset, achieving 95.56% accuracy using an ensemble learning framework based on support vector machines. The dataset consists of 82,928 samples with 247 distinct classes, contributed by 500 participants from Tamil Nadu. It includes 40,000 characters with 500 user variations. The results surpass or rival existing methods, demonstrating the effectiveness of the approach. The research also offers insights for developing advanced recognition systems for other complex scripts. Future investigations could explore the integration of deep learning techniques and the extension of the proposed approach to other Indic scripts and languages, advancing the field of handwritten character recognition.

Recognize Handwritten Urdu Script Using Kohenen Som Algorithm

  • Khan, Yunus;Nagar, Chetan
    • International Journal of Ocean System Engineering / v.2 no.1 / pp.57-61 / 2012
  • In this paper we use the Self-Organizing Map (SOM) algorithm, based on the Kohonen neural network, for Urdu character recognition. The Kohonen network performs more efficiently than other approaches. Classification is used to recognize handwritten Urdu characters. Pre-classification reduces the number of candidate characters for an unknown input to a subset of the total character set, so the proposed algorithm attempts to group similar characters. Members of each pre-classified group are further analyzed using a statistical classifier for final recognition. A recognition rate of around 79.9% was achieved for the first choice and more than 98.5% for the top three choices. The results show that the proposed Kohonen SOM algorithm yields promising output and is feasible alongside other existing techniques.
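
The grouping step can be illustrated with a very small one-dimensional Kohonen map. This is a generic SOM sketch under assumed settings (three units, a fixed neighborhood radius, a linearly decaying learning rate, made-up 2D feature vectors); it is not the network configuration of the paper, and the statistical classifier applied afterwards is not shown.

```python
def best_matching_unit(weights, sample):
    """Index of the unit whose weight vector is closest to the sample."""
    dists = [sum((w - s) ** 2 for w, s in zip(unit, sample)) for unit in weights]
    return dists.index(min(dists))


def train_som(weights, samples, epochs=20, rate=0.5, radius=1):
    """Train a 1-D SOM; returns new weights and leaves the input untouched."""
    weights = [list(u) for u in weights]
    for epoch in range(epochs):
        lr = rate * (1 - epoch / epochs)          # decaying learning rate
        for sample in samples:
            bmu = best_matching_unit(weights, sample)
            for i, unit in enumerate(weights):
                if abs(i - bmu) <= radius:
                    # Neighbors of the winner move half as far as the winner.
                    influence = 1.0 if i == bmu else 0.5
                    for d in range(len(unit)):
                        unit[d] += lr * influence * (sample[d] - unit[d])
    return weights


# Two made-up clusters of feature vectors, standing in for similar characters.
samples = [[0.1, 0.0], [0.0, 0.1], [0.9, 1.0], [1.0, 0.9]]
initial = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
trained = train_som(initial, samples)

groups = [best_matching_unit(trained, s) for s in samples]
```

Samples that map to the same unit form one pre-classified group, so a later classifier only has to discriminate among the members of that group.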

A Study on the Printed Korean and Chinese Character Recognition (인쇄체 한글 및 한자의 인식에 관한 연구)

  • 김정우;이세행
    • The Journal of Korean Institute of Communications and Information Sciences / v.17 no.11 / pp.1175-1184 / 1992
  • A new classification method and recognition algorithms for printed Korean and Chinese characters are studied for Korean text that contains both. The proposed method utilizes structural features of the vertical and horizontal vowels in Korean characters. Korean characters are classified into 6 groups. Vowels and consonants are separated by applying a different vowel extraction method to each group. The time-consuming thinning process is omitted. A modified crossing distance feature is measured to recognize the extracted consonants. For Chinese characters, the average stroke crossing number is calculated for every character, which allows the characters to be classified into several groups. Recognition then proceeds using the stroke crossing number and the black dot rate of the character. The classification rate between Korean and Chinese characters was 90.5%, and the classification rate for 2512 Ming-style Korean characters was 90.0%. The recognition algorithm was applied to 1278 characters. The recognition rate was 92.2%. After classification of 4585 Chinese characters, the densest class was found to contain only 124 characters, about 1/40 of the total. The recognition rate was 89.2%.
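
The two features named for Chinese characters can be sketched on a binary image. The definitions used here are assumptions, since the abstract does not give them: the stroke crossing number of a scan line is taken as the number of background-to-foreground transitions, averaged over all rows and columns, and the black dot rate is the fraction of foreground pixels.

```python
def crossings(line):
    """Number of background-to-foreground transitions along one scan line."""
    return sum(1 for prev, cur in zip([0] + list(line), line) if cur and not prev)


def average_crossing_number(image):
    """Mean crossing count over all horizontal and vertical scan lines."""
    rows = [crossings(r) for r in image]
    cols = [crossings([row[c] for row in image]) for c in range(len(image[0]))]
    return (sum(rows) + sum(cols)) / (len(rows) + len(cols))


def black_dot_rate(image):
    """Fraction of pixels that are foreground."""
    total = sum(len(r) for r in image)
    return sum(map(sum, image)) / total


# A box-shaped glyph, loosely resembling the character 口.
box = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
print(average_crossing_number(box))  # 1.5
print(black_dot_rate(box))           # 0.75
```

Both values are cheap to compute without thinning, which fits the abstract's point that the thinning process is avoided.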

Implementation of Multiprocessor for Classification of High Speed OCR (고속 문자 인식기의 대분류용 다중 처리기의 구현)

  • 김형구;강선미;김덕진
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.6 / pp.10-16 / 1994
  • In off-line character recognition with statistical methods, recognition of Korean or Chinese characters is slow because of the large amount of computation. To address this problem, we separate the recognition steps into several functional stages and implement each stage in hardware so that all the stages can be processed in a pipeline structure. With this temporal parallel processing, a high-speed character recognition system can be implemented. In this paper, we implement the classification hardware, which is one of the functional stages, and improve its speed through a parallel structure with multiple DSPs (Digital Signal Processors). The design also allows DSP boards to be added in parallel to increase processing speed as needed. We implement the hardware as an add-on board for the IBM-PC, and experiments show that it runs about 47 times and 71 times faster with 2 DSPs and 3 DSPs, respectively, than the IBM-PC (486DX2-66MHz). Its effectiveness is demonstrated by developing a high-speed OCR (Optical Character Recognizer).

A Study on the Fractal Attractor Creation and Analysis of the Printed Korean Characters

  • Shon, Young-Woo
    • Journal of information and communication convergence engineering / v.1 no.1 / pp.53-57 / 2003
  • Chaos theory is the study of the irregular, unpredictable behavior of deterministic, non-linear dynamical systems. Interpretation using chaos lets us evaluate, from a time series, the characteristics that exist in the state space of a system, so understanding and extracting those chaotic characteristics enables high-precision interpretation. This paper therefore proposes a new method that applies chaos theory to extracting character features and recognizing characters. First, mesh, projection, and cross distance features are extracted from the input character images and converted into time series data. Then, using the modified Henon system suggested in this paper, the final features of the character image are obtained by calculating the box-counting dimension, natural measure, information bit, and information dimension, which represent the fractal dimension. Finally, character recognition is performed by statistically finding the information bit that shows the minimum difference from the normalized pattern database. Experimental results show a 99% character classification rate for 2,350 Korean characters (Hangul) using the proposed method.
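
Of the fractal measures listed, the box-counting dimension is the simplest to illustrate. The sketch below estimates it for a set of 2D points as the least-squares slope of log N(eps) against log(1/eps), where N(eps) is the number of boxes of side eps that contain at least one point. The modified Henon system that generates the attractor is not specified in the abstract and is not reproduced; the function names and the test point set are assumptions.

```python
import math


def box_count(points, eps):
    """Number of eps-sized grid boxes occupied by at least one point."""
    return len({(math.floor(x / eps), math.floor(y / eps)) for x, y in points})


def box_counting_dimension(points, scales):
    """Least-squares slope of log N(eps) versus log(1 / eps)."""
    xs = [math.log(1 / e) for e in scales]
    ys = [math.log(box_count(points, e)) for e in scales]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Sanity check on a straight line segment, whose dimension should be 1.
segment = [(i / 1024, 0.0) for i in range(1024)]
scales = [1 / 4, 1 / 8, 1 / 16, 1 / 32]
dimension = box_counting_dimension(segment, scales)
```

For an attractor built from character features, the same estimate would give a non-integer value that can serve as one component of the feature vector.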