Search | Korea Science

Choi, Minseok;Lee, Choong-Ho
- Journal of the Institute of Convergence Signal Processing / v.14 no.2 / pp.90-94 / 2013
Research on the digitization of Manchu characters is at an early stage. This paper proposes a preprocessing algorithm for Manchu character recognition. The algorithm improves the existing Hilditch thinning algorithm so that it corrects thinning errors for Manchu characters. The existing algorithm separates characters into left-hand and right-hand sides, whereas our algorithm uses the central point between the points where strokes exist when it classifies each character. The experimental results show that this method is valid for thinning and classifying Manchu characters.
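The abstract does not spell out how the central point is computed, so the following is only a minimal sketch of one possible reading: for each image row, find the runs of stroke pixels and take the midpoint column of each run. The function names `stroke_runs` and `row_centers` and the toy image are hypothetical, and this is not the modified Hilditch algorithm itself.

```python
def stroke_runs(row):
    """Return inclusive (start, end) column pairs for each run of stroke pixels."""
    runs = []
    start = None
    for x, value in enumerate(row):
        if value and start is None:
            start = x                      # a run of stroke pixels begins
        elif not value and start is not None:
            runs.append((start, x - 1))    # the run ended on the previous column
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))  # run reaches the right edge
    return runs


def row_centers(image):
    """For each row, return the midpoint column of every stroke run (assumed reading)."""
    return [[(s + e) // 2 for s, e in stroke_runs(row)] for row in image]


# Toy binary image: 1 = stroke pixel, 0 = background (illustrative data only).
image = [
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
centers = row_centers(image)
print(centers)
```

On this toy image the thick vertical stroke collapses to column 2 in every row where it appears, which is the kind of one-pixel centerline a thinning step aims for.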

Aaron Daniel Snowberger;Choong Ho Lee
- Journal of information and communication convergence engineering / v.22 no.1 / pp.80-87 / 2024
The Manchu language holds historical significance, but a complete dataset of Manchu script letters for training optical character recognition machine-learning models is currently unavailable. Therefore, this paper describes the process of creating a robust dataset of extracted Manchu script letters. Rather than performing automatic letter segmentation based on whitespace or the thickness of the central word stem, an image of the Manchu script was manually inspected, and one copy of the desired letter was selected as a region of interest. This region of interest was then used as a template to match all other occurrences of the same letter within the Manchu script image. Although the dataset in this study contained only 4,000 images of five Manchu script letters, these letters were collected from twenty-eight writing styles. A full dataset of Manchu letters is expected to be obtainable through this process. The collected dataset was normalized and used to train a simple convolutional neural network to verify its effectiveness.
https://doi.org/10.56977/jicce.2024.22.1.80
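The template-matching step described in this abstract can be illustrated with a small, dependency-free sketch. It slides a manually chosen region of interest over a binary page image and keeps every position whose sum of squared differences (SSD) is within a tolerance. The names `ssd`, `find_matches`, and `max_ssd`, and the toy data, are assumptions for illustration; the abstract does not say which matching score or library the authors used, and a real pipeline would more likely rely on an optimized routine such as OpenCV's `cv2.matchTemplate`.

```python
def ssd(image, template, top, left):
    """Sum of squared differences between the template and one image window."""
    total = 0
    for dy, template_row in enumerate(template):
        for dx, t in enumerate(template_row):
            diff = image[top + dy][left + dx] - t
            total += diff * diff
    return total


def find_matches(image, template, max_ssd=0):
    """Return (top, left) of every window whose SSD is at most max_ssd."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    hits = []
    for top in range(ih - th + 1):
        for left in range(iw - tw + 1):
            if ssd(image, template, top, left) <= max_ssd:
                hits.append((top, left))
    return hits


# Toy page with two copies of the same 2x2 "letter" (illustrative data only).
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
# Manually selected region of interest: rows 1-2, columns 1-2.
template = [row[1:3] for row in page[1:3]]
matches = find_matches(page, template)
print(matches)
```

Raising `max_ssd` above zero would tolerate small differences between handwriting samples, at the cost of more false matches.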

Snowberger, Aaron Daniel;Lee, Choong Ho
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.617-619 / 2021
Since Manchu characters are written vertically and are connected without spaces within a word, a preprocessing step is required to separate the character area and the units that make up the characters before recognition. In this paper, we describe a preprocessing method that extracts the character area and cuts it into character units. Unlike existing research, which presupposes recognizing each word or character as a unit, or recognizing the part that remains after removing the stem of a connected word, this method cuts the characters into individually recognizable units. It can be applied to approaches that recognize letters by combining these units. An experiment verified the effectiveness of this method.
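The abstract does not describe how the cuts are placed, so the sketch below shows only one plausible approach under stated assumptions: crop the character area with a bounding box, then walk down the vertical word and start a new unit wherever a row holds more ink than a bare connecting stroke. The threshold `stem_width` and the helper names `bounding_box`, `crop`, and `cut_units` are hypothetical.

```python
def bounding_box(image):
    """Return (top, bottom, left, right) of all stroke pixels, or None if empty."""
    rows = [y for y, row in enumerate(image) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop(image, box):
    """Cut the image down to the inclusive bounding box."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def cut_units(word, stem_width=1):
    """Group consecutive rows wider than stem_width into (start, end) units.

    Assumption: rows with at most stem_width stroke pixels are connecting strokes.
    """
    units, start = [], None
    for y, row in enumerate(word):
        wide = sum(row) > stem_width
        if wide and start is None:
            start = y
        elif not wide and start is not None:
            units.append((start, y - 1))
            start = None
    if start is not None:
        units.append((start, len(word) - 1))
    return units


# Toy vertical word with two wide parts joined by a thin stroke (illustrative only).
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(image)
word = crop(image, box)
units = cut_units(word)
print(box, units)
```

Row indices in `units` are relative to the cropped word, so adding `box[0]` maps them back to the original image.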