Journal of the Korean Institute of Telematics and Electronics C (전자공학회논문지C)
- Volume 35C Issue 2
- /
- Pages.65-70
- /
- 1998
- /
- 1226-5853(pISSN)
A postprocessing method for korean optical character recognition using eojeol information
어절 정보를 이용한 한국어 문자 인식 후처리 기법
Abstract
In this paper, we will to check and to correct mis-recognized word using Eojeol information. First, we divided into 16 classes that constituents in a Eojeol after we analyzed Korean statement into Eojeol units. Eojeol-Constituent state diagram constructed these constitutents, find the Left-Right Connectivity Information. As analogized the speech of connectivity information, reduced the number of cadidate words and restricted case of morphological analysis for mis-recognition Eojeol. Then, we improved correction speed uisng heuristic information as the adjacency information for Eojeol each other. In the correction phase, construct Reverse-Order Word Dictionary. Using this, we can trace word dictionary regardless of mis-recongnition word position. Its results show that improvement of recognition rate from 97.03% to 98.02% and check rate, reduction of chadidata words and morpholgical analysis cases.
Keywords