Construction of Printed Hangul Character Database PHD08 (한글 문자 데이터베이스 PHD08 구축)

  • Ham, Dae-Sung; Lee, Duk-Ryong; Jung, In-Suk; Oh, Il-Seok
    • The Journal of the Korea Contents Association / v.8 no.11 / pp.33-40 / 2008
  • The application of OCR is moving from traditional formatted documents to web documents and natural scene images. These new applications commonly use not only the standard Myungjo and Gothic fonts but also a wide variety of other fonts. Conventional databases, which have mainly been constructed with standard fonts, are of limited use for these new applications. In this paper, we generate 243 image samples for each of 2350 Hangul character classes, which differ in font size, quality, and resolution. Each sample was then further varied by binarization threshold and rotational transformation, yielding 2187 samples per character class. In total, 5,139,450 samples constitute the printed Hangul character database, called PHD08. In addition, we present the characteristics of the database and the recognition performance obtained with a commercial OCR software package.
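
The sample counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published figures; the constant and function names are illustrative and do not come from the paper. It derives the combined multiplier implied by the binarization and rotation step, without assuming how that multiplier is split between the two.

```python
# Figures quoted in the abstract (names are illustrative, not from the paper).
NUM_CLASSES = 2350             # Hangul character classes
BASE_SAMPLES_PER_CLASS = 243   # samples per class before threshold/rotation variation
SAMPLES_PER_CLASS = 2187       # samples per class after variation


def variation_factor(base: int, final: int) -> int:
    """Return the combined multiplier that turns `base` samples into `final`."""
    if final % base != 0:
        raise ValueError("final count is not a whole multiple of the base count")
    return final // base


def total_samples(num_classes: int, samples_per_class: int) -> int:
    """Return the size of the whole database."""
    return num_classes * samples_per_class


factor = variation_factor(BASE_SAMPLES_PER_CLASS, SAMPLES_PER_CLASS)
total = total_samples(NUM_CLASSES, SAMPLES_PER_CLASS)

print(f"variants per base sample: {factor}")
print(f"total samples: {total:,}")
```

The computed total matches the 5,139,450 samples stated in the abstract, and the binarization and rotation step together account for a ninefold increase per base sample.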