Automatic Title Detection by Spatial Feature and Projection Profile for Document Images  

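Park, Hyo-Jin (Department of Computer Engineering, Yeungnam University)
Kim, Bo-Ram (Department of Computer Engineering, Yeungnam University)
Kim, Wook-Hyun (Department of Computer Engineering, Yeungnam University)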
Publication Information
Journal of the Institute of Convergence Signal Processing / v.11, no.3, 2010, pp. 209-214
Abstract
This paper proposes an algorithm for segmenting document images and detecting their title regions. The method consists of two phases: segmentation and title area detection. In the first phase, the document image is segmented: its binary map is partitioned by combining morphological operations with connected component analysis (CCA). The resulting regions serve as candidate title areas for the second phase. In the second phase, candidate title areas are selected using geometric information, and non-title regions are removed. After a classification step discards non-text regions, horizontal projection is performed within the remaining text areas to locate the title, based on the observation that the title usually uses the largest font in a document. The method thus detects titles in various forms of document images using geometric features and projection profile analysis. The proposed system is expected to be applicable to document title recognition, multimedia data retrieval, real-time image processing, and similar tasks.
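The horizontal projection step can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a text region is already available as a binary map (1 = ink, 0 = background), and it simplifies the largest-font observation to "the tallest text line is the title candidate". The function names and the toy region are hypothetical.

```python
from typing import List, Optional, Tuple


def horizontal_profile(binary: List[List[int]]) -> List[int]:
    """Count foreground (1) pixels in each row of a binary map."""
    return [sum(row) for row in binary]


def text_lines(profile: List[int], min_pixels: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, bottom exclusive, of consecutive
    rows whose pixel count is at least min_pixels."""
    lines = []
    start = None
    for y, count in enumerate(profile):
        if count >= min_pixels and start is None:
            start = y                      # a text line begins
        elif count < min_pixels and start is not None:
            lines.append((start, y))       # a text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


def tallest_line(lines: List[Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Pick the tallest line as the title candidate (simplifying assumption)."""
    if not lines:
        return None
    return max(lines, key=lambda r: r[1] - r[0])


# Toy text region: one 4-row "title" line followed by two 2-row body lines.
region = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0],
]

profile = horizontal_profile(region)
lines = text_lines(profile)
title = tallest_line(lines)
print(title)
```

On real scans, a threshold above 1 for `min_pixels` would likely be needed to tolerate noise, and line height alone may not separate a title from large non-title text. That is presumably why the abstract first filters candidates by geometric information.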
Keywords
Document image analysis; Document image segmentation; Connected component analysis; Detection of title region; Projection profile analysis;