
The Geometric Layout Analysis of the Document Image Using Connected Components Method and Median Filter  

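Jang, Dae-Geun (Data Communication Systems Laboratory, School of Electronics, Electrical Engineering and Computer Science, Kyungpook National University)
Hwang, Chan-Sik (Data Communication Systems Laboratory, School of Electronics, Electrical Engineering and Computer Science, Kyungpook National University)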
Abstract
For paper documents to be converted automatically into electronic documents, the document image must first be classified into detailed regions such as text, pictures, and tables through geometric layout analysis. However, the complexity of document layouts and the variety in the size and density of pictures make the geometric layout of document images difficult to analyze. In this paper, we propose a method that outperforms commercial software and previous methods in region segmentation and classification and in line extraction from table regions. The proposed method segments the document into detailed regions using the connected components method, even when the layout is complex. It also classifies text and pictures using a separable median filter, even when their size and density are diverse. In addition, it extracts lines from tables by applying a one-dimensional median filter in the horizontal and vertical directions, even when the lines are deformed or text is attached to them.
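
The abstract does not give filter lengths or implementation details, so the following is only a minimal sketch of the directional median filtering idea, not the authors' implementation. It assumes a binarized image stored as a NumPy array with 1 for ink and 0 for background; the function names `median_1d` and `extract_table_lines` and the default window length `k=9` are hypothetical choices for illustration. A one-dimensional median filter run along rows keeps long horizontal runs and suppresses short text strokes, and the same filter run along columns does this for vertical lines.

```python
import numpy as np


def median_1d(binary, k, axis):
    """Apply a 1-D median filter of odd length k along `axis` of a 2-D 0/1 array."""
    if k % 2 == 0:
        raise ValueError("k must be odd")
    pad = [(0, 0), (0, 0)]
    pad[axis] = (k // 2, k // 2)
    # Pad with background (0) so the output has the same shape as the input.
    padded = np.pad(binary, pad, mode="constant")
    windows = np.lib.stride_tricks.sliding_window_view(padded, k, axis=axis)
    # For 0/1 data the median of an odd-length window is a majority vote.
    return (windows.sum(axis=-1) > k // 2).astype(np.uint8)


def extract_table_lines(binary, k=9):
    """Return (horizontal, vertical) line masks (assumed, illustrative helper)."""
    horizontal = median_1d(binary, k, axis=1)  # filter along each row
    vertical = median_1d(binary, k, axis=0)    # filter along each column
    return horizontal, vertical
```

In this sketch the window length controls the trade-off: it should be longer than twice the widest character stroke so that text attached to a line is voted out, yet short enough that slightly deformed or broken lines still win the majority inside the window.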