
The Extraction of Table Lines and Data in Document Image  

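Jang, Dae-Geun (Korean Intellectual Property Office, Electrical and Electronic Examination Bureau)
Kim, Eui-Jeong (Kongju National University, Department of Computer Education)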
Abstract
To classify table regions in a document image and analyze their structure, the lines and data that make up each table must first be extracted. Exact extraction is difficult, however, because lines are broken or change in length, and because characters or noise merge with the table lines. These problems arise from errors in the image input device or from image reduction. In this paper, we propose a method of extracting lines and data for table region classification and structure analysis that performs better than previous methods, including commercial software. The proposed method extracts the horizontal and vertical lines that make up the table using a one-dimensional median filter. While extracting lines along the filtering direction, this filter not only eliminates noise attached to a line and lines orthogonal to the filtering direction, but also reconnects broken lines whose gap is shorter than the length of the filter tap. Furthermore, text attached to a line is separated in the process of extracting vertical lines.
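The filtering idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `median_filter_1d`, the tap length of 7, and the toy image are all assumptions made for illustration. In this plain median formulation on a binary image, a gap is bridged only while ink pixels remain the majority inside the window, so the bridging limit may differ from that of the filter used in the paper.

```python
import numpy as np

def median_filter_1d(binary, taps, axis):
    """1-D median filter of odd length `taps` along `axis` of a 0/1 image.

    Hypothetical helper for illustration only. For binary data the median
    of a window is 1 exactly when more than half of its pixels are 1.
    """
    if taps % 2 == 0:
        raise ValueError("taps must be odd")
    half = taps // 2
    # Zero-pad only along the filtering axis so every pixel has a full window.
    pad = [(0, 0), (0, 0)]
    pad[axis] = (half, half)
    padded = np.pad(binary.astype(np.int32), pad, mode="constant")
    # A cumulative sum with a leading zero gives window sums by subtraction.
    csum = np.cumsum(padded, axis=axis)
    zero_shape = list(csum.shape)
    zero_shape[axis] = 1
    csum = np.concatenate([np.zeros(zero_shape, dtype=csum.dtype), csum], axis=axis)
    n = binary.shape[axis]
    upper = np.take(csum, np.arange(taps, taps + n), axis=axis)
    lower = np.take(csum, np.arange(0, n), axis=axis)
    counts = upper - lower          # ink pixels in each window
    return (counts > half).astype(np.uint8)

# Toy image (assumed data): 1 = ink, 0 = background.
img = np.zeros((9, 21), dtype=np.uint8)
img[4, 1:20] = 1    # horizontal table line
img[4, 9:11] = 0    # two-pixel break in that line
img[:, 15] = 1      # one-pixel-wide vertical line crossing it
img[3, 5] = 1       # noise speck touching the horizontal line

# Filtering along rows keeps horizontal lines; along columns, vertical lines.
horizontal = median_filter_1d(img, taps=7, axis=1)
vertical = median_filter_1d(img, taps=7, axis=0)
```

In this toy case, `horizontal` retains only the horizontal line with its break closed, while the vertical line and the noise speck are suppressed. `vertical` retains only the vertical line. A longer tap suppresses thicker orthogonal strokes and bridges wider breaks, at the cost of discarding shorter lines.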
Keywords
character recognition; distortion correction of document image; image processing