http://dx.doi.org/10.5392/IJoC.2015.11.4.077

Table Detection from Document Image using Vertical Arrangement of Text Blocks  

Tran, Dieu Ni (School of Electronics and Computer Engineering, Chonnam National University)
Tran, Tuan Anh (School of Electronics and Computer Engineering, Chonnam National University)
Oh, Aran (School of Electronics and Computer Engineering, Chonnam National University)
Kim, Soo Hyung (School of Electronics and Computer Engineering, Chonnam National University)
Na, In Seop (School of Electronics and Computer Engineering, Chonnam National University)
Abstract
Table detection is a challenging problem that plays an important role in document layout analysis. In this paper, we propose an effective method for identifying table regions in document images. First, regions of interest (ROIs) are identified as table candidates. In each ROI, we locate text components and extract text blocks. We then check whether the text blocks are arranged horizontally or vertically, and compare the height of each text block with the average height. If the text blocks satisfy a series of rules, the ROI is regarded as a table. Experiments on the ICDAR 2013 dataset yield encouraging results, which supports the effectiveness of the proposed method.
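The rule-based check described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual rules or thresholds, so everything below is an assumption made for illustration: the `Block` type, the helper names `group_into_columns` and `looks_like_table`, the overlap-based column grouping, and the parameters `min_overlap`, `min_cols`, `min_rows`, and `height_tol` are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Block:
    """Axis-aligned bounding box of one text block (hypothetical type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def group_into_columns(blocks, min_overlap=0.5):
    """Group blocks into columns by horizontal overlap with a column's first block.

    min_overlap is an assumed threshold: the shared horizontal extent must
    cover at least this fraction of the narrower of the two blocks.
    """
    columns = []
    for b in sorted(blocks, key=lambda blk: (blk.x, blk.y)):
        for col in columns:
            ref = col[0]
            overlap = min(b.x + b.w, ref.x + ref.w) - max(b.x, ref.x)
            if overlap >= min_overlap * min(b.w, ref.w):
                col.append(b)
                break
        else:
            # No existing column overlaps enough, so start a new one.
            columns.append([b])
    return columns


def looks_like_table(blocks, min_cols=2, min_rows=2, height_tol=0.5):
    """Return True if the blocks of one ROI pass two assumed rules."""
    if not blocks:
        return False
    avg_h = mean(b.h for b in blocks)
    # Assumed rule 1: every block height stays close to the average height.
    if any(abs(b.h - avg_h) > height_tol * avg_h for b in blocks):
        return False
    # Assumed rule 2: enough columns contain vertically stacked blocks.
    stacked = [c for c in group_into_columns(blocks) if len(c) >= min_rows]
    return len(stacked) >= min_cols


# Usage: a 3x3 grid of cells versus three full-width paragraph lines.
grid = [Block(10 + 100 * c, 10 + 30 * r, 60, 20) for r in range(3) for c in range(3)]
para = [Block(10, 10 + 30 * r, 300, 20) for r in range(3)]
print(looks_like_table(grid), looks_like_table(para))
```

In this sketch a grid of similarly sized cells forms several vertically stacked columns and is accepted, while full-width paragraph lines collapse into a single column and are rejected. A real implementation would first need the ROI and text-block extraction steps, which are not sketched here.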
Keywords
Table Detection; Text Block; Expanding ROI; Vertical Arrangement