Document Structure Understanding on Subjects Registration Table

  • Ito, Yuichi (Department of Electrical and Electronic Engineering, Faculty of Engineering, Mie University) ;
  • Ohno, Masanaga (Department of Electrical and Electronic Engineering, Faculty of Engineering, Mie University) ;
  • Tsuruoka, Shinji (Department of Electrical and Electronic Engineering, Faculty of Engineering, Mie University) ;
  • Yoshikawa, Tomohiro (Department of Electrical and Electronic Engineering, Faculty of Engineering, Mie University) ;
  • Tsuyoshi, Shinogi (Department of Electrical and Electronic Engineering, Faculty of Engineering, Mie University)
  • Published: 2003.09.01

Abstract

This research aims to automate the process of generating a database from paper-based table forms. Subject registration tables have complicated table structures, so we use them as an example for general table structure understanding. We propose a table structure understanding system for several table types, which works in several steps. In the first step, the paper documents are read with an image scanner. In the second step, each document image is segmented into individual tables. In the third step, the character strings are extracted using image processing technology, and the property of each character string is determined. Finally, the structured database is generated automatically. The proposed system consists of two subsystems. The "Master document generation system" is used for the table form definition; its input does not include handwritten characters. The "Structure analysis system for completed table" is used for the written form; it analyzes a table form that has been filled in with handwritten characters. We implemented the system using MS Visual C++ on Windows, and it achieved a correct extraction rate of 98% on 51 registration tables filled in by different students.
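
As a rough illustration of the second step (segmenting a document image into tables), the following Python sketch groups connected ink pixels of a binarized image and returns one bounding box per group. The abstract does not state how the authors perform segmentation, so the connected-component approach, the function name `segment_tables`, and the toy image `PAGE` are all assumptions made for illustration; the original system was written in MS Visual C++.

```python
from collections import deque


def segment_tables(image):
    """Return bounding boxes (top, left, bottom, right) of 8-connected ink regions.

    `image` is a list of rows, where 1 marks an ink pixel and 0 the background.
    Hypothetical helper: it treats every connected ink region as one table
    candidate, which is an assumption and not the authors' published method.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy binarized page (assumed data): two separate ruled frames.
PAGE = [
    [1, 1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1, 1],
]

if __name__ == "__main__":
    print(segment_tables(PAGE))
```

On a real scan, text inside a cell would form its own small components, so a practical version would likely filter regions by size or keep only those containing long ruled lines before treating them as tables.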

Keywords