A Knowledge-based System for Analyzing Sophisticated Geometric Structure of Document Images

문서 영상의 정교한 기하적 구조분석을 위한 지식베이스 시스템

  • Published: 2001.11.01

Abstract

Creating an electronic document from the logical components extracted from a document image requires sophisticated geometric structure analysis as a prerequisite. This paper presents a knowledge-based method for sophisticated geometric structure analysis of technical journal pages. The proposed knowledge base encodes, in the form of rules, geometric characteristics that are common to technical journals as well as characteristics specific to a particular publication. The method is a hybrid of top-down and bottom-up techniques and consists of two phases: region segmentation and identification. In general, the regions produced by segmentation do not correspond one-to-one to the composite layout components. The proposed method therefore splits or groups the segmented regions into composite layout components, identifying non-text objects such as images, drawings, and tables as well as text objects such as text lines and equations. In experiments on 372 images scanned from the IEEE Transactions on Pattern Analysis and Machine Intelligence, the proposed method analyzed the geometric structure successfully for more than 99% of the test images, showing a more sophisticated level of performance than previous methods.
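The grouping half of the identification phase can be illustrated with a minimal sketch. It is not the authors' rule base: the `Region` type, the `should_group` rule, and the `max_gap` threshold are hypothetical, chosen only to show how a geometric rule might merge over-segmented regions of the same type into one composite layout component. Splitting would be the inverse operation and is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Axis-aligned bounding box of a segmented region, in pixels."""
    left: int
    top: int
    right: int
    bottom: int
    label: str = "unknown"


def should_group(a: Region, b: Region, max_gap: int) -> bool:
    """Hypothetical rule: group two regions with the same label if they
    overlap horizontally and their vertical gap is at most max_gap."""
    same_label = a.label == b.label
    overlaps_horizontally = a.left < b.right and b.left < a.right
    # Negative when the boxes overlap vertically.
    vertical_gap = max(a.top, b.top) - min(a.bottom, b.bottom)
    return same_label and overlaps_horizontally and vertical_gap <= max_gap


def merge(a: Region, b: Region) -> Region:
    """Return the smallest box that encloses both regions."""
    return Region(min(a.left, b.left), min(a.top, b.top),
                  max(a.right, b.right), max(a.bottom, b.bottom), a.label)


def group_regions(regions: List[Region], max_gap: int = 10) -> List[Region]:
    """Bottom-up pass: merge pairs that satisfy the rule until none remain."""
    result = list(regions)
    merged = True
    while merged:
        merged = False
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                if should_group(result[i], result[j], max_gap):
                    combined = merge(result[i], result[j])
                    result = [r for k, r in enumerate(result) if k not in (i, j)]
                    result.append(combined)
                    merged = True
                    break
            if merged:
                break
    return result


if __name__ == "__main__":
    segmented = [
        Region(0, 0, 100, 20, "text"),      # two fragments of one text block
        Region(0, 25, 100, 45, "text"),
        Region(0, 200, 100, 300, "image"),  # separate non-text object
    ]
    for region in group_regions(segmented, max_gap=10):
        print(region)
```

In a knowledge-based design of the kind the abstract describes, thresholds such as `max_gap` would presumably come from publication-specific rules rather than being fixed in code.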
