A Two-Stage Document Page Segmentation Method using Morphological Distance Map and RBF Network

A Document Layout Segmentation Method Applying a Two-Stage Algorithm of a Distance Map Function and an RBF Network

  • 신현경 (Department of Mathematics and Information Science, Kyungwon University)
  • Published: 2008.09.15

Abstract

We propose a two-stage document layout segmentation method. In the first stage, a top-down segmentation step, a morphological distance map algorithm extracts a collection of rectangular regions from the input image. This preliminary result serves as the input to the second stage. In the second stage, a machine-learning algorithm is applied: we select an RBF network, a neural network based on a statistical model. To construct the hidden layer of the RBF network, we use a data clustering technique based on the self-organizing property of the Kohonen network. We present results showing that the supervised neural network, trained on 300 samples, improves the preliminary results of the first stage.
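The first stage can be illustrated with a minimal sketch. The abstract does not give the details of the distance map algorithm, so everything below is an assumption made for illustration: a city-block distance, a hypothetical `gap` threshold, 4-connectivity, and the function names `distance_map` and `extract_regions`. The idea shown is that pixels lying within `gap` of ink are grouped into blobs, and each blob yields one rectangle bounding its ink pixels.

```python
from collections import deque

def distance_map(img):
    """City-block distance from every pixel to the nearest ink pixel (value 1)."""
    h, w = len(img), len(img[0])
    inf = h + w  # larger than any possible city-block distance in the image
    d = [[0 if img[y][x] else inf for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if y > 0:
                d[y][x] = min(d[y][x], d[y - 1][x] + 1)
            if x > 0:
                d[y][x] = min(d[y][x], d[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                d[y][x] = min(d[y][x], d[y + 1][x] + 1)
            if x < w - 1:
                d[y][x] = min(d[y][x], d[y][x + 1] + 1)
    return d

def extract_regions(img, gap=1):
    """Return ink bounding boxes (top, left, bottom, right), inclusive,
    one per 4-connected blob of pixels whose distance to ink is <= gap."""
    h, w = len(img), len(img[0])
    d = distance_map(img)
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if seen[y0][x0] or d[y0][x0] > gap:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            top, left, bottom, right = h, w, -1, -1
            while queue:
                y, x = queue.popleft()
                if img[y][x]:  # only ink pixels define the rectangle
                    top, left = min(top, y), min(left, x)
                    bottom, right = max(bottom, y), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] and d[ny][nx] <= gap:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if bottom >= 0:
                boxes.append((top, left, bottom, right))
    return boxes

# Toy binary page: two blocks at the top, one broken text line at the bottom.
page = [
    [1, 1, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 1, 1],
]
boxes = extract_regions(page, gap=1)
```

On this toy page, `gap=1` merges the three fragments of the bottom line into one rectangle while keeping the two top blocks apart; `gap=0` reduces to plain connected components of ink. A larger `gap` trades finer regions for coarser ones, which is the kind of preliminary result a second stage could then refine.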

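This paper proposes a two-stage document layout segmentation method. The first stage is a top-down region extraction that uses a morphology-based distance function to divide the given image data into rectangular regions. The preliminary result obtained from the distance map function serves as the input variables of the second stage, which aims to improve performance. The second stage applies machine learning theory. We selected an RBF neural network, which follows a statistical model, and designed its hidden layer with a data clustering technique that exploits the self-organizing property of the Kohonen network. As the research result, we show that a neural network trained on region data extracted from 300 images improves the preliminary result derived in the first stage.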
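The second stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the two-dimensional region features, the 0/1 labels (reject or keep a first-stage rectangle), the one-dimensional Kohonen chain, the shared Gaussian width heuristic, and the LMS update for the output weights are all hypothetical choices that the abstract does not specify. The sketch shows hidden-layer centers being placed by self-organizing clustering and then a supervised output layer being trained on top of them.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def kohonen_centers(data, n_units, epochs=40, lr0=0.5):
    """1-D Kohonen map: the winning unit and its chain neighbours move toward each sample."""
    # Deterministic start: units are copies of evenly spaced samples.
    units = [list(data[j * len(data) // n_units]) for j in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs                 # decays from 1 toward 0
        lr = lr0 * frac                             # shrinking learning rate
        radius = int(round((n_units // 2) * frac))  # shrinking neighbourhood
        for x in data:
            win = min(range(n_units), key=lambda j: sq_dist(x, units[j]))
            for j in range(max(0, win - radius), min(n_units, win + radius + 1)):
                pull = lr if j == win else 0.5 * lr  # neighbours move half as far
                units[j] = [u + pull * (v - u) for u, v in zip(units[j], x)]
    return units

def hidden(x, centers, sigma):
    """Gaussian radial basis activations of the hidden layer."""
    return [math.exp(-sq_dist(x, c) / (2.0 * sigma ** 2)) for c in centers]

def train_rbf(data, labels, n_units=2, epochs=200, lr=0.2):
    """Place centers with the Kohonen map, then fit output weights by LMS."""
    centers = kohonen_centers(data, n_units)
    d_max = max(math.sqrt(sq_dist(a, b)) for a in centers for b in centers)
    sigma = max(d_max, 1e-9) / math.sqrt(2 * n_units)  # shared width heuristic
    weights = [0.0] * n_units
    for _ in range(epochs):
        for x, y in zip(data, labels):
            phi = hidden(x, centers, sigma)
            err = y - sum(w * p for w, p in zip(weights, phi))
            weights = [w + lr * err * p for w, p in zip(weights, phi)]
    return centers, sigma, weights

def predict(x, model):
    """Threshold the linear output at 0.5 to obtain a 0/1 decision."""
    centers, sigma, weights = model
    out = sum(w * p for w, p in zip(weights, hidden(x, centers, sigma)))
    return 1 if out >= 0.5 else 0

# Hypothetical region features, e.g. (normalized height, ink density).
samples = [(0.10, 0.20), (0.15, 0.10), (0.05, 0.15), (0.20, 0.05),
           (0.90, 0.80), (0.85, 0.95), (0.95, 0.90), (0.80, 0.85)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # hypothetical: 0 = reject, 1 = keep
model = train_rbf(samples, labels)
predictions = [predict(x, model) for x in samples]
```

Separating the two steps keeps the unsupervised part (where the basis functions sit) independent of the labels, so only the small linear output layer needs supervised training.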
