Hybrid Approach of Texture and Connected Component Methods for Text Extraction in Complex Images

  • Keechul Jung (School of Media, College of Information Science, Soongsil University)
  • Published: 2004.11.01


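This paper proposes a method that combines texture-based and connected component-based approaches to extract text from complex color images. A multilayer perceptron (MLP) built through automatic learning uses bootstrap training to raise the recall rate on input images from varied environments, without a separate feature extraction step. The loss of precision that accompanies the higher recall is overcome by a connected component method that uses NMF (non-negative matrix factorization). For input images in which text occupies only a small proportion, a region marking method based on the CAMShift algorithm is used in an attempt to resolve the slowdown caused by combining the two methods. Combining the texture and connected component methods in this way yielded a robust and efficient system.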

We present a hybrid approach that combines texture-based and connected component (CC)-based methods for text extraction in complex images. These two methods, the primary ones used in this area, are applied sequentially so that each compensates for the other's weak points. An automatically constructed MLP-based texture classifier increases recall rates for complex images with a small amount of user intervention and without explicit feature extraction. CC-based filtering that uses shape information obtained with NMF enhances the precision rate without affecting overall performance. As a result, the combination of texture and CC-based methods leads to text extraction that is both robust and efficient. We also improve processing speed by adopting an appropriate region marking method for each input image category.
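The abstract names NMF as the source of shape information for CC-based filtering but does not describe how the factorization is computed. As a minimal sketch, assuming the standard multiplicative update rules of Lee and Seung for the squared-error objective, a non-negative matrix V can be approximated by W H as follows. The function name `nmf`, its parameters, and the layout of V (one flattened candidate region per column) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (n x m) as W @ H.

    Hypothetical helper: uses Lee-Seung multiplicative updates for the
    Frobenius (squared-error) objective. W is n x rank, H is rank x m.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialisation keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates preserve non-negativity of W and H.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

if __name__ == "__main__":
    # Toy data (an assumption): 16-pixel "regions", 10 samples as columns.
    V = np.random.default_rng(1).random((16, 10))
    W, H = nmf(V, rank=3)
    print("reconstruction error:", np.linalg.norm(V - W @ H))
```

In such a setup the columns of W would act as non-negative basis parts and the columns of H as per-region encodings, which a later filter could use to accept or reject candidate components; how the paper actually uses the factors is not stated in the abstract.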


