http://dx.doi.org/10.5909/JBE.2008.13.4.544

A new approach for overlay text detection from complex video scene  

Kim, Won-Jun (Information and Communications University)
Kim, Chang-Ick (Information and Communications University)
Publication Information
Journal of Broadcast Engineering / v.13, no.4, 2008, pp. 544-553
Abstract
With the development of video editing technology, overlay text is increasingly inserted into video content to give viewers a better visual understanding. Because inserted text can represent the content of the scene or the editor's intention well, it is useful for video information retrieval and indexing. Most previous approaches are based on low-level features such as edge, color, and texture information. However, existing methods have difficulty handling text with varying contrast or text inserted into a complex background. In this paper, we propose a novel framework to localize overlay text in a video scene. Based on our observation that transient colors exist between inserted text and its adjacent background, a transition map is generated. Candidate regions are then extracted using the transition map, and overlay text is finally determined based on the density of state in each candidate. The proposed method is robust to the color, size, position, style, and contrast of overlay text, and it is also language-independent. Text region update between frames is exploited to reduce processing time. Experiments are performed on diverse videos to confirm the efficiency of the proposed method.
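The abstract gives no formulas, so the following is only a minimal sketch of the general idea (mark pixels where intensity changes sharply between text and background, then score a candidate region by how dense those marks are). The thresholding rule, the default values, and the names `transition_map`, `transition_density`, and `is_text_candidate` are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: the rules and names here are assumptions,
# not the authors' published formulation.

def transition_map(gray, threshold=40):
    """Mark pixels whose left and right neighbours differ strongly in intensity.

    gray is a list of rows of integer intensities (0-255).
    Border columns are left unmarked because they lack one neighbour.
    """
    height, width = len(gray), len(gray[0])
    tmap = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(1, width - 1):
            if abs(gray[y][x + 1] - gray[y][x - 1]) >= threshold:
                tmap[y][x] = 1
    return tmap


def transition_density(tmap, box):
    """Fraction of marked pixels inside box = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    area = (bottom - top) * (right - left)
    if area <= 0:
        return 0.0
    hits = sum(tmap[y][x] for y in range(top, bottom) for x in range(left, right))
    return hits / area


def is_text_candidate(tmap, box, min_density=0.3):
    """Keep a candidate region only if enough of its pixels are transition pixels."""
    return transition_density(tmap, box) >= min_density


if __name__ == "__main__":
    stroke = [10, 10, 200, 200, 10, 10, 200, 200, 10, 10]  # bright strokes on dark
    flat = [10] * 10                                       # uniform background
    tmap = transition_map([stroke, stroke, flat, flat])
    print(is_text_candidate(tmap, (0, 0, 2, 10)))  # stroke rows
    print(is_text_candidate(tmap, (2, 0, 4, 10)))  # flat rows
```

In this toy frame the rows containing stroke-like intensity jumps pass the density check, while the uniform rows do not. A real detector would also need the candidate-region extraction and frame-to-frame update steps that the abstract mentions but does not specify.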
Keywords
Overlay text; transition map; state of density