[KSCI] Korea Science Citation Index Service

Automatic Text Extraction from News Video using Morphology and Text Shape

Jang, In-Young (Dept. of Computer Science, Yonsei University)
Ko, Byoung-Chul (Dept. of Computer Science, Yonsei University)
Kim, Kil-Cheon (Dept. of Computer Science, Yonsei University)
Byun, Hye-Ran (Dept. of Computer Science, Yonsei University)

Publication Information

Journal of KIISE: Computing Practices and Letters / v.8, no.4, 2002, pp. 479-488

Abstract

In recent years, the amount of digital video in use has risen dramatically alongside the growth of the Internet, and an automated method is therefore needed for indexing digital video databases. Textual information appearing in a digital video, both superimposed text and embedded scene text, can be a crucial clue for video indexing. In this paper, a new method is presented to extract both superimposed and embedded scene text from a freeze-frame of news video. The algorithm consists of three steps. In the first step, a color image is converted into a gray-level image and contrast stretching is applied to enhance the contrast of the input image; a modified local adaptive thresholding is then applied to the contrast-stretched image. The second step is divided into three processes: isolating text-like (false-positive) components by applying erosion, dilation, and the (OpenClose+CloseOpen)/2 morphological operation; retaining both text and text-like components using the (OpenClose+CloseOpen)/2 operation with a new Geo-correction method; and subtracting the two resulting images to eliminate the false-positive components. In the third step, filtering, the characteristics of each component are used, such as the ratio of the number of pixels in each candidate component to the number of its boundary pixels and the ratio of the minor to the major axis of each bounding box. The proposed method was tested on 300 news images and achieved a recognition rate of 93.6%. The proposed method also performs well on various other kinds of images when the size of the structuring element is adjusted.
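The contrast stretching of the first step and the (OpenClose+CloseOpen)/2 operation of the second step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a flat square structuring element, gray-level images stored as lists of rows, windows clipped at the image border, and reads "OpenClose" as opening followed by closing (and "CloseOpen" as the reverse). All function names are hypothetical, and the modified thresholding and Geo-correction steps are not covered because the abstract does not specify them.

```python
def contrast_stretch(img, lo=0, hi=255):
    """Linearly map the darkest gray level to lo and the brightest to hi."""
    flat = [p for row in img for p in row]
    mn, mx = min(flat), max(flat)
    if mx == mn:  # flat image: nothing to stretch
        return [row[:] for row in img]
    scale = (hi - lo) / (mx - mn)
    return [[round(lo + (p - mn) * scale) for p in row] for row in img]


def _window(img, r, c, k):
    """Gray levels in the k x k window centred on (r, c), clipped at the border."""
    half = k // 2
    rows = range(max(0, r - half), min(len(img), r + half + 1))
    cols = range(max(0, c - half), min(len(img[0]), c + half + 1))
    return [img[y][x] for y in rows for x in cols]


def erode(img, k=3):
    """Gray-level erosion: each pixel becomes the minimum of its window."""
    return [[min(_window(img, r, c, k)) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img, k=3):
    """Gray-level dilation: each pixel becomes the maximum of its window."""
    return [[max(_window(img, r, c, k)) for c in range(len(img[0]))]
            for r in range(len(img))]


def opening(img, k=3):
    """Erosion followed by dilation; removes small bright details."""
    return dilate(erode(img, k), k)


def closing(img, k=3):
    """Dilation followed by erosion; fills small dark gaps."""
    return erode(dilate(img, k), k)


def open_close_average(img, k=3):
    """(OpenClose + CloseOpen) / 2, under the reading assumed above."""
    oc = closing(opening(img, k), k)   # opening, then closing
    co = opening(closing(img, k), k)   # closing, then opening
    return [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(oc, co)]
```

With a 3 x 3 structuring element, an isolated bright pixel is removed while a 3 x 3 bright block away from the border is preserved, which is consistent with the abstract's remark that the size of the structuring element controls which components survive.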

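[Translation of the Korean abstract] With the recent growth in Internet use, demand for digital video has also been increasing rapidly, and automated tools for indexing digital video databases have become necessary. Text in digital video, including text artificially superimposed on the image and scene text that naturally appears in the background, can be an important clue for such video indexing. This paper proposes a new method for extracting news captions and scene text from still frames of news video. The proposed algorithm consists of three steps. In the first step, preprocessing, the input color image is converted to a gray-level image and histogram stretching is applied to improve image quality. A modified segmentation method based on adaptive threshold selection is then applied to segment the image. In the second step, morphological operations are applied to the adaptively binarized image to first produce an image in which false-positive components, which are not text but are easily mistaken for text, remain emphasized. In addition, morphological operations and the geometric correction (Geo-correction) filtering proposed in this paper are applied to the modified binarization result to produce an image in which both text and text-like components remain emphasized. Taking the difference of these two images yields a result in which mainly the desired text components remain and most of the text-like false components are removed. The morphological operations used to retain the false-positive regions are opening with a 3 $\times$ 3 structuring element and (OpenClose+CloseOpen)/2, and the operations used to retain text and text-like components are (OpenClose+CloseOpen)/2 and geometric correction. In the third step, verification, components judged to be non-text are removed by considering the ratio of the number of pixels in each candidate text region to the total number of image pixels, the ratio of the number of boundary pixels to the total number of pixels in each candidate text region, and the width-to-height ratio of each bounding rectangle. Experiments on 300 arbitrarily chosen domestic news images gave a text extraction rate of 93.6%. The proposed method was also confirmed to extract text well from other footage such as foreign news and movie video.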
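The shape measurements used in the third step can be sketched as follows. This is only an illustration under stated assumptions: components are found with 8-connectivity on a binary mask, a boundary pixel is taken to be a component pixel with at least one 4-neighbour outside the component, and the function and key names are hypothetical. The abstract does not give the acceptance thresholds, so none are applied here.

```python
from collections import deque


def connected_components(mask):
    """Return each 8-connected foreground component as a list of (row, col)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                comp.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(comp)
    return components


def shape_features(comp, mask):
    """Compute the three ratios named in the abstracts for one component."""
    pixels = set(comp)
    # Assumption: boundary = pixels with a 4-neighbour outside the component.
    boundary = sum(
        1 for (y, x) in comp
        if any((y + dy, x + dx) not in pixels
               for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)))
    )
    rows = [y for y, _ in comp]
    cols = [x for _, x in comp]
    box_h = max(rows) - min(rows) + 1
    box_w = max(cols) - min(cols) + 1
    return {
        "area_fraction": len(comp) / (len(mask) * len(mask[0])),
        "area_to_boundary": len(comp) / boundary,
        "axis_ratio": min(box_h, box_w) / max(box_h, box_w),
    }
```

For example, a solid 3 x 5 rectangle in a 6 x 10 mask has 15 pixels, 12 of them on the boundary, so its area-to-boundary ratio is 1.25 and its minor-to-major axis ratio is 0.6. A filtering rule would compare such values against thresholds chosen for the footage at hand.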

Keywords

Text extraction; Video indexing; Morphology

Citations & Related Records

Reference

1	Jae-Chang Shim, Chitra Dorai, Ruud Bolle, 'Automatic Text Extraction from Video for Content-Based Annotation and Retrieval,' Proceedings of the Fourteenth International Conference on Pattern Recognition, vol. 1, pp. 618-620, 16-20 Aug. 1998
2	Anil K. Jain, Bin Yu, 'Automatic text location in images and video frames,' Pattern Recognition, Vol. 31, No. 12, pp. 2055-2076, 1998
3	H. Kuwano, Y. Taniguchi, H. Arai, M. Mori, S. Kurakake, H. Kojima, 'Telop-on-demand: video structuring and retrieval based on text recognition,' 2000 IEEE International Conference on Multimedia and Expo (ICME 2000), vol. 2, pp. 759-762, 30 July-2 Aug. 2000
4	Sameer Antani, Ullas Gargi, David Crandall, Tarak Gandhi and Rangachar Kasturi, 'Extraction of Text in Video,' Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., Technical Report CSE-99-016, August 30, 1999
5	S. Antani, D. Crandall, R. Kasturi, 'Robust extraction of text in video,' Proceedings of the 15th International Conference on Pattern Recognition, vol. 1, pp. 831-834, 2000
6	S. Messelodi and C.M. Modena, 'Automatic identification and skew estimation of text lines in real scene images,' Pattern Recognition, Vol. 32 (5) (1999) pp. 791-810
7	U. Gargi, S. Antani, R. Kasturi, 'Indexing text events in digital video databases,' Proceedings of the Fourteenth International Conference on Pattern Recognition, vol. 1, pp. 916-918, 16-20 Aug. 1998
8	J. Ohya, A. Shio, S. Akamatsu, 'Recognizing characters in scene image,' IEEE Trans. Pattern Anal. Mach. Intell. PAMI-16 (2) (1994) 214-220
9	C. M. Lee, A. Kankanhalli, 'Automatic extraction of characters in complex images,' Int. J. Pattern Recognition Artificial Intell. (1) (1995) 67-82
10	H. K. Kim, 'Efficient automatic text location method and content-based indexing and structuring of video database,' J. Visual Commun. Image Representation 7 (4) (1996) 336-344
11	Y. Lu, 'Machine printed character segmentation - An overview,' Pattern Recognition, 28, 1995, 67-80
12	J. Serra, Image Analysis and Mathematical Morphology. New York: Academic, 1982
13	R. Lienhart, F. Stuber, 'Automatic text recognition in digital videos,' Image and Video Processing IV 1996, SPIE 2666-20, 1996
14	M. A. Smith, T. Kanade, 'Video skimming for quick browsing based on audio and image characterization,' Technical Report CMU-CS-95-186, Carnegie Mellon University, July 1995
15	Y. Zhong, K. Karu, A. K. Jain, 'Locating text in complex color images,' Pattern Recognition, 28 (10) (1995) 1523-1535
16	F. Lebourgeois, 'Robust Multifont OCR System from Gray Level Image,' in International Conference on Document Analysis and Recognition, vol. 1, pp. 1-5, 1997
17	Pyeoung-Kee Kim, 'Automatic Text Location in Complex Color Images using Local Color Quantization,' TENCON 99. Proceedings of the IEEE Region 10 Conference, Volume: 1, pp. 629-632, 1999
18	M. Bertini, C. Colombo, A. Del Bimbo, 'Automatic Caption Localization in Video using Salient Points,' IEEE Int. Conf. on Multimedia and Expo, pp. 69-72, 2001

Korean title: 형태학과 문자의 모양을 이용한 뉴스 비디오에서의 자동 문자 추출
