A Generalized Method for Extracting Characters and Video Captions

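  • Byung Tae Chun (Image Processing Research Department, Computer and Software Technology Laboratory, Electronics and Telecommunications Research Institute)
  • Younglae Bae (Image Processing Research Department, Computer and Software Technology Laboratory, Electronics and Telecommunications Research Institute)
  • Tai-Yun Kim (Department of Computer Science, Korea University)
  • Published: 2000.06.15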

Abstract

Conventional character region extraction methods apply techniques such as color reduction, region split and merge, and texture analysis to the whole image. Because these methods rely on many heuristic variables and on threshold values set from a priori knowledge of the characters to be extracted, they are difficult to generalize. In this paper, we propose a generalized character region extraction method that uses topographical feature point extraction and a point-line-region extension method, minimizing the use of heuristic variables and generalizing the threshold values. Experimental results show that, with generalized variables and threshold values, character regions can be extracted without a priori knowledge of the characters. For video images, the candidate region extraction rate was 100%, and the caption region extraction rate after verification was over 98%.
