• Title/Summary/Keyword: caption extraction

Web Image Caption Extraction using Positional Relation and Lexical Similarity (위치적 연관성과 어휘적 유사성을 이용한 웹 이미지 캡션 추출)

  • Lee, Hyoung-Gyu;Kim, Min-Jeong;Hong, Gum-Won;Rim, Hae-Chang
    • Journal of KIISE: Software and Applications / v.36 no.4 / pp.335-345 / 2009
  • In this paper, we propose a new web image caption extraction method that considers the positional relation between a caption and an image and the lexical similarity between a caption and the main text containing the caption. The positional relation represents where the caption is located in terms of the distance and direction of the corresponding image. The lexical similarity indicates how likely the main text is to generate the caption of the image. Compared with previous image caption extraction approaches, which utilize only the independent features of images and captions, the proposed approach improves caption extraction recall and precision, with a 28% improvement in F-measure, by including the additional features of positional relation and lexical similarity.
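
To make the two feature groups concrete, here is a minimal Python sketch of how a candidate caption block could be scored against an image: the positional score uses distance and rough direction, the lexical score uses term-frequency cosine similarity, and the linear combination, weights, and thresholds are all illustrative assumptions rather than the authors' actual model.

```python
import math
from collections import Counter

def positional_score(image_box, caption_box):
    """Score how plausibly a text block is the caption of an image, based on
    its distance from the image and a rough direction check (captions usually
    sit directly above or below). Weights are illustrative assumptions."""
    ix, iy, iw, ih = image_box            # (x, y, width, height)
    cx, cy, cw, ch = caption_box
    img_cx, img_cy = ix + iw / 2, iy + ih / 2
    cap_cx, cap_cy = cx + cw / 2, cy + ch / 2
    distance = math.hypot(cap_cx - img_cx, cap_cy - img_cy)
    vertically_aligned = abs(cap_cx - img_cx) < iw / 2
    direction_bonus = 1.0 if vertically_aligned else 0.5
    return direction_bonus / (1.0 + distance / max(iw, 1))

def lexical_similarity(caption_text, main_text):
    """Cosine similarity between term-frequency vectors of the candidate
    caption and the surrounding main text."""
    a, b = Counter(caption_text.split()), Counter(main_text.split())
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def caption_score(image_box, caption_box, caption_text, main_text, alpha=0.5):
    # Linear combination of the two feature groups (alpha is an assumed weight).
    return alpha * positional_score(image_box, caption_box) + \
           (1 - alpha) * lexical_similarity(caption_text, main_text)
```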

Caption Region Extraction of Sports Video Using Multiple Frame Merge (다중 프레임 병합을 이용한 스포츠 비디오 자막 영역 추출)

  • 강오형;황대훈;이양원
    • Journal of Korea Multimedia Society / v.7 no.4 / pp.467-473 / 2004
  • Captions in video play an important role in delivering video content. Existing caption region extraction methods have difficulty separating caption regions from the background because they are sensitive to noise. This paper proposes a method to extract caption regions in sports video using multiple frame merging and MBRs (Minimum Bounding Rectangles). As preprocessing, an adaptive threshold is obtained using contrast stretching and Otsu's method. Caption frame intervals are extracted by merging multiple frames, and caption regions are efficiently extracted by median filtering, morphological dilation, region labeling, candidate character region filtering, and MBR extraction.
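
As a rough illustration of this kind of pipeline, the OpenCV sketch below merges binarized frames, cleans the result, and extracts MBRs; the logical-AND merge, kernel sizes, and area filter are assumptions, not the parameters used in the paper.

```python
import cv2
import numpy as np

def extract_caption_mbrs(frames):
    """Sketch: contrast stretching, Otsu thresholding, multi-frame merging,
    median filtering, dilation, labeling, and MBR extraction. Kernel sizes
    and the area filter are illustrative assumptions."""
    if not frames:
        return []
    merged = None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        stretched = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)    # contrast stretching
        _, binary = cv2.threshold(stretched, 0, 255,
                                  cv2.THRESH_BINARY + cv2.THRESH_OTSU)    # Otsu threshold
        # Caption pixels are static across frames, so keep only pixels present in every frame.
        merged = binary if merged is None else cv2.bitwise_and(merged, binary)

    cleaned = cv2.medianBlur(merged, 3)                          # remove salt-and-pepper noise
    dilated = cv2.dilate(cleaned, np.ones((3, 9), np.uint8))     # connect character strokes
    n, labels, stats, _ = cv2.connectedComponentsWithStats(dilated)

    boxes = []
    for i in range(1, n):                                        # label 0 is the background
        x, y, w, h, area = stats[i]
        if area > 100 and w > h:                                 # crude caption-like filter
            boxes.append((x, y, w, h))                           # minimum bounding rectangle
    return boxes
```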

Size-Independent Caption Extraction for Korean Captions with Edge Connected Components

  • Jung, Je-Hee;Kim, Jaekwang;Lee, Jee-Hyong
    • International Journal of Fuzzy Logic and Intelligent Systems / v.12 no.4 / pp.308-318 / 2012
  • Captions include information related to the images. In order to obtain the information in captions, text extraction methods for images have been developed. However, most existing methods apply only to captions with a fixed height or stroke width, using fixed pixel-size or block-size operators derived from morphological assumptions. We propose an edge-connected-component-based method that can extract Korean captions of various sizes and fonts. We analyze the properties of edge connected components containing captions and build a decision tree that discriminates edge connected components that include captions from those that do not. The images for the experiment were collected from broadcast programs, such as documentaries and news programs, that include captions with various heights and fonts. We evaluate the proposed method by comparing the performance of latent caption area extraction. The experiment shows that the proposed method can efficiently extract Korean captions of various sizes.
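
A minimal sketch of the general approach, assuming Canny edges, a simple hand-picked feature set, and an off-the-shelf scikit-learn decision tree in place of the paper's own component properties and tree construction:

```python
import cv2
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def edge_component_features(image_bgr):
    """Per-component features from the edge map. The feature set (size,
    aspect ratio, edge density, vertical position) is an illustrative
    assumption; the paper defines its own component properties."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(edges)
    feats = []
    for i in range(1, n):                             # skip the background component
        x, y, w, h, area = stats[i]
        density = area / float(w * h)                 # edge pixels per bounding-box pixel
        aspect = w / float(h)
        rel_y = y / float(gray.shape[0])              # captions often sit near the bottom
        feats.append([w, h, aspect, density, rel_y])
    return np.array(feats), stats[1:]

def train_and_predict(train_feats, train_labels, test_feats):
    """Train on labeled components (1 = caption, 0 = non-caption) and
    classify the components of a new frame."""
    clf = DecisionTreeClassifier(max_depth=5)         # depth is an assumed hyperparameter
    clf.fit(train_feats, train_labels)
    return clf.predict(test_feats)
```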

EXTRACTION OF DTV CLOSED CAPTION STREAM AND GENERATION OF VIDEO CAPTION FILE

  • Kim, Jung-Youn;Nam, Je-Ho
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.364-367 / 2009
  • This paper presents a scheme that generates a caption file by extracting a closed caption stream from a DTV signal. Note that the closed captioning service helps bridge the "digital divide" by extending broadcasting accessibility to underserved groups such as hearing-impaired persons and foreigners. In Korea, the DTV closed captioning standard was developed in June 2007, and closed captioning became mandatory by law for all broadcasting services in 2008. In this paper, we describe a method of extracting caption data from the MPEG-2 Transport Stream of an ATSC-based digital TV signal and generating a caption file (SAMI and SRT) from the extracted caption data and time information. Experimental results verify the feasibility of the generated caption file using a widely used PC-based media player.
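
Demultiplexing and decoding the DTV closed-caption stream itself is beyond a short sketch, but the final step the abstract mentions, writing extracted caption text with time information into an SRT file, can be illustrated directly (the SAMI variant would follow the same pattern):

```python
def to_srt_timestamp(seconds):
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def write_srt(captions, path):
    """captions: list of (start_seconds, end_seconds, text) already extracted
    from the closed-caption stream (the demuxing/decoding step is omitted)."""
    with open(path, "w", encoding="utf-8") as f:
        for idx, (start, end, text) in enumerate(captions, start=1):
            f.write(f"{idx}\n"
                    f"{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n"
                    f"{text}\n\n")

# Example: write_srt([(1.0, 3.5, "안녕하세요"), (4.0, 6.0, "Hello")], "out.srt")
```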

A Method for Caption Segmentation using Minimum Spanning Tree

  • Chun, Byung-Tae;Kim, Kyuheon;Lee, Jae-Yeon
    • Proceedings of the IEEK Conference / 2000.07b / pp.906-909 / 2000
  • Conventional caption extraction methods use differences between frames or color segmentation over the whole image. Because these methods depend heavily on heuristics, they require a priori knowledge of the captions to be extracted, and they are difficult to implement. In this paper, we propose a method that uses few heuristics and a simplified algorithm. We use topographical features of characters to extract character points and use a KMST (Kruskal minimum spanning tree) to extract candidate caption regions. Character regions are determined by testing several conditions and verifying the candidate regions. Experimental results show that the candidate region extraction rate is 100% and the character region extraction rate is 98.2%. The results also show that caption areas in complex images are extracted well.
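
A minimal sketch of the MST idea, assuming character points have already been extracted and that candidate regions are formed by cutting spanning-tree edges longer than a threshold (the cut distance is an assumed parameter, not one from the paper):

```python
import math

def kruskal_mst_groups(points, cut_distance):
    """Group character points into candidate caption regions by running
    Kruskal's algorithm and ignoring edges longer than cut_distance; the
    resulting union-find components are the candidate regions."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (Kruskal's processing order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(len(points)) for j in range(i + 1, len(points))
    )
    for d, i, j in edges:
        if d > cut_distance:                # longer MST edges would be cut anyway
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj                 # union: this edge is an MST edge

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(points[i])
    return list(groups.values())
```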

Connected Component-Based and Size-Independent Caption Extraction with Neural Networks (신경망을 이용한 자막 크기에 무관한 연결 객체 기반의 자막 추출)

  • Jung, Je-Hee;Yoon, Tae-Bok;Kim, Dong-Moon;Lee, Jee-Hyong
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.7 / pp.924-929 / 2007
  • Captions that appear in images include information related to the images. In order to obtain the information carried by captions, methods for text extraction from images have been developed. However, most existing methods apply only to captions with a fixed height or stroke width. We propose a method that can be applied to captions of various sizes. Our method is based on connected components: edge pixels are detected and grouped into connected components, the properties of those components are analyzed, and a neural network is built that discriminates the components that include captions from those that do not. Experimental data were collected from broadcast programs such as news, documentaries, and entertainment shows, which include captions of various heights. The results are evaluated by two criteria, recall and precision: recall is the ratio of identified captions to all captions in the images, and precision is the ratio of true captions among the objects identified as captions. The experiment shows that the proposed method can efficiently extract captions of various sizes.
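
The two evaluation criteria defined in the abstract translate directly into code; the sketch below assumes detections and ground truth are given as sets of caption identifiers:

```python
def recall_precision(detected, ground_truth):
    """Recall  = identified true captions / all captions in the images.
    Precision = true captions / all objects identified as captions.
    `detected` and `ground_truth` are sets of caption identifiers."""
    true_positives = len(detected & ground_truth)
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    precision = true_positives / len(detected) if detected else 0.0
    return recall, precision
```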

An Effective Method for Replacing Caption in Video Images (비디오 자막 문자의 효과적인 교환 방법)

  • Chun Byung-Tae;Kim Sook-Yeon
    • Journal of the Korea Society of Computer and Information / v.10 no.2 s.34 / pp.97-104 / 2005
  • Caption texts are frequently inserted into produced video images to help the TV audience's understanding. In film, caption texts can be replaced without any loss of the original image because they have their own track. In earlier replacement methods, new text was inserted into the caption area of the video image after that area had been filled with a solid color to remove the existing caption. However, these methods lose the original image in the caption area, which is a problem for the TV audience. In this paper, we propose a new method that replaces the caption text after recovering the original image in the caption area. In the experiments, results on complex images show some distortion after recovering the original image, but most results show good caption text over the recovered image. The new method is thus shown to be effective for replacing caption texts in video images.
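
The abstract does not say how the original image in the caption area is recovered, so the sketch below substitutes OpenCV inpainting purely as a stand-in for that recovery step before drawing the replacement text:

```python
import cv2
import numpy as np

def replace_caption(frame, caption_mask, new_text, origin=(50, 50)):
    """Remove the existing caption and draw a new one.
    caption_mask: uint8 mask, 255 where the old caption pixels are.
    Inpainting is used here only as a stand-in for the paper's own
    recovery of the original image in the caption area."""
    recovered = cv2.inpaint(frame, caption_mask, 3, cv2.INPAINT_TELEA)
    # cv2.putText handles Latin text only; Korean captions would need a
    # separate font-rendering step (e.g. via PIL).
    cv2.putText(recovered, new_text, origin, cv2.FONT_HERSHEY_SIMPLEX,
                1.0, (255, 255, 255), 2, cv2.LINE_AA)
    return recovered
```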

Methods for Video Caption Extraction and Extracted Caption Image Enhancement (영화 비디오 자막 추출 및 추출된 자막 이미지 향상 방법)

  • Kim, So-Myung;Kwak, Sang-Shin;Choi, Yeong-Woo;Chung, Kyu-Sik
    • Journal of KIISE: Software and Applications / v.29 no.4 / pp.235-247 / 2002
  • For efficient indexing and retrieval of digital video data, research on video caption extraction and recognition is required. This paper proposes methods for extracting artificial captions from video data and enhancing their image quality for accurate Hangul and English character recognition. In the proposed methods, we first find the locations of the beginning and ending frames of the same caption content and combine the multiple frames in each group by a logical operation to remove background noise. During this process, an evaluation is performed to detect integrated results that contain different caption images. After the multiple video frames are integrated, four image enhancement techniques are applied: resolution enhancement, contrast enhancement, stroke-based binarization, and morphological smoothing. By applying these operations to the video frames, we can improve the image quality even of phonemes with complex strokes. Finding the beginning and ending locations of frames with the same caption content can also be used effectively for digital video indexing and browsing. We tested the proposed methods on video caption images containing both Hangul and English characters from cinema, and obtained improved character recognition results.
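
A minimal sketch of the first step, locating the beginning and ending frames of the same caption content, assuming a fixed caption region of interest and a simple pixel-change measure between binarized frames (both are illustrative assumptions):

```python
import cv2
import numpy as np

def caption_frame_intervals(frames, caption_roi, change_threshold=0.15):
    """Return (begin, end) frame-index intervals over which the caption area
    stays essentially unchanged. caption_roi = (x, y, w, h); the similarity
    measure and threshold are assumed, not taken from the paper."""
    if not frames:
        return []
    x, y, w, h = caption_roi
    intervals, start, prev = [], 0, None
    for idx, frame in enumerate(frames):
        gray = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        if prev is not None:
            changed = np.count_nonzero(binary != prev) / binary.size
            if changed > change_threshold:         # caption content switched
                intervals.append((start, idx - 1))
                start = idx
        prev = binary
    intervals.append((start, len(frames) - 1))
    return intervals
```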

Extraction of Superimposed-Caption Frame Scopes and Its Regions for Analyzing Digital Video (비디오 분석을 위한 자막프레임구간과 자막영역 추출)

  • Lim, Moon-Cheol;Kim, Woo-Saeng
    • The Transactions of the Korea Information Processing Society / v.7 no.11 / pp.3333-3340 / 2000
  • Recently, demand for video data has increased rapidly with the progress of both hardware and compression techniques. Because digital video data are unstructured and large in volume, various retrieval techniques such as content-based retrieval are needed. Superimposed captions in a digital video can help us analyze the video story more easily and can be used as indexing information for many retrieval techniques. In this research, we propose a new method that segments captions by analyzing texture features of caption regions in each video frame, and that extracts the accurate scope of superimposed-caption frames, as well as their key regions and colors, by measuring the continuity of caption regions between frames.
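
A minimal sketch of a texture-based caption candidate mask, using local standard deviation as a stand-in for the texture feature the abstract mentions (the window size and threshold are assumed):

```python
import cv2
import numpy as np

def texture_caption_mask(frame_bgr, window=9, texture_threshold=20.0):
    """Mark high-texture pixels as caption candidates. Local standard
    deviation serves as a stand-in texture feature; the window size and
    threshold are illustrative assumptions."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    mean = cv2.blur(gray, (window, window))
    mean_sq = cv2.blur(gray * gray, (window, window))
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0))   # local standard deviation
    return (std > texture_threshold).astype(np.uint8) * 255
```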

Detection of Artificial Caption using Temporal and Spatial Information in Video (시·공간 정보를 이용한 동영상의 인공 캡션 검출)

  • Joo, SungIl;Weon, SunHee;Choi, HyungIl
    • KIPS Transactions on Software and Data Engineering / v.1 no.2 / pp.115-126 / 2012
  • The artificial captions appearing in videos include information related to the videos. In order to obtain the information carried by captions, many methods for extracting captions from videos have been studied. Most traditional methods detect caption regions using a single frame; however, video contains not only spatial information but also temporal information, so we propose a method for detecting caption regions that uses both. First, we construct an improved Text Appearance Map and detect continuous candidate regions through matching between candidate regions. Second, we detect disappearing captions using a disappearance test on the candidate regions. When captions disappear, the caption regions are decided by a merging process that uses temporal and spatial information. Finally, we decide the final caption regions through ANNs that use edge direction histograms for verification. The proposed method was tested on many kinds of captions with a variety of sizes, shapes, and positions, and the results were evaluated through recall and precision.
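
The edge direction histogram used for verification can be sketched with Sobel gradients; the bin count, the magnitude-based pixel selection, and the normalization are assumed details, and the ANN itself is omitted:

```python
import cv2
import numpy as np

def edge_direction_histogram(region_bgr, bins=8):
    """Histogram of gradient directions over a candidate caption region,
    usable as the input feature vector of a verification network.
    The number of bins is an assumed parameter."""
    gray = cv2.cvtColor(region_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)          # direction folded into [0, pi)
    strong = magnitude > magnitude.mean()              # keep only clear edges (assumed rule)
    hist, _ = np.histogram(angle[strong], bins=bins, range=(0, np.pi),
                           weights=magnitude[strong])
    total = hist.sum()
    return hist / total if total else hist             # normalized feature vector
```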