• Title/Summary/Keyword: 자막 추출 (caption extraction)

The Color Polarity Method for Binarization of Text Region in Digital Video (디지털 비디오에서 문자 영역 이진화를 위한 색상 극화 기법)

  • Jeong, Jong-Myeon
    • Journal of the Korea Society of Computer and Information, v.14 no.9, pp.21-28, 2009
  • Color polarity classification determines whether the color of text is bright or dark and is a prerequisite task for text extraction. In this paper we propose a color polarity classification method for extracting text regions. Based on observations of text and background regions, the proposed method uses the ratios of the sizes and of the standard deviations of bright and dark regions. First, we apply Otsu's method to binarize the gray-scale input region. The two largest segments among the bright and the dark regions are selected, and the ratio of their sizes is the first measure for color polarity classification. Next, from each of the two groups of regions we select the segment whose pixel distances from its center have the smallest standard deviation, and the ratio of these standard deviations is the second measure. These two ratio features determine the text color polarity. The proposed method classifies the color polarity of text robustly, as shown by experimental results for various fonts and sizes.
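
The two-ratio test described in this abstract can be sketched compactly. The snippet below is a minimal illustration, not the paper's implementation: the final decision rule and the small-component cutoff are assumptions, and OpenCV's connected-component routines stand in for whatever segmentation the authors used.

```python
import cv2
import numpy as np

def largest_component_area(mask):
    """Area of the largest connected component in a binary mask."""
    n, _, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    return stats[1:, cv2.CC_STAT_AREA].max() if n > 1 else 0

def min_radial_std(mask):
    """Smallest standard deviation of pixel distances from a component's
    own center, over all components in the mask (the second feature)."""
    n, labels = cv2.connectedComponents(mask, connectivity=8)
    best = np.inf
    for lbl in range(1, n):
        ys, xs = np.nonzero(labels == lbl)
        if len(xs) < 10:               # assumed cutoff: skip tiny specks
            continue
        d = np.hypot(ys - ys.mean(), xs - xs.mean())
        best = min(best, d.std())
    return best if np.isfinite(best) else 0.0

def text_is_bright(gray_region):
    """Estimate text polarity from the two ratio features."""
    _, bright = cv2.threshold(gray_region, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    dark = cv2.bitwise_not(bright)
    size_ratio = (largest_component_area(bright) + 1) / (largest_component_area(dark) + 1)
    std_ratio = (min_radial_std(bright) + 1e-6) / (min_radial_std(dark) + 1e-6)
    # Assumed heuristic: the background tends to form the larger, more
    # center-spread component, so small ratios suggest bright text.
    return size_ratio < 1.0 and std_ratio < 1.0
```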

Enhancement of Membrane Durability in PEMFC by Fucoidan and Tannic Acid (후코이단과 탄닌산에 의한 PEMFC 고분자막의 내구성 향상)

  • Mihwa Lee;Sohyeong Oh;Cheun-Ho Chu;Young-Sook Kim;Il-Chai Na;Kwonpil Park
    • Korean Chemical Engineering Research, v.61 no.1, pp.45-51, 2023
  • To improve the durability of the polymer membrane in PEMFC (Proton Exchange Membrane Fuel Cells), a radical scavenger and a support are used. In this study, the durability of membranes containing fucoidan, extracted from seaweeds, and tannic acid, serving as a crosslinking agent, is evaluated with respect to both chemical and physical durability. Physical durability is evaluated by measuring tensile strength, and chemical durability by the Fenton experiment. A membrane electrode assembly (MEA) is prepared, and mechanical and chemical durability are measured through accelerated durability evaluation in the cell. The tensile strength measurements show that fucoidan and tannic acid can improve the mechanical durability of the membrane by improving the strain rate and yield strength. The Fenton experiment shows that fucoidan acts as a radical scavenger. In the accelerated durability test in the unit cell, fucoidan improved both chemical and mechanical durability, increasing the accelerated durability evaluation time by 38.1% compared to the additive-free membrane. Adding tannic acid improved the durability of the polymer membrane by 13.9% through improved mechanical durability.

Contextual In-Video Advertising Using Situation Information (상황 정보를 활용한 동영상 문맥 광고)

  • Yi, Bong-Jun;Woo, Hyun-Wook;Lee, Jung-Tae;Rim, Hae-Chang
    • Journal of the Korea Academia-Industrial cooperation Society, v.11 no.8, pp.3036-3044, 2010
  • With the rapid growth of video data services, demand is increasing for advertisements or additional information tied to a particular video scene. However, directly applying automated visual analysis or speech recognition to videos has practical limitations at the current level of technology, and video metadata such as the title, category information, or summary does not reflect the content of continuously changing scenes. This work presents a new video contextual advertising system that serves relevant advertisements for a given scene by leveraging the scene's situation information inferred from video scripts. Experimental results show that using situation information extracted from scripts leads to better performance and displays advertisements more relevant to the user.
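
The retrieval step of such a system can be illustrated with plain TF-IDF matching between a scene's situation keywords and candidate ad descriptions. This is only a sketch of the matching idea; the paper's situation inference from scripts is assumed to have already produced the keywords, and all names and data below are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

ads = {  # hypothetical ad inventory with keyword descriptions
    "travel_agency": "beach resort vacation flight booking",
    "sports_brand":  "running shoes marathon training gear",
    "restaurant":    "dinner date wine candlelight menu",
}

def rank_ads(scene_situation, ads):
    """Rank ads by cosine similarity to the scene's situation keywords."""
    names = list(ads)
    matrix = TfidfVectorizer().fit_transform([scene_situation] + [ads[n] for n in names])
    sims = cosine_similarity(matrix[0], matrix[1:]).ravel()
    return sorted(zip(names, sims), key=lambda p: -p[1])

# e.g. a scene whose script suggests a romantic dinner situation
print(rank_ads("two characters share wine over a candlelight dinner", ads))
```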

Detecting of start/end point for TV content reprocessing (방송 콘텐츠의 재가공을 위한 시작.종료점 검출)

  • Yoon, Jeong-Hyun;Kim, Cheon-Seog
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2010.07a, pp.303-306, 2010
  • Media such as DMB and IPTV reprocess and reuse many previously aired terrestrial broadcast programs for their own services. In this process, the terrestrial broadcaster's advertisements inserted before and after a program are replaced with each service provider's contracted advertisements, so the advertisements must be separated and only the main content of the program encoded. This paper proposes a method for detecting the start and end points of the main content in a broadcast program stream for such reprocessing. The method extracts and analyzes visual features and caption data from the digital broadcast stream, without relying on feature data of individual advertisements. It can therefore be applied to an encoding system for reprocessing broadcast content without a preprocessing step that analyzes all advertisements in advance and extracts their feature data.
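
One plausible reading of how the caption data can mark the content boundaries: terrestrial programs carry closed captions while the surrounding ad blocks usually do not, so the first and last sustained captioned stretches approximate the start and end points. The sketch below implements only that caption-presence cue, with an assumed minimum run length; the paper combines it with visual features.

```python
def find_content_bounds(caption_present, min_run=30):
    """caption_present: one boolean per second of the stream.
    Returns (start, end) in seconds spanned by caption runs of at
    least min_run seconds, or None if no such run exists."""
    runs, start = [], None
    for t, present in enumerate(caption_present):
        if present and start is None:
            start = t
        elif not present and start is not None:
            if t - start >= min_run:
                runs.append((start, t))
            start = None
    if start is not None and len(caption_present) - start >= min_run:
        runs.append((start, len(caption_present)))
    return (runs[0][0], runs[-1][1]) if runs else None

# e.g. 10 s of ads, 60 s of captioned content, 10 s of ads
timeline = [False] * 10 + [True] * 60 + [False] * 10
print(find_content_bounds(timeline))  # (10, 70)
```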

Extraction of Features in key frames of News Video for Content-based Retrieval (내용 기반 검색을 위한 뉴스 비디오 키 프레임의 특징 정보 추출)

  • Jung, Yung-Eun;Lee, Dong-Seop;Jeon, Keun-Hwan;Lee, Yang-Weon
    • The Transactions of the Korea Information Processing Society, v.5 no.9, pp.2294-2301, 1998
  • The aim of this paper is to extract features from news scenes: the station symbol icon that distinguishes each broadcasting company, and the icons and captions that carry important information for each scene. We propose a caption extraction method for news videos that proceeds in three steps. First, input images from video frames are converted to YIQ color vectors. Next, the input image is segmented into distinct regions using its equalized color histogram. Finally, captions are extracted using vertical and horizontal edge histograms. We also propose a method that extracts the news icon from selected key frames using inter-frame histogram differences and divides scenes by the extracted icon. Because we use edge-histogram comparison instead of more complex methods based on color histograms, wavelets, or moving objects, computation is reduced through a simpler algorithm, and experiments show good feature extraction results.
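
The edge-histogram step can be sketched as projection profiles of Sobel edge energy: rows with dense vertical and horizontal edges are caption candidates. The YIQ conversion and color-histogram segmentation steps are omitted here, and the band threshold is an illustrative assumption.

```python
import cv2
import numpy as np

def caption_candidate_rows(frame_bgr, thresh_ratio=0.5):
    """Rows of a frame whose combined edge energy suggests caption text."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = (np.abs(cv2.Sobel(gray, cv2.CV_32F, 1, 0)) +   # vertical edges
             np.abs(cv2.Sobel(gray, cv2.CV_32F, 0, 1)))    # horizontal edges
    row_profile = edges.sum(axis=1)        # per-row edge histogram
    return np.nonzero(row_profile > thresh_ratio * row_profile.max())[0]
```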

Extraction Analysis for Crossmodal Association Information using Hypernetwork Models (하이퍼네트워크 모델을 이용한 비전-언어 크로스모달 연관정보 추출)

  • Heo, Min-Oh;Ha, Jung-Woo;Zhang, Byoung-Tak
    • Proceedings of the HCI Society of Korea Conference, 2009.02a, pp.278-284, 2009
  • Multimodal data, in which one piece of content carries several modalities such as video, images, sound, and text, is increasing. Since this type of data has an ill-defined format, it is not easy to represent its crossmodal information explicitly. We therefore propose a new method to extract and analyze vision-language crossmodal association information using nature documentary video data. We collected pairs of images and captions from three documentary genres (jungle, ocean, and universe) and extracted a set of visual words and a set of text words from them. The analysis of the crossmodal association information shows that the two modalities have semantic associations.
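
As a toy stand-in for the association analysis, pointwise mutual information over (visual word, text word) co-occurrences in image-caption pairs captures the same notion of crossmodal association strength. The hypernetwork model in the paper is a much richer structure; the data below is illustrative.

```python
import math
from collections import Counter

pairs = [  # (visual words of an image, text words of its caption)
    ({"blue", "wave"}, {"ocean", "water"}),
    ({"blue", "fish"}, {"ocean", "swim"}),
    ({"green", "leaf"}, {"jungle", "tree"}),
]

v_cnt, t_cnt, vt_cnt = Counter(), Counter(), Counter()
for vwords, twords in pairs:
    v_cnt.update(vwords)
    t_cnt.update(twords)
    vt_cnt.update((v, t) for v in vwords for t in twords)

def pmi(v, t, n=len(pairs)):
    """Association strength of a visual word and a text word."""
    return math.log(vt_cnt[v, t] * n / (v_cnt[v] * t_cnt[t]))

print(pmi("blue", "ocean"))  # positive: the pair co-occurs more than chance
```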

Metadata extraction using AI and advanced metadata research for web services (AI를 활용한 메타데이터 추출 및 웹서비스용 메타데이터 고도화 연구)

  • Sung Hwan Park
    • The Journal of the Convergence on Culture Technology, v.10 no.2, pp.499-503, 2024
  • Broadcasting programs are provided not only through the broadcaster's own channel but also to various media such as Internet replay, OTT, and IPTV services. Here it is very important to provide search keywords that represent the characteristics of the content well. Broadcasters mainly enter key keywords manually during production and archiving. This method is insufficient, in terms of quantity, for securing core metadata, and it also shows limitations when content is recommended and used in other media services. This study supports securing a large amount of metadata by utilizing closed-caption data pre-archived through the DTV closed-captioning server developed at EBS. First, core metadata was automatically extracted by applying Google's natural language AI technology. As the core of the research, we then propose a method of identifying core metadata that reflects priorities and content characteristics. To obtain differentiated metadata weights, importance was ranked by applying the TF-IDF calculation method, and the experiment yielded useful weight data. The string metadata obtained in this study, combined with future string-similarity studies, will form the basis for securing sophisticated content-recommendation metadata for content services provided to other media.
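
The TF-IDF weighting step is standard enough to sketch directly: a term weighs more when it is frequent in one program's captions but rare across the archive. Tokenization and the caption corpus are assumed to exist upstream.

```python
import math
from collections import Counter

def tfidf_keywords(program_tokens, archive_token_sets):
    """program_tokens: tokens of one program's closed captions.
    archive_token_sets: one token set per archived program."""
    tf = Counter(program_tokens)
    n_docs = len(archive_token_sets)
    weights = {}
    for term, freq in tf.items():
        df = sum(term in doc for doc in archive_token_sets)   # document frequency
        weights[term] = (freq / len(program_tokens)) * math.log(n_docs / (1 + df))
    return sorted(weights.items(), key=lambda p: -p[1])       # heaviest first
```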

Investigating an Automatic Method for Summarizing and Presenting a Video Speech Using Acoustic Features (음향학적 자질을 활용한 비디오 스피치 요약의 자동 추출과 표현에 관한 연구)

  • Kim, Hyun-Hee
    • Journal of the Korean Society for Information Management, v.29 no.4, pp.191-208, 2012
  • Two fundamental aspects of speech summary generation are the extraction of key speech content and the style of presentation of the extracted speech synopses. We first investigated whether acoustic features (speaking rate, pitch pattern, and intensity) are equally important and, if not, which one can be effectively modeled to compute the significance of segments for lecture summarization. We found that intensity (the difference between the maximum and minimum dB) is the most effective factor for speech summarization. We evaluated the intensity-based method against the keyword-based method in terms of which produces better speech summaries and of how similar the weight values the two methods assign to segments are. We then investigated how to present speech summaries to viewers. In sum, we suggested how to extract key segments from a speech video efficiently using acoustic features and how to present the extracted segments to viewers.
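
The intensity measure the study found most effective, the spread between a segment's maximum and minimum dB level, can be sketched as below. Framing and segment length are illustrative assumptions.

```python
import numpy as np

def segment_scores(samples, rate, seg_sec=10.0, frame_sec=0.05, eps=1e-10):
    """Score fixed-length segments of a mono signal by (max dB - min dB)."""
    seg_len, frame_len = int(rate * seg_sec), int(rate * frame_sec)
    scores = []
    for start in range(0, len(samples) - seg_len + 1, seg_len):
        seg = samples[start:start + seg_len].astype(np.float64)
        n_frames = len(seg) // frame_len
        frames = seg[:n_frames * frame_len].reshape(n_frames, frame_len)
        db = 20 * np.log10(np.sqrt((frames ** 2).mean(axis=1)) + eps)  # RMS per frame
        scores.append((start / rate, db.max() - db.min()))
    return sorted(scores, key=lambda p: -p[1])   # widest dB spread first
```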

Conversation Context Annotation using Speaker Detection (화자인식을 이용한 대화 상황정보 어노테이션)

  • Park, Seung-Bo;Kim, Yoo-Won;Jo, Geun-Sik
    • Journal of Korea Multimedia Society, v.12 no.9, pp.1252-1261, 2009
  • One notable challenge in video searching and summarizing is extracting semantics from video content and annotating its context. Video semantics or context can be obtained by extracting objects and the contexts between them. However, a method that only extracts objects does not express enough semantics for a shot or scene, as it does not describe the relations and interactions between objects. To be more effective, after extracting objects, context such as their relations and interactions needs to be extracted from the conversation situation. This paper studies how to detect the speaker and how to compose dialogue context in order to annotate conversation context. We propose methods in which characters are recognized through face recognition, the speaker is detected through mouth motion, conversation context is extracted using rules based on speaker presence, the number of characters, and the existence of subtitles, and finally the scene context is converted to an XML file and saved.
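
The rule side of this pipeline lends itself to a small sketch: given per-shot observations (detected speaker, number of faces, subtitle presence), a rule labels the conversation situation and the result is serialized to XML. The field names and rules below are illustrative assumptions; the actual speaker detection relies on face recognition and mouth motion.

```python
import xml.etree.ElementTree as ET

def conversation_context(speaker, n_faces, has_subtitle):
    """Assumed rule set over speaker presence, face count, and subtitles."""
    if speaker and n_faces >= 2:
        return "dialogue"
    if speaker and n_faces == 1:
        return "monologue"
    return "narration" if has_subtitle else "no-conversation"

def annotate_scene(shots):
    """shots: list of (speaker_name_or_None, n_faces, has_subtitle)."""
    root = ET.Element("scene")
    for i, (speaker, n_faces, sub) in enumerate(shots):
        shot = ET.SubElement(root, "shot", id=str(i),
                             context=conversation_context(speaker, n_faces, sub))
        if speaker:
            ET.SubElement(shot, "speaker").text = speaker
    return ET.tostring(root, encoding="unicode")

print(annotate_scene([("Alice", 2, True), (None, 0, True)]))
```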

A Viewer Preference Model Based on Physiological Feedback (CogTV를 위한 생체신호기반 시청자 선호도 모델)

  • Park, Tae-Suh;Kim, Byoung-Hee;Zhang, Byoung-Tak
    • Journal of the Korean Institute of Intelligent Systems, v.24 no.3, pp.316-322, 2014
  • A movie recommendation system is proposed that learns a viewer's preference model from multimodal features of video content and the viewer's evoked implicit responses, collected in a synchronized manner. In this system, facial expression, body posture, and physiological signals are measured to estimate the affective states of the viewer in response to stimuli consisting of low-level and affective features from the video, audio, and text streams. Experimental results show that a viewer's arousal response, measured by electrodermal activity, can be predicted from auditory and text features of the video stimuli, allowing the system to estimate the viewer's interest in the video.
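
The prediction task itself reduces to a regression from per-scene audio/text features to the EDA-measured arousal signal. The sketch below uses synthetic data and a plain least-squares fit as a stand-in for the system's learned preference model.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 3))   # e.g. [audio energy, tempo, text sentiment] per scene
w_true = np.array([1.5, 0.3, 0.8])
y = X @ w_true + 0.1 * rng.standard_normal(100)   # synthetic EDA arousal

# least-squares fit: w = argmin ||Xw - y||^2
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print("learned weights:", w)
```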