• Title/Summary/Keyword: Video Summarization

Search Results: 60

Multimodal Approach for Summarizing and Indexing News Video

  • Kim, Jae-Gon;Chang, Hyun-Sung;Kim, Young-Tae;Kang, Kyeong-Ok;Kim, Mun-Churl;Kim, Jin-Woong;Kim, Hyung-Myung
    • ETRI Journal / v.24 no.1 / pp.1-11 / 2002
  • A video summary abstracts the gist of an entire video and enables efficient access to the desired content. In this paper, we propose a novel method for summarizing news video based on multimodal analysis of the content. The proposed method exploits closed caption data to locate semantically meaningful highlights in a news video, and uses the speech signal in the audio stream to align the closed caption data with the video on a time-line. The detected highlights are then described using the MPEG-7 Summarization Description Scheme, which allows efficient browsing of the content through functionalities such as multi-level abstracts and navigation guidance. Multimodal search and retrieval are also supported within the proposed framework: by indexing the synchronized closed caption data, video clips become searchable with a text query. Intensive experiments with prototypical systems are presented to demonstrate the validity and reliability of the proposed method in real applications.
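The text-query retrieval idea above can be sketched minimally: once closed captions are aligned to the video time-line, a query maps to the time intervals of the matching caption segments. This is an illustrative sketch, not the paper's implementation; the names (`CaptionSegment`, `search_clips`) and the plain substring matching are assumptions.

```python
from dataclasses import dataclass

@dataclass
class CaptionSegment:
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    text: str     # closed-caption text aligned to this interval

def search_clips(segments, query):
    """Return the (start, end) intervals whose caption text contains the query."""
    q = query.lower()
    return [(s.start, s.end) for s in segments if q in s.text.lower()]

# Captions already synchronized with the video time-line.
segments = [
    CaptionSegment(0.0, 12.5, "Top story: election results announced today"),
    CaptionSegment(12.5, 30.0, "Weather forecast for the weekend"),
    CaptionSegment(30.0, 55.0, "Sports: the election of a new league president"),
]
print(search_clips(segments, "election"))  # [(0.0, 12.5), (30.0, 55.0)]
```

A real system would add tokenization and ranking; the point is that time-aligned captions make video clips addressable by text.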

Toward a Structural and Semantic Metadata Framework for Efficient Browsing and Searching of Web Videos

  • Kim, Hyun-Hee
    • Journal of the Korean Society for Library and Information Science / v.51 no.1 / pp.227-243 / 2017
  • This study proposed a structural and semantic framework for characterizing events and segments in Web videos that permits content-based searches and dynamic video summarization. Although MPEG-7 supports structural and semantic descriptions of multimedia, it is not currently suitable for describing multimedia content on the Web. Thus, the proposed metadata framework, designed with Web environments in mind, provides a thorough yet simple way to describe Web video content. Specifically, the framework was constructed on the basis of Chatman's narrative theory, three multimedia metadata formats (PBCore, MPEG-7, and TV-Anytime), and social metadata. It consists of event information, eventGroup information, segment information, and video (program) information. This study also discusses how to automatically extract metadata elements, including structural and semantic ones, from Web videos.

Creation of Soccer Video Highlight Using The Structural Features of Caption (자막의 구조적 특징을 이용한 축구 비디오 하이라이트 생성)

  • Huh, Moon-Haeng;Shin, Seong-Yoon;Lee, Yang-Weon;Ryu, Keun-Ho
    • The KIPS Transactions: Part D / v.10D no.4 / pp.671-678 / 2003
  • A digital video is usually temporally long and requires large storage capacity, so users want to watch a pre-summarized video before committing to the full-length video. In the field of sports video in particular, they want to watch a highlight video. A highlight video thus lets viewers decide whether the full video is worth watching. This paper proposes how to create a soccer video highlight using the structural features of captions, such as their temporal and spatial features. Caption frame intervals and caption key frames are extracted using those structural features, and a highlight video is then created using scene relocation, logical indexing, and highlight-creation rules. Finally, retrieval and browsing of highlights and video segments are performed by selecting items in a browser.
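Caption detection of this kind is commonly built on template matching. The brute-force normalized cross-correlation below is a minimal sketch of that building block, not the paper's implementation; the function name and the exhaustive search are assumptions.

```python
import numpy as np

def match_template(frame, template):
    """Best matching position of a caption template in a grayscale frame,
    scored by normalized cross-correlation (brute force, for illustration)."""
    fh, fw = frame.shape
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -1.0, (0, 0)
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            region = frame[y:y + th, x:x + tw]
            patch = region - region.mean()
            denom = np.sqrt((patch ** 2).sum() * (t ** 2).sum())
            score = (patch * t).sum() / denom if denom > 0 else 0.0
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos, best

# Plant a random 5x5 template at row 8, column 3 of an empty frame.
rng = np.random.default_rng(0)
template = rng.random((5, 5))
frame = np.zeros((20, 20))
frame[8:13, 3:8] = template
pos, score = match_template(frame, template)
print(pos)  # (8, 3)
```

In practice one would restrict the search to the screen region where score captions appear and use an optimized routine such as OpenCV's `matchTemplate`.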

Soccer Video Summarization Using Event-Caption (이벤트-캡션을 이용한 축구비디오 요약)

  • 신성윤;하연실;고경철;이양원
    • Proceedings of the Korea Multimedia Society Conference / 2001.11a / pp.245-248 / 2001
  • In video data, captions are the most common way of indicating the important parts and content of a video. In this paper, we analyze the characteristics of captions in soccer video, extract key frames based on those captions, and generate a summarized video according to video-summary generation rules. Key frame extraction detects the appearance of captions and changes in their content as events occur, using template matching and local difference images; a summarized video containing the important events is then generated through shot relocation.

Soccer Video Summarization Using Caption Analysis (자막 분석을 이용한 축구 비디오 요약)

  • 임정훈;국나영;곽순영;강일고;이양원
    • Proceedings of the Korea Multimedia Society Conference / 2002.11b / pp.77-80 / 2002
  • In video data, captions are the most common way of indicating the important parts and content of a video. In this paper, we analyze the characteristics of captions in soccer video, extract key frames based on those captions, and generate a summarized video according to video-summary generation rules. Key frame extraction detects the appearance of captions and changes in their content as events occur, using template matching and local difference images; a summarized video containing the important events is then generated through shot relocation.

An Automatic Cut Detection Algorithm Using Median Filter And Neural Network (중간값 필터와 신경망 회로를 사용한 자동 컷 검출 알고리즘)

  • Jun, Seung-Chul;Park, Sung-Han
    • Journal of the Institute of Electronics Engineers of Korea SP / v.39 no.4 / pp.381-387 / 2002
  • In this paper, an efficient method for finding shot boundaries in MPEG video stream data is proposed. For this purpose, we first treat the histogram difference value (HDV) and the pixel difference value (PDV) as one-dimensional signals and apply a median filter to them. The output of the median filter is subtracted from the original signal to produce the median-filtered difference (MFD), which serves as the criterion for shot boundaries. In addition, a neural network is employed and trained to locate cut boundaries exactly. The proposed algorithm extracts cut boundaries well, especially in dynamic video.
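The MFD criterion can be sketched as follows: a sliding median suppresses gradual variation, so frames whose difference signal stands far above its median-filtered baseline are candidate cuts. This is a minimal sketch of the thresholding step only; the window size and threshold are assumed values, and the paper's neural-network refinement is omitted.

```python
import numpy as np

def median_filtered_difference(signal, window=5):
    """Residual after removing a sliding-median baseline from a 1-D signal;
    large residuals indicate abrupt changes such as shot cuts."""
    half = window // 2
    padded = np.pad(signal, half, mode="edge")
    med = np.array([np.median(padded[i:i + window]) for i in range(len(signal))])
    return np.abs(signal - med)

def detect_cuts(hdv, threshold, window=5):
    """Frame indices where the median-filtered difference exceeds a threshold."""
    mfd = median_filtered_difference(np.asarray(hdv, dtype=float), window)
    return [int(i) for i in np.where(mfd > threshold)[0]]

# Frame-to-frame histogram differences with a spike at a cut (frame 5).
hdv = [0.1, 0.12, 0.11, 0.1, 0.13, 0.9, 0.12, 0.1, 0.11, 0.1]
print(detect_cuts(hdv, threshold=0.4))  # [5]
```

The median baseline is what makes this robust in dynamic video: camera motion raises the difference signal gradually, so it raises the baseline too and does not trigger false cuts.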

Semantic Event Detection in Golf Video Using Hidden Markov Model (은닉 마코프 모델을 이용한 골프 비디오의 시멘틱 이벤트 검출)

  • Kim Cheon Seog;Choo Jin Ho;Bae Tae Meon;Jin Sung Ho;Ro Yong Man
    • Journal of Korea Multimedia Society / v.7 no.11 / pp.1540-1549 / 2004
  • In this paper, we propose an algorithm to detect semantic events in golf video using a Hidden Markov Model (HMM). The purpose is to identify and classify golf events to facilitate highlight-based video indexing and summarization. We first define four semantic events and then design an HMM whose states correspond to the events. We also use 10 visual features based on MPEG-7 visual descriptors to estimate the HMM parameters for each event. Experimental results showed that the proposed algorithm provides reasonable detection performance for identifying a variety of golf events.
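Decoding the most likely event sequence from such an HMM is typically done with the Viterbi algorithm. The sketch below is generic, not the paper's system: observations are assumed to be pre-quantized feature symbols, and the two states and all probabilities are invented for illustration (the paper uses four events and 10 MPEG-7 visual features).

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a discrete-observation HMM."""
    n_states = len(start_p)
    T = len(obs)
    logp = np.zeros((T, n_states))         # best log-probability per state
    back = np.zeros((T, n_states), dtype=int)  # backpointers
    logp[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, T):
        for s in range(n_states):
            scores = logp[t - 1] + np.log(trans_p[:, s])
            back[t, s] = np.argmax(scores)
            logp[t, s] = scores[back[t, s]] + np.log(emit_p[s, obs[t]])
    path = [int(np.argmax(logp[-1]))]
    for t in range(T - 1, 0, -1):          # follow backpointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical states 0="drive", 1="putt"; observations are quantized symbols.
start = np.array([0.7, 0.3])
trans = np.array([[0.8, 0.2], [0.3, 0.7]])  # trans[from, to]
emit = np.array([[0.9, 0.1], [0.2, 0.8]])   # emit[state, symbol]
print(viterbi([0, 0, 1, 1], start, trans, emit))  # [0, 0, 1, 1]
```

Training the per-event parameters from labeled feature sequences (e.g. with Baum-Welch) is a separate step not shown here.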

Real-Time Video Indexing and Non-Linear Video Browsing for DTV Receivers (디지털 텔레비전 수신환경에서의 실시간 비디오 인덱싱과 비선형적 비디오 브라우징)

  • 윤경로;전성배
    • Journal of Broadcast Engineering / v.7 no.2 / pp.79-87 / 2002
  • The fast advances in digital video processing and multimedia technology over the last decade have enabled various non-linear video browsing techniques. Based on machine understanding of video content, non-linear video browsing interfaces such as key-frame-based content summarization have been introduced. Key-frame-based user interfaces such as a storyboard or a table of contents, however, are still very hard for conventional TV users to use, and are hard to implement unless the service provider supplies the additional information needed to construct them. In this paper, we propose non-linear video browsing techniques that overcome these drawbacks and are easy to use, together with the real-time video indexing technology needed to support them. The proposed structure-based skipping and skimming help users easily find interesting scenes and understand the content in a very short time.

Automatic Summary Method of Linguistic Educational Video Using Multiple Visual Features (다중 비주얼 특징을 이용한 어학 교육 비디오의 자동 요약 방법)

  • Han Hee-Jun;Kim Cheon-Seog;Choo Jin-Ho;Ro Yong-Man
    • Journal of Korea Multimedia Society / v.7 no.10 / pp.1452-1463 / 2004
  • The requirement for automatic video summarization is increasing as bi-directional broadcasting content grows and user requests and preferences in the bi-directional broadcast environment diversify. Automatic video summarization is also needed for the efficient management and use of the many contents held by service providers. In this paper, we propose a method to automatically generate a content-based summary of linguistic educational videos. First, shot boundaries and keyframes are generated from the linguistic educational video, and multiple low-level visual features are extracted. Next, the semantic parts (explanation part, dialog part, and text-based part) of the video are identified using the extracted visual features. Lastly, an XML document describing the summary information is produced based on the Hierarchical Summary architecture of the MPEG-7 MDS (Multimedia Description Scheme). Experimental results show that the proposed algorithm provides reasonable performance for the automatic summarization of linguistic educational videos, and we verified that the proposed method is useful for video summary systems providing various services as well as for the management of educational contents.

Improved Quality Keyframe Selection Method for HD Video

  • Yang, Hyeon Seok;Lee, Jong Min;Jeong, Woojin;Kim, Seung-Hee;Kim, Sun-Joong;Moon, Young Shik
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3074-3091 / 2019
  • With the widespread use of the Internet, services providing large-capacity multimedia data, such as video-on-demand (VOD) services and video-uploading sites, have greatly increased. VOD service providers want to provide users with high-quality keyframes of high-quality videos within a few minutes after a broadcast ends. However, existing keyframe extraction tends to select keyframes without sufficiently considering their quality, and it takes a long computation time because HD-class images are not taken into account. In this paper, we propose a keyframe selection method that flexibly applies multiple keyframe quality metrics and improves the computation time. The main procedure is as follows. After shot-boundary detection, the first frame of each shot is extracted as an initial keyframe. The user sets evaluation metrics and their priorities considering the genre and attributes of the video. According to these metrics and priorities, low-quality keyframes are selected as replacement targets and replaced with high-quality frames from the same shot. The proposed method was subjectively evaluated with 23 votes: approximately 45% of the replaced keyframes were improved and about 18% were adversely affected. It also took about 10 minutes to complete the summary of a one-hour video, a reduction of more than 44.5% in execution time.
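The replace-low-quality-keyframes step can be sketched with a single quality metric. The paper lets the user configure multiple metrics and priorities; here a variance-of-Laplacian sharpness score is an assumed stand-in for one such metric, and the function names and threshold are illustrative.

```python
import numpy as np

def sharpness(frame):
    """Variance of a simple Laplacian response on a grayscale frame;
    higher means sharper. A stand-in for a configurable quality metric."""
    f = frame.astype(float)
    lap = (-4 * f[1:-1, 1:-1] + f[:-2, 1:-1] + f[2:, 1:-1]
           + f[1:-1, :-2] + f[1:-1, 2:])
    return lap.var()

def select_keyframe(shot_frames, min_quality):
    """Keep the shot's first frame as the keyframe unless its quality is too
    low, in which case replace it with the best-scoring frame in the shot."""
    if sharpness(shot_frames[0]) >= min_quality:
        return 0
    return max(range(len(shot_frames)), key=lambda i: sharpness(shot_frames[i]))

# Blurry first frame (flat gray) vs. a sharp checkerboard later in the shot.
blurry = np.full((32, 32), 128, dtype=np.uint8)
sharp = (np.indices((32, 32)).sum(0) % 2 * 255).astype(np.uint8)
print(select_keyframe([blurry, sharp, blurry], min_quality=10.0))  # 1
```

Scoring only the frames of shots whose initial keyframe fails the check is what keeps the computation time down compared with scoring every frame.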