• Title/Summary/Keyword: Shot Detection


Design and Implementation of a News Archive System Using Shot Types (샷의 타입을 이용한 뉴스 아카이브 시스템의 설계 및 구현)

  • Han, Keun-Ju;Nang, Jong-Ho;Ha, Myung-Hwan;Jung, Byung-Hee;Kim, Kyeong-Soo
    • Journal of KIISE: Computing Practices and Letters / v.7 no.5 / pp.416-428 / 2001
  • To build a news archive system, the news video stream must first be segmented into articles, and the content of each article must be abstracted effectively. This abstraction helps users understand the content of an article without playing the whole video stream. This paper proposes a new article boundary detection scheme for news video streams, together with a new article abstraction scheme that uses the shot types of the news video data. The shots in news video are classified into anchor shots, interview shots, speech shots, reporting shots, graphic shots, and others. Since a news article starts with an anchor shot, whose duration is relatively long compared to other shots and whose screen has a characteristic structure, the proposed scheme detects article boundaries by computing the length of each shot and checking the screen structure. For effective abstraction of the article video, the proposed scheme primarily uses the graphic image located in the top-right of the anchor shot frames, since it is an abstraction of the article made by the news producer according to its contents and therefore carries a great deal of meaningful information. The key frames of the other shots, except interview and reporting shots, are also used to abstract the content of the articles. In experiments, the precision and recall of the proposed article boundary detection scheme were 92% and 96%, respectively. This paper also presents the design and implementation of a prototype news archive system on the WWW, consisting of an indexing tool, an authoring tool, a database for news meta-data, and a browsing tool.
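
The boundary test described above reduces to two per-shot checks: is the shot long enough, and does its key frame show the anchor screen layout? Below is a minimal Python sketch of that logic; the duration threshold and the top-right-graphic layout test are illustrative assumptions, not values taken from the paper.

    import numpy as np

    MIN_ANCHOR_SECONDS = 8.0  # assumed minimum anchor-shot duration (illustrative)

    def looks_like_anchor_frame(frame: np.ndarray) -> bool:
        # Illustrative screen-structure test, not the paper's: assume the
        # top-right quarter holds the article graphic, so its brightness
        # should differ strongly from the rest of the static studio frame.
        h, w = frame.shape[:2]
        top_right = frame[: h // 2, w // 2:]
        return abs(top_right.mean() - frame.mean()) > 20  # assumed threshold

    def article_boundaries(shots):
        """shots: iterable of (start_sec, end_sec, key_frame) in temporal order."""
        return [start for start, end, key in shots
                if end - start >= MIN_ANCHOR_SECONDS and looks_like_anchor_frame(key)]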


Automatic Indexing Algorithm of Golf Video Using Audio Information (오디오 정보를 이용한 골프 동영상 자동 색인 알고리즘)

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.441-446 / 2009
  • This paper proposes an automatic indexing algorithm for golf video using audio information. In the proposed algorithm, the input video stream is demultiplexed into video and audio streams. By means of an Adaboost-cascade classifier, the continuous audio stream is classified into announcer speech segments recorded in the studio, music segments accompanying players' names on the TV screen, audience reaction segments, reporter speech segments with field background, and field noise segments such as wind or waves. Golf swing sounds, including drive shots, iron shots, and putts, are detected by impulse onset detection and modulation spectrum verification. The detected swings and applause are used to index action and highlight units. Compared with video-based semantic analysis, the main advantage of the proposed system is its small computational requirement, which makes the technology easy to apply to embedded consumer electronic devices for fast browsing.
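
The swing-sound step can be approximated with a plain short-time-energy onset detector, sketched below in Python; the window size and energy-jump ratio are assumed values, and the paper's modulation-spectrum verification stage is omitted.

    import numpy as np

    def impulse_onsets(audio: np.ndarray, sr: int, win_ms: float = 10.0, ratio: float = 6.0):
        """Return onset times (s) where short-time energy jumps impulsively."""
        win = max(1, int(sr * win_ms / 1000.0))
        n = len(audio) // win
        energy = (audio[: n * win].reshape(n, win) ** 2).mean(axis=1)
        onsets = []
        for i in range(1, n):
            background = energy[max(0, i - 5): i].mean() + 1e-12  # local level
            if energy[i] / background > ratio:   # sudden impulsive jump
                onsets.append(i * win / sr)      # onset time in seconds
        return onsets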

MPEG Video Segmentation Using Frame Feature Comparison (프레임 특징 비교를 이용한 압축비디오 분할)

  • 김영호;강대성
    • Journal of the Institute of Convergence Signal Processing / v.4 no.2 / pp.25-30 / 2003
  • Recently, multimedia information such as text, audio, images, and video has come to occupy a large part of digital technology, and research on video indexing and retrieval is progressing actively. In this paper, we propose a new algorithm (Frame Feature Comparison) for MPEG video segmentation. Shot and scene change detection are basic and important tasks for segmenting an MPEG video sequence. Commonly used segmentation algorithms, which compare previous frames with the current frame, produce false detections under camera flashes, camera movement, and fast object movement. We therefore verify each scene change point detected by the conventional algorithm once more by comparing its mean value with those of adjacent frames. As a result, we could detect scene changes more accurately than the conventional algorithm.
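
The re-verification step the abstract describes, comparing a candidate's mean feature value with those of adjacent frames, can be sketched as follows; the window size k and the L1 threshold are assumptions for illustration.

    import numpy as np

    def verify_scene_changes(frame_feats, candidates, k=3, thresh=0.25):
        """Keep a candidate cut only if the mean feature of the k frames
        before it differs enough from the mean of the k frames after it,
        which suppresses flashes and brief fast motion.
        frame_feats: (num_frames, dim) array, e.g. per-frame color histograms."""
        feats = np.asarray(frame_feats, dtype=float)
        confirmed = []
        for c in candidates:
            if c < k or c + k > len(feats):
                continue  # skip candidates too close to the sequence ends
            before = feats[c - k: c].mean(axis=0)  # mean feature before the cut
            after = feats[c: c + k].mean(axis=0)   # mean feature after the cut
            if np.abs(before - after).sum() > thresh:  # L1 distance of means
                confirmed.append(c)
        return confirmed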


CREATING JOYFUL DIGESTS BY EXPLOITING SMILE/LAUGHTER FACIAL EXPRESSIONS PRESENT IN VIDEO

  • Kowalik, Uwe;Hidaka, Kota;Irie, Go;Kojima, Akira
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2009.01a / pp.267-272 / 2009
  • Video digests provide an effective way of checking video content rapidly due to their very compact form. By watching a digest, users can easily check whether specific content is worth seeing in full. The impression created by the digest greatly influences the user's choice when selecting video content. We propose a novel method of automatic digest creation that evokes a joyful impression by exploiting smile/laughter facial expressions as emotional cues of joy in the video. We assume that a digest presenting smiling/laughing faces appeals to the user, since he/she is assured that the smile/laughter expression is caused by joyful events inside the video. For detecting smiling/laughing faces, we have developed a neural-network-based method for classifying facial expressions. Video segmentation is performed by automatic shot detection. To create joyful digests, appropriate shots are automatically selected by shot ranking based on the smile/laughter detection result. We report the results of user trials conducted to assess the visual impression of 'joyful' digests created automatically by our system. The results show that users tend to prefer emotional digests containing laughing faces. This suggests that the attractiveness of automatically created video digests can be improved by extracting emotional cues from the content through automatic facial expression analysis, as proposed in this paper.
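
The shot-ranking step lends itself to a compact sketch: rank shots by a smile/laughter score and fill the digest up to a time budget. The greedy budget rule below is an assumption; the paper does not specify its selection policy at this level of detail.

    def joyful_digest(shots, smile_scores, budget_sec):
        """shots: list of (start_sec, end_sec); smile_scores: one score per
        shot, here assumed to come from some facial-expression classifier.
        Greedily picks the highest-scoring shots that fit the time budget."""
        ranked = sorted(zip(shots, smile_scores), key=lambda p: p[1], reverse=True)
        selected, used = [], 0.0
        for (start, end), score in ranked:
            if used + (end - start) <= budget_sec:
                selected.append((start, end))
                used += end - start
        return sorted(selected)  # play selected shots in original order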


Statistical Model for Emotional Video Shot Characterization (비디오 셧의 감정 관련 특징에 대한 통계적 모델링)

  • 박현재;강행봉
    • The Journal of Korean Institute of Communications and Information Sciences / v.28 no.12C / pp.1200-1208 / 2003
  • Affective computing plays an important role in intelligent Human-Computer Interaction (HCI). To detect emotional events, it is desirable to construct a computational model for extracting emotion-related features from video. In this paper, we propose a statistical model based on the probabilistic distribution of low-level features in video shots. The proposed method extracts low-level features from video shots and then fits a GMM (Gaussian Mixture Model) to them to detect emotional shots. As low-level features, we use color, camera motion, and the sequence of shot lengths. The features are modeled as a GMM using the EM (Expectation-Maximization) algorithm, and the relations between time and emotion are estimated by MLE (Maximum Likelihood Estimation). Finally, the two statistical models are combined in a Bayesian framework to detect emotional events in video.
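
A minimal sketch of the GMM-fitting idea, using scikit-learn's EM-based GaussianMixture: fit one mixture to features from emotional shots and one to background shots, then compare log-likelihoods. The two-model likelihood comparison and the synthetic stand-in features are assumptions for illustration; the paper combines its models in a Bayesian framework.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    emotional_feats = rng.normal(1.0, 0.5, size=(200, 3))   # stand-in training data
    background_feats = rng.normal(0.0, 1.0, size=(200, 3))  # stand-in training data

    # GaussianMixture.fit runs EM internally.
    gmm_emotional = GaussianMixture(n_components=2, random_state=0).fit(emotional_feats)
    gmm_background = GaussianMixture(n_components=2, random_state=0).fit(background_feats)

    def is_emotional(shot_feat):
        """Flag a shot whose log-likelihood is higher under the emotional model."""
        x = np.atleast_2d(shot_feat)
        return gmm_emotional.score_samples(x)[0] > gmm_background.score_samples(x)[0]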

Taking a Jump Motion Picture Automatically by using Accelerometer of Smart Phones (스마트폰 가속도계를 이용한 점프동작 자동인식 촬영)

  • Choi, Kyungyoon;Jun, Kyungkoo
    • Journal of KIISE / v.41 no.9 / pp.633-641 / 2014
  • This paper proposes algorithms to detect a jump motion and automatically take a picture when the jump reaches its top. Based on these algorithms, we build a jump-shot system using accelerometer-equipped smart phones. Since jump motion may vary depending on a person's physical condition, gender, and age, it is critical to identify common features that are independent of such differences. The detection algorithm also needs to work in real time because of the short duration of a jump. We propose two different algorithms that satisfy these requirements and develop the system as a smart phone application. Through a series of experiments, we show that the system successfully detects the jump motion and takes a picture when it reaches the top.
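
One plausible apex rule, not necessarily the paper's: during flight the accelerometer magnitude drops toward zero (free fall), so the apex can be placed at the midpoint of the free-fall interval. A batch-mode sketch follows; a real system would need a streaming version, and the threshold is assumed.

    import numpy as np

    def jump_apex_time(t, acc_mag, freefall_thresh=3.0):
        """t: timestamps (s); acc_mag: accelerometer magnitude |a| in m/s^2.
        Returns the estimated apex time, or None if no free-fall span is found."""
        t = np.asarray(t)
        acc_mag = np.asarray(acc_mag)
        idx = np.flatnonzero(acc_mag < freefall_thresh)  # samples in free fall
        if idx.size == 0:
            return None  # no jump detected
        return 0.5 * (t[idx[0]] + t[idx[-1]])  # midpoint of the free-fall span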

Resource-Efficient Object Detector for Low-Power Devices (저전력 장치를 위한 자원 효율적 객체 검출기)

  • Akshay Kumar Sharma;Kyung Ki Kim
    • Transactions on Semiconductor Engineering / v.2 no.1 / pp.17-20 / 2024
  • This paper presents a novel lightweight object detection model tailored for low-powered edge devices, addressing the limitations of traditional resource-intensive computer vision models. Our proposed detector, inspired by the Single Shot Detector (SSD), employs a compact yet robust network design. Crucially, it integrates an 'enhancer block' that significantly boosts its efficiency in detecting smaller objects. The model comprises two primary components: the Light_Block for efficient feature extraction using Depth-wise and Pointwise Convolution layers, and the Enhancer_Block for enhanced detection of tiny objects. Trained from scratch on the Udacity Annotated Dataset with image dimensions of 300x480, our model eschews the need for pre-trained classification weights. Weighing only 5.5MB with approximately 0.43M parameters, our detector achieved a mean average precision (mAP) of 27.7% and processed at 140 FPS, outperforming conventional models in both precision and efficiency. This research underscores the potential of lightweight designs in advancing object detection for edge devices without compromising accuracy.
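
The Light_Block's combination of depth-wise and point-wise convolutions is the standard depthwise-separable pattern; a guess at its flavor in PyTorch is below, with channel sizes and normalization choices as assumptions.

    import torch
    import torch.nn as nn

    class LightBlock(nn.Module):
        """Depthwise-separable convolution: a depth-wise 3x3 followed by a
        point-wise 1x1, which is what makes such detectors cheap to run."""
        def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                       padding=1, groups=in_ch, bias=False)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))

    # e.g. a 300x480 input as in the abstract:
    x = torch.randn(1, 3, 300, 480)
    print(LightBlock(3, 32, stride=2)(x).shape)  # torch.Size([1, 32, 150, 240])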

2D-to-3D Stereoscopic Conversion: Depth Estimation in Monoscopic Soccer Videos (단일 시점 축구 비디오의 3차원 영상 변환을 위한 깊이지도 생성 방법)

  • Ko, Jae-Seung;Kim, Young-Woo;Jung, Young-Ju;Kim, Chang-Ick
    • Journal of Broadcast Engineering / v.13 no.4 / pp.427-439 / 2008
  • This paper proposes a novel method to convert monoscopic soccer videos to stereoscopic videos. Through a soccer video analysis process, we detect shot boundaries and classify soccer frames into long shots and non-long shots. In the long shot case, the depth map is generated based on the size of the extracted ground region. In the non-long shot case, the shot is further partitioned into three types by considering the number of ground blocks and skin blocks, the latter obtained by a simple skin-color detection method. Three different depth assignment methods are then applied, one per non-long shot type: 1) depth estimation by object region extraction; 2) foreground estimation using the skin blocks, with depth values computed by a Gaussian function; and 3) depth map generation for shots not containing skin blocks. Depth assignment is followed by stereoscopic image generation. Subjective evaluation comparing the generated depth maps and the corresponding stereoscopic images indicates that the proposed algorithm can produce a sense of depth from single-view images.
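
The long/non-long decision could plausibly be driven by the fraction of grass-colored pixels standing in for the ground region; the crude green test and threshold below are assumptions for illustration, not the paper's method.

    import numpy as np

    def classify_soccer_shot(frame_rgb: np.ndarray, long_thresh: float = 0.5):
        """Call a frame a long shot when grass-colored pixels dominate."""
        r = frame_rgb[..., 0].astype(float)
        g = frame_rgb[..., 1].astype(float)
        b = frame_rgb[..., 2].astype(float)
        grass = (g > r + 10) & (g > b + 10)  # crude "green dominates" test
        ratio = grass.mean()                 # fraction of grass-like pixels
        return ("long" if ratio > long_thresh else "non-long"), ratio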

Sub-Frame Analysis-based Object Detection for Real-Time Video Surveillance

  • Jang, Bum-Suk;Lee, Sang-Hyun
    • International Journal of Internet, Broadcasting and Communication / v.11 no.4 / pp.76-85 / 2019
  • We introduce a vision-based object detection method for real-time video surveillance systems in low-end edge computing environments. Recently, the accuracy of object detection has improved thanks to deep learning approaches such as the Region-based Convolutional Neural Network (R-CNN), which uses two stages for inference. One-stage detection algorithms such as single-shot detection (SSD) and You Only Look Once (YOLO), developed at the expense of some accuracy, can be used for real-time systems. However, high-performance hardware such as General-Purpose computing on Graphics Processing Units (GPGPU) is still required to achieve excellent object detection performance and speed. To address this hardware requirement, which is burdensome in low-end edge computing environments, we propose a sub-frame analysis method for object detection. Specifically, we divide the whole image frame into smaller ones and then run inference on them with a Convolutional Neural Network (CNN) based detection network, which is much faster than a conventional network designed for full-frame images. With the proposed method, we reduce the computational requirement significantly without losing throughput or object detection accuracy.
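
The core of the sub-frame idea, splitting a frame into tiles, detecting per tile, and shifting boxes back to full-frame coordinates, can be sketched as below. The `detector` callable stands for any CNN detector returning (x, y, w, h, label) boxes; the 2x2 grid is an assumed default.

    def subframe_tiles(frame, rows=2, cols=2):
        """Yield (offset_y, offset_x, tile) over a rows x cols grid.
        Remainder pixels at the right/bottom edges are ignored in this sketch."""
        h, w = frame.shape[:2]
        th, tw = h // rows, w // cols
        for i in range(rows):
            for j in range(cols):
                yield i * th, j * tw, frame[i * th:(i + 1) * th, j * tw:(j + 1) * tw]

    def detect_by_subframes(frame, detector, rows=2, cols=2):
        """Run the detector on each tile and map boxes back to frame coordinates."""
        detections = []
        for off_y, off_x, tile in subframe_tiles(frame, rows, cols):
            for (x, y, w, h, label) in detector(tile):
                detections.append((x + off_x, y + off_y, w, h, label))
        return detections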

Generating Extreme Close-up Shot Dataset Based On ROI Detection For Classifying Shots Using Artificial Neural Network (인공신경망을 이용한 샷 사이즈 분류를 위한 ROI 탐지 기반의 익스트림 클로즈업 샷 데이터 셋 생성)

  • Kang, Dongwann;Lim, Yang-mi
    • Journal of Broadcast Engineering / v.24 no.6 / pp.983-991 / 2019
  • This study aims to analyze movies, which contain various stories, according to the size of their shots. To achieve this, a dataset must be classified by shot size: extreme close-up shots, close-up shots, medium shots, full shots, and long shots. However, since typical video storytelling is mainly composed of close-up, medium, full, and long shots, it is not easy to construct an appropriate dataset of extreme close-up shots. To solve this, we propose an image cropping method based on region of interest (ROI) detection. In this paper, we use face detection and saliency detection to estimate the ROI. By cropping the ROI of close-up images, we generate extreme close-up images. The dataset enriched by the proposed method is used to build a model for classifying shots by size. This study can help analyze the emotional changes of characters in video stories and predict how the composition of a story changes over time. If AI is used more actively in entertainment fields in the future, it is expected to affect the automatic adjustment and creation of characters, dialogue, and image editing.
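
The cropping step is straightforward once an ROI is available from a face or saliency detector: crop tightly around it so a close-up becomes an extreme close-up. The 10% margin below is an assumption.

    def crop_extreme_closeup(image, roi, margin=0.1):
        """image: HxWxC array; roi: (x, y, w, h) box from a face/saliency
        detector. Returns the ROI crop padded by a small relative margin,
        clamped to the image bounds."""
        ih, iw = image.shape[:2]
        x, y, w, h = roi
        mx, my = int(w * margin), int(h * margin)
        x0, y0 = max(0, x - mx), max(0, y - my)
        x1, y1 = min(iw, x + w + mx), min(ih, y + h + my)
        return image[y0:y1, x0:x1]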