• Title/Summary/Keyword: Keyframes


Improved Quality Keyframe Selection Method for HD Video

  • Yang, Hyeon Seok;Lee, Jong Min;Jeong, Woojin;Kim, Seung-Hee;Kim, Sun-Joong;Moon, Young Shik
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.6 / pp.3074-3091 / 2019
  • With the widespread use of the Internet, services that provide large-capacity multimedia data, such as video-on-demand (VOD) services and video uploading sites, have greatly increased. VOD service providers want to provide users with high-quality keyframes of high-quality videos within a few minutes after a broadcast ends. However, existing keyframe extraction methods tend to select keyframes without sufficiently considering their quality as keyframes, and they take a long time to compute because they are not designed for HD-class images. In this paper, we propose a keyframe selection method that flexibly applies multiple keyframe quality metrics and reduces the computation time. The main procedure is as follows. After shot boundary detection is performed, the first frame of each shot is extracted as an initial keyframe. The user sets evaluation metrics and priorities by considering the genre and attributes of the video. According to the evaluation metrics and priorities, low-quality keyframes are selected as replacement targets. Each replacement target is replaced with a high-quality frame from the same shot. The proposed method was subjectively evaluated by 23 votes. Approximately 45% of the replaced keyframes were improved and about 18% were adversely affected. Also, it took about 10 minutes to summarize a one-hour video, a reduction of more than 44.5% in execution time.
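The replacement procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `sharpness` metric, the `threshold` value, and the use of a single metric instead of several prioritized ones are all assumptions made for the example. Frames are plain 2D lists of grayscale values.

```python
from statistics import pvariance

def sharpness(frame):
    """Hypothetical quality metric: variance of horizontal pixel differences."""
    diffs = [row[i + 1] - row[i] for row in frame for i in range(len(row) - 1)]
    return pvariance(diffs) if len(diffs) > 1 else 0.0

def select_keyframes(shots, metric=sharpness, threshold=1.0):
    """Return one keyframe index per shot.

    The first frame of each shot is the initial keyframe. If its score is
    below `threshold` (an assumed cutoff), it is replaced by the
    highest-scoring frame of the same shot.
    """
    keyframes = []
    for shot in shots:
        best_index = 0
        if metric(shot[0]) < threshold:
            best_index = max(range(len(shot)), key=lambda i: metric(shot[i]))
        keyframes.append(best_index)
    return keyframes

# Two tiny shots: the first starts with a flat (low-quality) frame.
flat = [[10, 10, 10], [10, 10, 10]]
sharp = [[0, 255, 0], [255, 0, 255]]
print(select_keyframes([[flat, sharp], [sharp, flat]]))
```

In the first shot the flat initial keyframe is replaced by the second frame; in the second shot the initial keyframe already passes the threshold and is kept.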

Structural similarity based efficient keyframes extraction from multi-view videos (구조적인 유사성에 기반한 다중 뷰 비디오의 효율적인 키프레임 추출)

  • Hussain, Tanveer;Khan, Salman;Muhammad, Khan;Lee, Mi Young;Baik, Sung Wook
    • The Journal of Korean Institute of Next Generation Computing / v.14 no.6 / pp.7-14 / 2018
  • Salient information extraction from multi-view videos is very challenging because of inter-view and intra-view correlations and computational complexity. Several techniques have been developed for keyframe extraction from multi-view videos, but they have very high computational complexity. In this paper, we present a keyframe extraction approach for multi-view videos that uses the entropy and complexity information present in each frame. In the first step, we extract representative shots of the whole video from each view based on the difference in structural similarity index measure (SSIM) values between frames. In the second step, entropy and complexity scores are computed for all frames of the shots in the different views. Finally, the frames with the highest entropy and complexity scores are selected as keyframes. The proposed system is subjectively evaluated on the available office benchmark dataset, and the results are favorable in terms of accuracy and time complexity.
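The entropy-scoring step can be illustrated with a short sketch. It assumes grayscale frames given as 2D lists and uses Shannon entropy of the intensity histogram only; the SSIM-based shot extraction and the abstract's separate "complexity" score are not defined in the abstract and are omitted here. The function names are hypothetical.

```python
import math
from collections import Counter

def entropy(frame):
    """Shannon entropy (in bits) of the frame's intensity histogram."""
    pixels = [p for row in frame for p in row]
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def pick_keyframe(shot):
    """Index of the frame with the highest entropy within one shot."""
    return max(range(len(shot)), key=lambda i: entropy(shot[i]))

uniform = [[5, 5], [5, 5]]   # one intensity value: no information
varied = [[0, 1], [2, 3]]    # four equally likely values
print(entropy(varied), pick_keyframe([uniform, varied]))
```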

Gesture-Based Emotion Recognition by 3D-CNN and LSTM with Keyframes Selection

  • Ly, Son Thai;Lee, Guee-Sang;Kim, Soo-Hyung;Yang, Hyung-Jeong
    • International Journal of Contents / v.15 no.4 / pp.59-64 / 2019
  • In recent years, emotion recognition has been an interesting and challenging topic. Compared to facial expressions and speech, gesture-based emotion recognition has received little attention, with only a few efforts using traditional hand-crafted methods. These approaches incur major computational costs and offer few opportunities for improvement, as most of the research community now works with deep learning techniques. In this paper, we propose an end-to-end deep learning approach for classifying emotions based on bodily gestures. In particular, informative keyframes are first extracted from raw videos as input for the 3D-CNN deep network. The 3D-CNN exploits the short-term spatiotemporal information of gesture features from the selected keyframes, and the convolutional LSTM networks learn long-term features from the output features of the 3D-CNN. The experimental results on the FABO dataset exceed those of most traditional methods and achieve state-of-the-art results among deep learning-based techniques for gesture-based emotion recognition.

Enhancing Motion Capture Data (모션 캡쳐 데이터 향상 기법)

  • 최광진
    • Proceedings of the Korea Society for Simulation Conference / 1998.10a / pp.120-123 / 1998
  • In animating an articulated entity with motion capture data, especially when the reconstruction is based on forward kinematics, there can be large discrepancies at the end effector. Small errors in joint angles tend to be amplified as the forward kinematics positioning progresses toward the end effector. In this paper, we present an algorithm that enhances motion capture data to reduce positional errors at the end effector. The process is optimized so that the characteristics of the original joint angle data are preserved in the resulting motion. The frames at which the end-effector position needs to be accurate are designated as "keyframes" (e.g., starting and ending frames). In the algorithm, corrections by inverse kinematics are performed at sparse keyframes, and they are interpolated with a cubic spline that produces a curve best approximating the measured joint angles. The experiment shows that our algorithm is a valuable tool for improving measured motion, especially when the end-effector trajectory has a specific goal.
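The error amplification mentioned in this abstract can be seen in a small forward kinematics example. The sketch assumes a planar chain with revolute joints and relative joint angles; it illustrates the problem only, not the paper's inverse kinematics correction or spline interpolation. The function names are hypothetical.

```python
import math

def end_effector(angles, lengths):
    """Forward kinematics of a planar chain; angles are relative, in radians."""
    x = y = 0.0
    heading = 0.0
    for angle, length in zip(angles, lengths):
        heading += angle
        x += length * math.cos(heading)
        y += length * math.sin(heading)
    return x, y

def position_error(angle_error, lengths):
    """End-effector displacement caused by an error in the root joint only."""
    n = len(lengths)
    exact = end_effector([0.0] * n, lengths)
    noisy = end_effector([angle_error] + [0.0] * (n - 1), lengths)
    return math.dist(exact, noisy)

# The same 0.01 rad error at the root moves the tip further on a longer chain.
print(position_error(0.01, [1.0]), position_error(0.01, [1.0, 1.0, 1.0]))
```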


Video Copy Detection Algorithm Against Online Piracy of DTV Broadcast Program (DTV 방송프로그램의 온라인 불법전송 차단을 위한 비디오 복사본 검출 알고리즘)

  • Kim, Joo-Sub;Nam, Je-Ho
    • Journal of Broadcast Engineering / v.13 no.5 / pp.662-676 / 2008
  • This paper presents a video copy detection algorithm that blocks the online transfer of illegally copied DTV broadcast programs. In particular, the proposed algorithm establishes a set of keyframes by detecting abrupt changes in luminance and then exploits the spatio-temporal features of those keyframes. By comparing them with preregistered features stored in a database of DTV broadcast programs, the proposed scheme filters videos to determine whether an uploaded video is an illegal copy. Note that we analyze only a set of keyframes instead of every frame of the video. Thus, it is highly efficient at identifying illegally copied videos when dealing with a vast number of broadcast programs. We also confirm that the proposed technique is robust to a variety of video editing effects that are often applied in online video redistribution, such as aspect-ratio change, logo insertion, caption insertion, visual quality degradation, and resolution change (downscaling). In addition, we perform a benchmark test in which the proposed scheme outperforms previous techniques.
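Keyframe detection from abrupt luminance changes can be sketched as below. This is an assumption-laden illustration: the abstract does not say how luminance change is measured, so the sketch uses the difference in mean luminance between consecutive frames and a hypothetical `threshold` of 30.

```python
def mean_luminance(frame):
    """Average luma value of a frame given as a 2D list."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)

def luminance_keyframes(frames, threshold=30.0):
    """Indices of frames whose mean luminance jumps by more than `threshold`
    relative to the previous frame."""
    keyframes = []
    for index in range(1, len(frames)):
        change = abs(mean_luminance(frames[index]) - mean_luminance(frames[index - 1]))
        if change > threshold:
            keyframes.append(index)
    return keyframes

frames = [[[10, 10]], [[12, 12]], [[100, 100]], [[101, 101]], [[20, 20]]]
print(luminance_keyframes(frames))
```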

A Research on the Teaser Video Production Method by Keyframe Extraction Based on YCbCr Color Model (YCbCr 컬러모델 기반의 키프레임 추출을 통한 티저 영상 제작 방법에 대한 연구)

  • Lee, Seo-young;Park, Hyo-Gyeong;Young, Sung-Jung;You, Yeon-Hwi;Moon, Il-Young
    • Journal of Practical Engineering Education / v.14 no.2 / pp.439-445 / 2022
  • Due to the development of online media platforms and the COVID-19 pandemic, the mass production and consumption of digital video content are rapidly increasing. To choose digital video content, users get a quick impression of it through thumbnails and teaser videos, and then select and watch the content that suits them. It is very inconvenient to check all the digital video content produced around the world one by one and to manually edit teaser videos for users to choose from. In this paper, keyframes are extracted based on the YCbCr color model to automatically generate teaser videos, and the extracted keyframes are optimized through clustering. Finally, we present a method of producing a teaser video that helps users check digital video content by connecting the finally extracted keyframes.
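For reference, a common RGB to YCbCr conversion is sketched below. The abstract does not state which YCbCr variant the authors use; the coefficients here are the full-range BT.601 values used by JPEG/JFIF, chosen as an assumption. Results are returned as unrounded floats, so rounding and clamping to 0..255 are left to the caller.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF) conversion of 8-bit RGB to YCbCr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# Gray levels carry no chroma: Cb and Cr stay at the neutral value 128.
print(rgb_to_ycbcr(0, 0, 0))
print(rgb_to_ycbcr(255, 255, 255))
```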

Automatic Generation of Video Metadata for the Super-personalized Recommendation of Media

  • Yong, Sung Jung;Park, Hyo Gyeong;You, Yeon Hwi;Moon, Il-Young
    • Journal of information and communication convergence engineering / v.20 no.4 / pp.288-294 / 2022
  • The media content market has been growing, as various types of content are being mass-produced owing to the recent proliferation of the Internet and digital media. In addition, platforms that provide personalized services for content consumption are emerging and competing with each other to recommend personalized content. Existing platforms use a method in which a user directly inputs video metadata. Consequently, significant amounts of time and cost are consumed in processing large amounts of data. In this study, keyframes and audio spectra based on the YCbCr color model of a movie trailer were extracted for the automatic generation of metadata. The extracted audio spectra and image keyframes were used as learning data for genre recognition in deep learning. Deep learning was implemented to determine genres among the video metadata, and suggestions for utilization were proposed. A system that can automatically generate metadata established through the results of this study will be helpful for studying recommendation systems for media super-personalization.

Movement Detection Using Keyframes in Video Surveillance System

  • Kim, Kyutae;Jia, Qiong;Dong, Tianyu;Jang, Euee S.
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2022.06a / pp.1249-1252 / 2022
  • In this paper, we propose a conceptual framework that identifies video frames containing the movement of people and vehicles in traffic videos. The automatic selection of video frames with motion is an important topic in security and surveillance video because the number of videos to be monitored simultaneously is too large for the limited human resources available. The conventional method of identifying areas in motion is to compute the differences between consecutive video frames, which is costly because of its high computational complexity. In this paper, we reduce the overall complexity by examining only the keyframes (or I-frames). The basic assumption is that the time period between I-frames (e.g., 1/10 to 3 seconds) is shorter than the usual duration of object motion in video (e.g., a pedestrian walking or an automobile passing). The proposed method estimates the possibility that a video contains motion between I-frames by evaluating the difference between consecutive I-frames against the long-term statistics of the previously decoded I-frames of the same video. The experimental results show that the proposed method achieves more than 80% accuracy on short surveillance videos obtained from different locations while keeping the computational complexity as low as 20% of that of the HM decoder.
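One way to picture the I-frame comparison is the sketch below. It is an illustration under stated assumptions, not the authors' method: the "long-term statistics" are reduced to a running mean of earlier I-frame differences, and `factor` and `floor` are hypothetical tuning parameters. Frames are decoded I-frames given as 2D lists of grayscale values.

```python
def frame_difference(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    total = sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    count = sum(len(row) for row in a)
    return total / count

def detect_motion(iframes, factor=2.0, floor=1.0):
    """One flag per pair of consecutive I-frames: True if motion is likely.

    A pair is flagged when its difference exceeds both `floor` and `factor`
    times the mean of all earlier differences (the assumed baseline).
    """
    flags = []
    history = []
    for previous, current in zip(iframes, iframes[1:]):
        diff = frame_difference(previous, current)
        baseline = sum(history) / len(history) if history else diff
        flags.append(diff > max(factor * baseline, floor))
        history.append(diff)
    return flags

static_a, static_b, moving = [[10, 10]], [[11, 10]], [[60, 60]]
print(detect_motion([static_a, static_b, static_a, moving]))
```

Only the last interval is flagged, because sensor-level noise between the first frames stays below both the floor and the running baseline.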


Automatic Poster Generation System Using Protagonist Face Analysis

  • Yeonhwi You;Sungjung Yong;Hyogyeong Park;Seoyoung Lee;Il-Young Moon
    • Journal of information and communication convergence engineering / v.21 no.4 / pp.287-293 / 2023
  • With the rapid development of domestic and international over-the-top markets, a large amount of video content is being created. As the volume of video content increases, consumers tend to increasingly check data concerning the videos before watching them. To address this demand, video summaries in the form of plot descriptions, thumbnails, posters, and other formats are provided to consumers. This study proposes an approach that automatically generates posters to effectively convey video content while reducing the cost of video summarization. In the automatic generation of posters, face recognition and clustering are used to gather and classify character data, and keyframes from the video are extracted to learn the overall atmosphere of the video. This study used the facial data of the characters and keyframes as training data and employed technologies such as DreamBooth, a text-to-image generation model, to automatically generate video posters. This process significantly reduces the time and cost of video-poster production.

Creation of Soccer Video Highlights Using Caption Information (자막 정보를 이용한 축구 비디오 하이라이트 생성)

  • Shin Seong-Yoon;Kang Il-Ko;Rhee Yang-Won
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.65-76 / 2005
  • Digital video is very long data that requires large-capacity storage space. As such, before watching a long original video, viewers want to watch a summarized version of it. In the field of sports, in particular, highlights videos are frequently watched. In short, a highlights video allows a viewer to determine whether the full video is worth watching. This paper proposes a scheme for creating soccer video highlights using the structural features of captions in terms of time and space. These structural features are used to extract caption frame intervals and caption keyframes. A highlights video is created by resetting shots for caption keyframes, by means of logical indexing, and by applying rules for creating highlights. Finally, highlights videos and video segments can be searched and browsed in a way that allows viewers to select their desired items from the browser.
