• Title/Summary/Keyword: video fusion


A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

  • Ren, Qun
    • Journal of Information Processing Systems / v.17 no.3 / pp.556-570 / 2021
  • Existing video expression recognition methods mainly focus on extracting spatial features from video expression images but tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolution neural network method is proposed to effectively improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in the video, and two deep convolution neural networks are used to extract spatiotemporal expression features: a spatial convolution neural network extracts the spatial features of each static expression image, while a temporal convolution neural network extracts dynamic features from the optical flow of multiple expression images. The spatiotemporal features learned by the two networks are then fused by multiplication. Finally, the fused features are input into a support vector machine to perform facial expression classification. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, outperforming the comparison methods.
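
As a rough illustration of the fusion step described above, the following sketch fuses hypothetical pre-extracted spatial and temporal CNN features by element-wise multiplication and feeds the result to a support vector machine; the feature dimensions, data, and class count are invented for the example, not taken from the paper.

```python
# Minimal sketch of multiplicative spatiotemporal feature fusion followed by
# SVM classification. Feature shapes are illustrative; the paper's exact CNN
# architectures and preprocessing are not reproduced here.
import numpy as np
from sklearn.svm import SVC

def multiplicative_fusion(spatial_feats, temporal_feats):
    """Fuse per-clip spatial and temporal CNN features element-wise."""
    assert spatial_feats.shape == temporal_feats.shape
    return spatial_feats * temporal_feats  # Hadamard (element-wise) product

# Hypothetical pre-extracted features: one 512-d vector per video clip.
rng = np.random.default_rng(0)
spatial = rng.normal(size=(100, 512))   # from the spatial CNN (static frames)
temporal = rng.normal(size=(100, 512))  # from the temporal CNN (optical flow)
labels = rng.integers(0, 6, size=100)   # six basic expression classes

fused = multiplicative_fusion(spatial, temporal)
clf = SVC(kernel="rbf").fit(fused, labels)
print(clf.predict(fused[:5]))
```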

Video Highlight Prediction Using Multiple Time-Interval Information of Chat and Audio (채팅과 오디오의 다중 시구간 정보를 이용한 영상의 하이라이트 예측)

  • Kim, Eunyul;Lee, Gyemin
    • Journal of Broadcast Engineering / v.24 no.4 / pp.553-563 / 2019
  • As the number of videos uploaded to live streaming platforms rapidly increases, the demand for highlight videos that promote the viewer experience is growing. In this paper, we present novel methods for predicting highlights using the chat logs and audio data of a video. The proposed models employ bi-directional LSTMs to capture the contextual flow of a video, and we further propose using features computed over multiple time intervals to capture mid-to-long-term flows. The proposed methods are demonstrated on e-Sports and baseball videos collected from personal broadcasting platforms such as Twitch and Kakao TV. The results show that information from multiple time intervals is useful in predicting video highlights.
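
A minimal sketch of the kind of bi-directional LSTM scorer described above, assuming per-step chat/audio features that have already been aggregated over several window lengths; the layer sizes and the feature pipeline are assumptions, not the paper's configuration.

```python
# Illustrative bi-directional LSTM over chat/audio features aggregated at
# several time intervals; all dimensions here are invented for the example.
import torch
import torch.nn as nn

class HighlightBiLSTM(nn.Module):
    def __init__(self, feat_dim=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # per-step highlight score

    def forward(self, x):                     # x: (batch, time, feat_dim)
        out, _ = self.lstm(x)
        return torch.sigmoid(self.head(out)).squeeze(-1)

# Hypothetical input: chat-rate and audio-energy statistics computed over
# multiple window lengths (e.g. 5 s, 10 s, 30 s, 60 s) -> 8 features per step.
x = torch.randn(4, 120, 8)                    # 4 videos, 120 time steps
scores = HighlightBiLSTM()(x)                 # (4, 120) highlight probabilities
print(scores.shape)
```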

Adaptive Depth Fusion based on Reliability of Depth Cues for 2D-to-3D Video Conversion (2차원 동영상의 3차원 변환을 위한 깊이 단서의 신뢰성 기반 적응적 깊이 융합)

  • Han, Chan-Hee;Choi, Hae-Chul;Lee, Si-Woong
    • The Journal of the Korea Contents Association / v.12 no.12 / pp.1-13 / 2012
  • 3D video is regarded as next-generation content for numerous applications. 2D-to-3D video conversion technologies are strongly required to resolve the lack of 3D videos during the transition to a mature 3D video era. In 2D-to-3D conversion, after the depth image of each scene in the 2D video is estimated, stereoscopic video is synthesized using DIBR (Depth Image Based Rendering) technologies. This paper proposes a novel depth fusion algorithm that integrates the multiple depth cues contained in 2D video to generate stereoscopic video. For proper depth fusion, each cue is first tested for reliability in the current scene. Based on the results of these reliability tests, the scene is classified into one of four scene types, and scene-adaptive depth fusion combines the reliable depth cues to generate the final depth information. Simulation results show that each depth cue is utilized appropriately according to scene type and that the final depth is generated from the cues that best represent the current scene.
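
The reliability-gated fusion idea might look like the following sketch, in which the paper's reliability tests and four scene types are reduced to boolean flags and a weighted average; the actual tests and fusion rules are more elaborate, and the cue names below are illustrative.

```python
# Sketch of reliability-gated depth-cue fusion: unreliable cues are dropped,
# and the remaining depth maps are blended by a weighted average.
import numpy as np

def fuse_depth(cues, reliable, weights=None):
    """cues: list of HxW depth maps; reliable: list of bools from cue tests."""
    usable = [c for c, r in zip(cues, reliable) if r]
    if not usable:                      # no reliable cue: fall back to flat depth
        return np.full_like(cues[0], 0.5)
    w = np.asarray(weights[: len(usable)] if weights else [1.0] * len(usable),
                   dtype=float)
    return np.tensordot(w / w.sum(), np.stack(usable), axes=1)

h, w = 48, 64
motion_depth = np.random.rand(h, w)     # e.g. a depth-from-motion cue
geometry_depth = np.tile(np.linspace(1, 0, h)[:, None], (1, w))  # geometric cue
fused = fuse_depth([motion_depth, geometry_depth], reliable=[False, True])
print(fused.shape, fused.min(), fused.max())
```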

An Efficient Video Retrieval Algorithm Using Key Frame Matching for Video Content Management

  • Kim, Sang Hyun
    • International Journal of Contents / v.12 no.1 / pp.1-5 / 2016
  • Managing large video collections requires effective video indexing and retrieval. A large number of video indexing and retrieval algorithms have been presented for frame-wise user queries or video content queries, whereas relatively few video sequence matching algorithms have been proposed for video sequence queries. In this paper, we propose an efficient algorithm that extracts key frames using color histograms and matches video sequences using edge features. To match video sequences effectively with a low computational load, we use the key frames extracted by a cumulative measure and the distance between key frames, and compare two sets of key frames using the modified Hausdorff distance. Experimental results with real sequences show that the proposed video sequence matching algorithm using edge features yields higher accuracy and better performance than conventional methods such as histogram difference, Euclidean metric, Bhattacharyya distance, and directed divergence.
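
The modified Hausdorff distance used for comparing two key-frame sets can be sketched as follows, assuming each key frame has already been summarized by an edge-feature vector; the histogram-based key-frame extraction step is omitted, and the feature dimension is invented.

```python
# Sketch of key-frame set comparison with the modified Hausdorff distance (MHD):
# the average nearest-neighbor distance in each direction, then the maximum.
import numpy as np

def modified_hausdorff(A, B):
    """MHD between two sets of feature vectors A (m, d) and B (n, d)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise distances
    d_ab = d.min(axis=1).mean()   # average nearest-neighbor distance A -> B
    d_ba = d.min(axis=0).mean()   # and B -> A
    return max(d_ab, d_ba)

rng = np.random.default_rng(1)
query_keyframes = rng.random((5, 32))    # 5 key frames, 32-d edge features each
target_keyframes = rng.random((7, 32))
print(modified_hausdorff(query_keyframes, target_keyframes))
```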

POSE-VIEWPOINT ADAPTIVE OBJECT TRACKING VIA ONLINE LEARNING APPROACH

  • Mariappan, Vinayagam;Kim, Hyung-O;Lee, Minwoo;Cho, Juphil;Cha, Jaesang
    • International journal of advanced smart convergence / v.4 no.2 / pp.20-28 / 2015
  • In this paper, we propose an effective tracking algorithm whose appearance model is based on features extracted from video frames and adapts to posture variation and camera viewpoint changes by employing non-adaptive random projections that preserve the structure of the object's image feature space. Existing online tracking algorithms update their models with features from recent video frames, yet numerous issues remain to be addressed despite improvements in tracking. In particular, data-dependent adaptive appearance models often encounter drift because online algorithms do not receive the amount of data required for reliable online learning, which motivates the proposed data-independent appearance model.
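
A minimal sketch of the non-adaptive (data-independent) random projection underlying such an appearance model: the projection matrix is drawn once and never updated, so nearby appearances stay nearby after compression. The dimensions and data below are illustrative; the tracker built on top is not reproduced.

```python
# Non-adaptive random projection: a fixed Gaussian matrix compresses image
# patches while approximately preserving distances (Johnson-Lindenstrauss).
import numpy as np

rng = np.random.default_rng(42)
d_high, d_low = 4096, 128                 # e.g. 64x64 patch -> 128-d feature
R = rng.normal(0.0, 1.0 / np.sqrt(d_low), size=(d_low, d_high))  # drawn once

def project(patch):
    """Map a flattened image patch to the low-dimensional feature space."""
    return R @ patch.ravel()

patch_a = rng.random((64, 64))
patch_b = patch_a + 0.01 * rng.random((64, 64))   # slightly changed appearance
# Distance stays small after projection, so the appearance model is stable.
print(np.linalg.norm(project(patch_a) - project(patch_b)))
```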

Analysis of Components Performance for Programmable Video Decoder (프로그래머블 비디오 복호화기를 위한 구성요소의 성능 분석)

  • Kim, Jaehyun;Park, Gooman
    • Journal of Broadcast Engineering / v.24 no.1 / pp.182-185 / 2019
  • This paper analyzes the performance of the modules used in implementing a programmable multi-format video decoder. The proposed platform targets a high-end Full High Definition (FHD) video decoder. The multi-format video decoder consists of a reconfigurable processor, a dedicated bit-stream co-processor, a memory controller, a cache for motion compensation, and flexible hardware accelerators. The experiments establish a performance baseline for the modules of the proposed architecture, which operates at a 300 MHz clock and decodes HEVC bit-streams at FHD 30 frames per second.

A Study on the Remediation of Game Culture in Video Media (영상미디어의 게임문화 재매개 양상 연구)

  • Lee, Dong-Eun
    • Journal of Korea Game Society / v.21 no.6 / pp.97-110 / 2021
  • The purpose of this study is to analyze how video media such as movies and dramas, which aim at one-way, viewing-oriented content, remediate the attributes of games, and to examine the mutual influence and convergence of games and video media. Two main patterns of convergence with game culture were found: the first is passive fusion of the content aspects of games, and the second is active fusion of the system aspects of games. This research is significant as a basic study showing that games affect various video media and demonstrating that game media can function as a representative medium leading popular culture.

Hierarchical Modulation Scheme for 3D Stereoscopic Video Transmission Over Maritime Channel Environment (해양 채널 환경에서 3D 입체영상의 전송을 위한 계층변조 기법)

  • You, Dongho;Lee, Seong Ro;Kim, Dong Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.7 / pp.1405-1412 / 2015
  • Due to the rapid growth of broadcasting, communication, and video coding technologies, the demand for immersive media content based on 3D stereoscopic video is expected to increase steadily, and such content must ultimately be delivered to users in wireless channel environments such as vehicles, trains, and ships. In this paper, we therefore transmit 3D stereoscopic video over a maritime Rician channel, in which the direct wave is more dominant than the reflected wave. In addition, we provide unequal error protection (UEP) by applying hierarchical 4/16-QAM to the V+D (Video plus Depth) format, which can represent 3D stereoscopic video. We expect the proposed system to provide seamless broadcasting service for users in poor reception conditions.
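
Hierarchical 4/16-QAM can be sketched as follows: the two high-priority bits (carrying the video layer) select the quadrant, and the two low-priority bits (carrying the depth layer) select the point within it. The priority parameter alpha below is illustrative, not the paper's value.

```python
# Sketch of a hierarchical 4/16-QAM mapper. Increasing alpha widens the
# quadrant spacing, protecting the high-priority (video) bits at the cost of
# the low-priority (depth) bits; alpha = 2 recovers uniform 16-QAM.
import numpy as np

def hier_16qam(hp_bits, lp_bits, alpha=2.0):
    """Map one HP bit pair and one LP bit pair to a complex 16-QAM symbol."""
    # HP bits set the quadrant sign; LP bits set the offset inside it.
    i = (1 - 2 * hp_bits[0]) * (alpha + (1 - 2 * lp_bits[0]))
    q = (1 - 2 * hp_bits[1]) * (alpha + (1 - 2 * lp_bits[1]))
    return complex(i, q)

video_bits = np.array([0, 1])   # HP: video layer (must survive poor channels)
depth_bits = np.array([1, 0])   # LP: depth layer (degrades gracefully)
print(hier_16qam(video_bits, depth_bits))
```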

Programmable Multimedia Platform for Video Processing of UHD TV (UHD TV 영상신호처리를 위한 프로그래머블 멀티미디어 플랫폼)

  • Kim, Jaehyun;Park, Goo-man
    • Journal of Broadcast Engineering / v.20 no.5 / pp.774-777 / 2015
  • This paper introduces the world's first programmable video-processing platform for enhancing the video quality of 8K (7680x4320) UHD (Ultra High Definition) TV at up to 60 frames per second. To supply the required computing capacity and memory bandwidth, the platform implements several key features: a symmetric multi-cluster architecture for parallel data processing, a ring data path between the clusters for data pipelining, and hardware accelerators for filter operations. Based on an RP (Reconfigurable Processor), the platform runs video quality enhancement algorithms and effectively handles new UHD broadcasting standards and display panels.
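
As a toy illustration of the ring data path between clusters, the following sketch rotates tiles around a ring of placeholder pipeline stages so that every tile eventually passes through every stage; the real platform is hardware, and the cluster count and stage functions here are invented.

```python
# Toy model of a ring data path: cluster i applies its fixed pipeline stage to
# whichever tile currently sits at position i, then every tile shifts to the
# neighboring cluster. After N_CLUSTERS rotations each tile has run all stages.
from collections import deque

N_CLUSTERS = 4
stages = [lambda x, s=s: x + s for s in range(N_CLUSTERS)]  # placeholder filters

def ring_pipeline(tiles):
    ring = deque(tiles)
    for _ in range(N_CLUSTERS):
        ring = deque(stages[i](t) for i, t in enumerate(ring))
        ring.rotate(1)                 # pass each tile to its ring neighbor
    return list(ring)

print(ring_pipeline([10, 20, 30, 40]))  # every tile gains 0 + 1 + 2 + 3 = 6
```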

Probabilistic Background Subtraction in a Video-based Recognition System

  • Lee, Hee-Sung;Hong, Sung-Jun;Kim, Eun-Tai
    • KSII Transactions on Internet and Information Systems (TIIS) / v.5 no.4 / pp.782-804 / 2011
  • In video-based recognition systems, stationary cameras monitor an area of interest. These systems focus on segmenting the foreground in the video stream and recognizing the events occurring in that area. The usual approach to discriminating the foreground from the video sequence is background subtraction. This paper presents a novel background subtraction method based on a probabilistic approach: we represent the posterior probability of the foreground given the current image and all past images and derive a recursive update method. Furthermore, we present an efficient method for fusing color and edge information to overcome the difficulties of existing background subtraction methods that use color information alone. The suggested method is applied to synthetic data and real video streams, and its robust performance is demonstrated experimentally.
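
A simplified stand-in for the probabilistic model described above: a running per-pixel background estimate with color and edge likelihoods fused by product. The paper's full posterior derivation is not reproduced, and all parameter values below are illustrative.

```python
# Recursive per-pixel background model with a simple product fusion of color
# and edge evidence: low fused background likelihood flags foreground.
import numpy as np

class ProbBackground:
    def __init__(self, shape, lr=0.05, sigma=10.0):
        self.mean = np.zeros(shape)     # running background color estimate
        self.lr, self.sigma = lr, sigma

    def update(self, frame, edge_map, edge_bg, tau=0.5):
        # Likelihood of "background" from color and from edges, fused by product.
        p_color = np.exp(-((frame - self.mean) ** 2) / (2 * self.sigma ** 2))
        p_edge = np.exp(-np.abs(edge_map - edge_bg))
        fg_mask = (p_color * p_edge) < tau
        # Recursively update the background model on background pixels only.
        self.mean = np.where(fg_mask, self.mean,
                             (1 - self.lr) * self.mean + self.lr * frame)
        return fg_mask

bg = ProbBackground((24, 32))
frame = np.random.rand(24, 32) * 255
edges = np.random.rand(24, 32)
mask = bg.update(frame, edges, edge_bg=np.zeros((24, 32)))
print(mask.mean())  # fraction of pixels flagged as foreground
```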