• Title/Summary/Keyword: video fusion


A Study on a Convergence Video System through Floating Hologram (플로팅 홀로그램을 통한 융복합 영상시스템 연구)

  • Oh, Seung-Hwan
    • Journal of Digital Convergence / v.18 no.10 / pp.397-402 / 2020
  • Holograms can be categorized into analog and digital types, but expensive equipment and the difficulty of content production place clear limits on what ordinary people can realize. In addition, research is needed on hologram content with added interaction, moving beyond existing static formats such as endlessly repeated content or passive viewing of a specific video. This article therefore proposes a convergence video system, focusing on the floating hologram among similar (pseudo-) hologram techniques. Eight hologram interaction elements are defined: the height of the camera in three-dimensional space, the interval between 3D models, overlapped models, scale, animation, position, color, and 3D model change. Because the audience can control the floating hologram by themselves in real time, a methodology for producing popular, interactive hologram content is suggested that makes full use of the convergence video system and creates floating holograms easily without expensive hologram equipment. The developed image system should be deployed in actual exhibitions and complemented with feedback to develop a better hologram image system.
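
The eight interaction elements enumerated in the abstract map naturally onto a configuration object that an interactive floating-hologram player could expose to the audience. A minimal sketch in Python follows; the field names, types, and default values are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass
class HologramInteraction:
    """Hypothetical container for the eight interaction elements listed in
    the abstract; names and defaults are illustrative, not from the paper."""
    camera_height: float = 1.5          # height of the camera in 3D space
    model_interval: float = 0.5         # interval between 3D models
    overlap_enabled: bool = False       # overlapped model on/off
    scale: float = 1.0                  # uniform model scale
    animation: str = "idle"             # active animation clip
    position: tuple = (0.0, 0.0, 0.0)   # model position
    color: tuple = (255, 255, 255)      # model tint (RGB)
    model_name: str = "default"         # current 3D model selection
```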

Pattern-based Depth Map Generation for Low-complexity 2D-to-3D Video Conversion (저복잡도 2D-to-3D 비디오 변환을 위한 패턴기반의 깊이 생성 알고리즘)

  • Han, Chan-Hee;Kang, Hyun-Soo;Lee, Si-Woong
    • The Journal of the Korea Contents Association / v.15 no.2 / pp.31-39 / 2015
  • 2D-to-3D video conversion adds 3D effects to a 2D video by generating stereoscopic views from depth cues inherent in the 2D video. This technology is a good solution to the shortage of 3D content during the transition to a mature 3D video era. In this paper, a low-complexity depth generation method for 2D-to-3D video conversion is presented. For temporal consistency of the global depth, a pattern-based depth generation method is introduced. A low-complexity refinement algorithm for the local depth is also provided to improve 3D perception in object regions. Experimental results show that the proposed method outperforms conventional methods in terms of complexity and subjective quality.
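
The abstract describes two stages: a pattern-based global depth map chosen for temporal consistency, followed by stereoscopic view synthesis. A minimal sketch of that general idea is shown below; the single top-to-bottom gradient pattern and the simple pixel-shifting renderer are illustrative assumptions, since the paper's actual pattern set and local refinement step are not detailed in the abstract.

```python
import numpy as np

def global_depth_pattern(height, width):
    """Hypothetical global depth pattern: a top-to-bottom gradient
    (far at the top of the frame, near at the bottom)."""
    rows = np.linspace(0.0, 1.0, height, dtype=np.float32)
    return np.tile(rows[:, None], (1, width))

def render_stereo(frame, depth, max_disparity=8):
    """Shift pixels horizontally according to depth (very simple DIBR)."""
    h, w = depth.shape
    disparity = (depth * max_disparity).astype(np.int32)
    left = np.zeros_like(frame)
    right = np.zeros_like(frame)
    cols = np.arange(w)
    for y in range(h):
        l_cols = np.clip(cols + disparity[y], 0, w - 1)
        r_cols = np.clip(cols - disparity[y], 0, w - 1)
        left[y, l_cols] = frame[y]
        right[y, r_cols] = frame[y]
    return left, right
```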

Multimodal Biometrics Recognition from Facial Video with Missing Modalities Using Deep Learning

  • Maity, Sayan;Abdel-Mottaleb, Mohamed;Asfour, Shihab S.
    • Journal of Information Processing Systems / v.16 no.1 / pp.6-29 / 2020
  • Biometric identification using multiple modalities has attracted the attention of many researchers, as it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers that perform the multimodal recognition. Moreover, the proposed technique proves robust when some of the above modalities are missing during testing. The proposed system has three main components: detection, which consists of modality-specific detectors that automatically detect images of the different modalities present in the facial video clips; feature selection, which uses a supervised denoising sparse auto-encoder network to capture discriminative representations that are robust to illumination and pose variations; and classification, which consists of a set of modality-specific sparse representation classifiers for unimodal recognition, followed by score-level fusion of the recognition results of the available modalities. Experiments conducted on a constrained facial video dataset (WVU) and an unconstrained facial video dataset (HONDA/UCSD) yielded Rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips, even when modalities are missing.
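
The final stage of the pipeline is score-level fusion of whichever modalities the detectors actually found in a clip. A minimal sketch of that fusion step is given below; the min-max normalization, equal-weight averaging, and modality names are assumptions for illustration, and the sparse-representation classifiers themselves are not reproduced.

```python
import numpy as np

# Modality names follow the abstract; the per-modality classifiers are assumed
# to have already produced one match score per gallery subject.
MODALITIES = ["left_ear", "left_profile", "frontal_face", "right_profile", "right_ear"]

def min_max_normalize(scores):
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

def fuse_scores(scores_by_modality):
    """Score-level fusion over the modalities that are present.

    scores_by_modality: dict mapping modality name -> score array (one score
    per gallery subject); missing modalities are simply absent from the dict.
    Returns the index of the predicted subject.
    """
    available = [m for m in MODALITIES if m in scores_by_modality]
    if not available:
        raise ValueError("no modality detected in the video clip")
    fused = np.mean([min_max_normalize(scores_by_modality[m]) for m in available], axis=0)
    return int(np.argmax(fused))
```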

Grayscale Image Colorization Using a Convolutional Neural Network

  • Jwa, Minje;Kang, Myungjoo
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.25 no.2 / pp.26-38 / 2021
  • Image colorization refers to adding plausible colors to a grayscale image or video. It has been used in many modern applications, including restoring old photographs and reducing the time spent painting cartoons. In this paper, a method is proposed for colorizing grayscale images using a convolutional neural network. We propose an encoder-decoder model that adapts FusionNet to our purpose. A loss function better suited to colorization is defined in place of the MSE loss. The proposed model was verified on the ImageNet dataset. We quantitatively compared several colorization models with ours using the peak signal-to-noise ratio (PSNR) metric. In addition, to evaluate the results qualitatively, our model was applied to images in the test dataset and its outputs were compared with those of various other models. Finally, we applied our model to a selection of old black-and-white photographs.
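
The quantitative comparison in the abstract uses the peak signal-to-noise ratio. For reference, PSNR between a ground-truth color image and a colorized estimate can be computed as follows; this is the standard formula, not code from the paper.

```python
import numpy as np

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio between a ground-truth color image and a
    colorized estimate (both arrays of the same shape, 0..max_value range)."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```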

An integrated visual-inertial technique for structural displacement and velocity measurement

  • Chang, C.C.;Xiao, X.H.
    • Smart Structures and Systems / v.6 no.9 / pp.1025-1039 / 2010
  • Measuring the displacement response of civil structures is very important for assessing their performance, safety, and integrity. Recently, video-based techniques that utilize low-cost, high-resolution digital cameras have been developed for such applications. These techniques, however, have a relatively low sampling frequency, and the results are usually contaminated with noise. In this study, an integrated visual-inertial measurement method that combines a monocular videogrammetric displacement measurement technique with a collocated accelerometer is proposed for measuring the displacement and velocity of civil engineering structures. The monocular videogrammetric technique extracts the three-dimensional translation and rotation of a planar target from an image sequence recorded by one camera. The obtained displacement is then fused with the acceleration measured by the collocated accelerometer using a multi-rate Kalman filter with a smoothing technique. This data fusion not only improves the accuracy and frequency bandwidth of the displacement measurement but also provides an estimate of velocity. The proposed measurement technique is illustrated with a shake-table test and a pedestrian-bridge test. Results show that fusing displacement and acceleration mitigates their respective limitations and produces more accurate displacement and velocity responses over a broader frequency bandwidth.
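
The core of the method is the fusion of low-rate, vision-based displacement with high-rate acceleration in a multi-rate Kalman filter. A minimal sketch of that fusion (prediction driven by acceleration, correction whenever a displacement sample arrives) is given below; the state model, noise parameters, and the omission of the smoothing pass are simplifying assumptions, not the paper's exact formulation.

```python
import numpy as np

def fuse_displacement_acceleration(disp, disp_dt, accel, accel_dt,
                                   q=1e-2, r_disp=1e-6):
    """Multi-rate Kalman filter sketch with state x = [displacement, velocity].
    High-rate acceleration drives the prediction; the low-rate vision-based
    displacement corrects the state when a sample is available."""
    ratio = int(round(disp_dt / accel_dt))   # acceleration samples per displacement sample
    x = np.zeros(2)
    P = np.eye(2)
    F = np.array([[1.0, accel_dt], [0.0, 1.0]])
    B = np.array([0.5 * accel_dt**2, accel_dt])
    Q = q * np.array([[accel_dt**4 / 4, accel_dt**3 / 2],
                      [accel_dt**3 / 2, accel_dt**2]])
    H = np.array([[1.0, 0.0]])
    est_disp, est_vel = [], []
    for k, a in enumerate(accel):
        # Predict with the measured acceleration as the control input.
        x = F @ x + B * a
        P = F @ P @ F.T + Q
        # Correct with a displacement sample when one is available.
        if k % ratio == 0 and k // ratio < len(disp):
            y = disp[k // ratio] - H @ x
            S = H @ P @ H.T + r_disp
            K = (P @ H.T) / S
            x = x + (K * y).ravel()
            P = (np.eye(2) - K @ H) @ P
        est_disp.append(x[0])
        est_vel.append(x[1])
    return np.array(est_disp), np.array(est_vel)
```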

An Identification Method of Detrimental Video Images Using Color Space Features (컬러공간 특성을 이용한 유해 동영상 식별방법에 관한 연구)

  • Kim, Soung-Gyun;Kim, Chang-Geun;Jeong, Dae-Yul
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.6 / pp.2807-2814 / 2011
  • This paper proposes an identification algorithm that detects detrimental digital video content based on color-space features. A discrimination algorithm based on two-dimensional projection maps is suggested to find target video images. First, two-dimensional projection maps, which extract the color characteristics of the video frames, are applied to efficiently extract detrimental candidate frames from the videos; the similarity between the extracted frames and reference images is then estimated using the suggested algorithm. The detrimental candidate frames are selected from the similarity evaluation using a threshold value. In our experiments, the color-histogram and two-dimensional projection-map techniques for detecting detrimental candidate frames are compared. Across the various experimental data used to test the suggested method and similarity algorithm, detection based on two-dimensional projection maps shows better performance than the color-histogram technique in both computation speed and identification ability when searching target video images.
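
A minimal sketch of the candidate-frame test described above follows. Because the abstract does not specify how the two-dimensional projection maps are built, the sketch assumes a normalized 2D chrominance histogram as the projection, histogram-intersection similarity, and an illustrative threshold.

```python
import numpy as np

def projection_map(frame_yuv, bins=32):
    """Hypothetical 2D projection map: a normalized 2D histogram over the
    chrominance plane (U, V), discarding luminance."""
    u = frame_yuv[..., 1].ravel()
    v = frame_yuv[..., 2].ravel()
    hist, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 256], [0, 256]])
    return hist / hist.sum()

def similarity(map_a, map_b):
    """Histogram intersection: 1.0 means identical color distributions."""
    return float(np.minimum(map_a, map_b).sum())

def is_candidate(frame_yuv, reference_maps, threshold=0.6):
    """Flag a frame as a detrimental candidate if it is sufficiently similar
    to any reference image (the threshold value is illustrative)."""
    m = projection_map(frame_yuv)
    return any(similarity(m, ref) >= threshold for ref in reference_maps)
```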

A study on F8L10D-N LoRa RF Module for Drone Based live Broadcasting system

  • Mfitumukiza, Joseph;Mariappan, Vinayagam;Lee, Minwoo;Cho, Juphil;Cha, Jaesang
    • International Journal of Advanced Culture Technology / v.4 no.4 / pp.1-5 / 2016
  • In this paper, we present a study of a proposed design for real-time transmission of video from a drone to a broadcasting station (OB van) using the F8L10D-N LoRa module. LoRa technology has proven to be a basis for low-cost, long-range machine-to-machine connectivity. In the field of broadcasting and communication systems in particular, the F8L10D-N LoRa RF module uses spread-spectrum technology with a long transmission distance and a penetration capability roughly twice as strong as that of traditional FSK and PSK modulation schemes.

IP Studio Infrastructure intended for Modern Production and TV broadcasting Facilities

  • Mfitumukiza, Joseph;Mariappan, Vinayagam;Lee, Minwoo;Lee, Seungyoun;Lee, Junghoon;Lee, Juyoung;Lim, Yunsik;Cha, Jaesang
    • International Journal of Advanced Smart Convergence / v.5 no.3 / pp.61-65 / 2016
  • In the TV broadcasting and movie production business, the transport of video between creators (programmers, studios) and distributors (broadcast and cable networks, cable and satellite companies) is still a mix of File Transfer Protocol (FTP), physical delivery, and expensive multicast satellite. Cloud-based file sync-and-share providers such as Dropbox and Box are playing an increasing role, but the industry's unique demands for speed and multicasting have fueled the growth of IP video transport. This paper gives a solid grasp of the major elements of IP video technology, including content preparation, system architecture alternatives, and network performance management.

Visual Fatigue Prediction for Stereoscopic Video Considering Individual Fusional Characteristics (시청자의 입체시 특성을 고려한 3D 비디오의 피로도 예측)

  • Kim, Dong-Hyun;Choi, Sung-Hwan;Sohn, Kwang-Hoon
    • Journal of Broadcast Engineering / v.16 no.2 / pp.331-338 / 2011
  • In this paper, we propose a visual fatigue prediction metric for stereoscopic video that considers individual fusional characteristics. It predicts the level of visual fatigue by examining the disparity and motion characteristics of 3D videos. In addition, we classified viewers into two groups according to the fusional limit and the slope of the fusional response, both acquired from a random-dot stereogram test. Pearson's and Spearman's correlation coefficients between the proposed metric and the subjective results were then measured, reaching 80% and 79%, respectively.
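
Validation in the abstract is a correlation analysis between the predicted fatigue metric and subjective ratings. A minimal sketch of that evaluation step is given below, using the standard Pearson and Spearman correlations from SciPy; the function and variable names are illustrative, and any data passed in would be hypothetical.

```python
from scipy.stats import pearsonr, spearmanr

def validate_metric(predicted_fatigue, subjective_scores):
    """Correlate predicted fatigue values with subjective fatigue ratings,
    returning (Pearson r, Spearman rho) as in the abstract's evaluation."""
    pearson_r, _ = pearsonr(predicted_fatigue, subjective_scores)
    spearman_rho, _ = spearmanr(predicted_fatigue, subjective_scores)
    return pearson_r, spearman_rho
```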

Video Representation via Fusion of Static and Motion Features Applied to Human Activity Recognition

  • Arif, Sheeraz;Wang, Jing;Fei, Zesong;Hussain, Fida
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.7 / pp.3599-3619 / 2019
  • In human activity recognition, both static and motion information play a crucial role in achieving efficient and competitive results. Most existing methods extract video features insufficiently and cannot assess the contribution of each (static and motion) component. Our work highlights this problem and proposes a static-motion fused feature descriptor (SMFD), which intelligently leverages both static and motion features in a single descriptor. First, static features are learned by a two-stream 3D convolutional neural network. Second, trajectories are extracted by tracking key points, and only those trajectories located in the central region of the original video frame are retained, reducing irrelevant background trajectories as well as computational complexity. Shape and motion descriptors are then obtained along with the key points using SIFT flow. Next, a Cholesky transformation is introduced to fuse the static and motion feature vectors and guarantee an equal contribution from all descriptors. Finally, a Long Short-Term Memory (LSTM) network is used to capture long-term temporal dependencies and produce the final prediction. To confirm the effectiveness of the proposed approach, extensive experiments were conducted on three well-known datasets, i.e., UCF101, HMDB51, and YouTube. The findings show that the resulting recognition system is on par with state-of-the-art methods.
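
The fusion step described above combines the static and motion feature vectors through a Cholesky transformation so that both contribute in a controlled proportion. A minimal sketch of one plausible reading of that step is given below; the 2x2 correlation matrix, the mixing coefficient rho, and the use of the second row of its Cholesky factor are assumptions, since the paper's exact formulation is not given in the abstract.

```python
import numpy as np

def cholesky_fuse(static_feat, motion_feat, rho=0.5):
    """Hedged sketch of Cholesky-based fusion of two feature vectors.

    A 2x2 correlation matrix with coefficient rho is factorized as L @ L.T,
    and the second row of L weights the two sources so that
    fused = rho * static + sqrt(1 - rho**2) * motion. rho=0.5 is illustrative.
    """
    static_feat = np.asarray(static_feat, dtype=np.float64)
    motion_feat = np.asarray(motion_feat, dtype=np.float64)
    corr = np.array([[1.0, rho], [rho, 1.0]])
    L = np.linalg.cholesky(corr)          # lower-triangular factor
    fused = L[1, 0] * static_feat + L[1, 1] * motion_feat
    return fused
```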