• Title/Summary/Keyword: VIDEO ENCODER

Multimodal Biometrics Recognition from Facial Video with Missing Modalities Using Deep Learning

  • Maity, Sayan;Abdel-Mottaleb, Mohamed;Asfour, Shihab S.
    • Journal of Information Processing Systems / v.16 no.1 / pp.6-29 / 2020
  • Biometric identification using multiple modalities has attracted the attention of many researchers because it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers that perform the multimodal recognition. Moreover, the proposed technique has proven robust when some of the above modalities were missing during testing. The proposed system has three main components: detection, which consists of modality-specific detectors that automatically detect images of the different modalities present in facial video clips; feature selection, which uses a network of supervised denoising sparse auto-encoders to capture discriminative representations that are robust to illumination and pose variations; and classification, which consists of a set of modality-specific sparse representation classifiers for unimodal recognition, followed by score-level fusion of the recognition results of the available modalities. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) yielded Rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips, even when modalities are missing.
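As a rough illustration of score-level fusion over only the available modalities, the sketch below averages per-identity scores and skips modalities that were not detected. The abstract does not state which fusion rule the authors use; the mean rule, the function name `fuse_scores`, and the modality keys are assumptions made for this example.

```python
from typing import Dict, Optional, Sequence


def fuse_scores(modality_scores: Dict[str, Optional[Sequence[float]]]) -> int:
    """Return the index of the identity with the highest mean score.

    Each value holds one score per enrolled identity, or None when that
    modality is missing from the probe video (assumed convention).
    """
    available = [s for s in modality_scores.values() if s is not None]
    if not available:
        raise ValueError("at least one modality must be available")
    n_ids = len(available[0])
    # Mean rule: average the scores of the modalities that are present.
    fused = [sum(s[i] for s in available) / len(available) for i in range(n_ids)]
    return max(range(n_ids), key=fused.__getitem__)


# Hypothetical probe: the left ear was not detected in this clip.
probe = {
    "frontal_face": [0.2, 0.7, 0.1],
    "left_ear": None,
    "right_ear": [0.3, 0.4, 0.3],
}
best_identity = fuse_scores(probe)
```

Because missing modalities are simply dropped from the average, the same function handles any subset of the five modalities without retraining.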

A Deep Learning-Based Rate Control for HEVC Intra Coding

  • Marzuki, Ismail;Sim, Donggyu
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2019.11a / pp.180-181 / 2019
  • This paper proposes a rate control algorithm for intra-frame coding in an HEVC encoder using a deep learning approach. The proposed algorithm performs CTU-level bit allocation in intra frames by considering spatial and temporal visual features. The features are generated by a Visual Geometry Group network (VGG-16) with deep convolutional layers and are then used to allocate bits to each CTU within an intra frame. According to our experiments, the proposed algorithm achieves a luma BD-rate gain of -2.04% with minimal loss of bit accuracy relative to the HM-16.20 rate control model.

Efficient MPEG-4 to H.264/AVC Transcoding with Spatial Downscaling

  • Nguyen, Toan Dinh;Lee, Guee-Sang;Chang, June-Young;Cho, Han-Jin
    • ETRI Journal / v.29 no.6 / pp.826-828 / 2007
  • Efficient downscaling in a transcoder is important when the output must be converted to lower-resolution video. In this letter, we suggest an efficient algorithm for transcoding from MPEG-4 Simple Profile (SP) to H.264/AVC with spatial downscaling. First, target image blocks are classified into monotonous, complex, and very complex regions for fast mode decision. Second, adaptive search ranges are applied to these image classes for fast motion estimation in an H.264/AVC encoder with predicted motion vectors. Simulation results show that our transcoder considerably reduces transcoding time while keeping video quality nearly optimal.

ROBUST TRANSMISSION OF VIDEO DATA STREAM OVER WIRELESS NETWORK BASED ON HIERARCHICAL SYNCHRONIZATION

  • Jung, Han-Seung;Kim, Rin-Chul;Lee, Sang-Uk
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1998.06b / pp.5-9 / 1998
  • In this paper, we propose an error-resilient transmission technique for H.263 video data streams over wireless networks. The proposed algorithm rearranges bits hierarchically, providing robust and exact synchronization against bit errors without requiring extra redundant information. In addition, we propose a recovery algorithm for lost or erroneous motion vectors. We implement an encoder and decoder based on the H.263 standard and evaluate the proposed algorithm through intensive computer simulation. The experimental results demonstrate that the proposed algorithm yields good image quality in spite of channel errors and efficiently prevents error propagation in both the spatial and temporal domains.

Error Resilient MPEG-4 Encoding Method (오류 내성을 갖는 MPEG-4 부호화 기법)

  • 현기수;문지용;김기두;강동욱
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2002.11a / pp.105-109 / 2002
  • The main idea of hybrid video coding methods is to reduce spatial and temporal redundancy for efficient data compression. If a compressed video stream is transmitted through an error-prone channel, the bitstream can be critically damaged, and the spatio-temporal error propagates through successive frames at the decoder because of drift between the reference frames of the encoder and decoder. In this paper, I propose a Lagrangian multiplier selection method for error-prone environments. Finally, the performance of the resulting R-D optimized mode decision is compared against the conventional method, and simulation results are presented.
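For readers unfamiliar with R-D optimized mode decision, the sketch below shows the generic rule the abstract builds on: each candidate mode has a distortion D and a rate R, and the encoder picks the mode minimizing the Lagrangian cost J = D + λR. This is the textbook formulation only; the paper's own multiplier selection for error-prone channels is not described in the abstract and is not reproduced here. The function name and the candidate values are hypothetical.

```python
from typing import Dict, Tuple


def choose_mode(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Pick the mode with the smallest Lagrangian cost J = D + lam * R.

    candidates maps a mode name to (distortion, rate_in_bits).
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])


# Hypothetical (distortion, bits) measurements for one macroblock.
modes = {
    "intra": (100.0, 400.0),
    "inter": (150.0, 120.0),
    "skip": (400.0, 2.0),
}

low_lambda_choice = choose_mode(modes, lam=0.1)   # distortion dominates the cost
mid_lambda_choice = choose_mode(modes, lam=1.0)   # rate and distortion are balanced
high_lambda_choice = choose_mode(modes, lam=5.0)  # rate dominates the cost
```

A larger λ penalizes bits more heavily, so the decision drifts toward cheaper modes; choosing λ well is exactly what a multiplier selection method has to address.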

Complexity Reduction Method for SVC Encoder Adopting Large Block (Large Block 을 적용한 SVC 부호화기의 복잡도 감소 기법)

  • Park, Un-Ki;Kim, Jae-Gon;Jeong, Dae-Gwon
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.11a / pp.267-269 / 2011
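  • This paper addresses a method for reducing the increased complexity that arises when the previously proposed LB technique is used. LB (Large Block) is a representative coding tool of HEVC (High Efficiency Video Coding), and it also yields a considerable improvement in coding efficiency when applied to H.264/SVC (Scalable Video Coding). However, applying LB requires additional encoding steps at the macroblock level, which increases encoder complexity. We therefore propose a method that reduces the encoder complexity of SVC with LB and verify the RD performance and encoder complexity of the proposed method through simulation. The experimental results show that, compared with the conventional coding method, the proposed method incurs a minor average RD loss of 1.8% but achieves an average complexity gain of 12%.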

Efficient Transform Coefficient Coding for the HEVC Intra Frame Coder (HEVC 화면내 부호기를 위한 효율적인 변환 계수 부호화 방법)

  • Choi, Jung A;Ho, Yo Sung
    • Smart Media Journal / v.1 no.2 / pp.6-11 / 2012
  • In the HEVC standard, transform coefficient coding, which directly affects the output bitstream, is a core part of the encoder and includes coefficient scanning and entropy coding. Recently, the JCT-VC (Joint Collaborative Team on Video Coding) advanced HEVC to the Committee Draft (CD) stage. In this paper, we explain HEVC transform coefficient coding and propose an efficient transform coefficient coding method that considers the statistics of transform coefficients in the intra frame coder. The proposed method reduces BD-rate by up to 0.74% compared to the conventional HEVC transform coefficient coding.
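BD-rate figures such as the one quoted above summarize the average bit-rate difference between two rate-distortion curves at equal quality. The sketch below is a simplified stand-in, not the tool the authors used: it interpolates log10(rate) over PSNR piecewise-linearly and integrates with the trapezoidal rule, whereas the customary Bjøntegaard calculation fits a cubic polynomial. All function names and sample values are assumptions, and each curve is assumed to have distinct PSNR values.

```python
import math


def _interp(x, xs, ys):
    """Piecewise-linear interpolation on strictly increasing xs."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the measured PSNR range")


def _mean_log_rate(rates, psnrs, lo, hi, steps=1000):
    """Average log10(rate) over the PSNR interval [lo, hi] (trapezoidal rule)."""
    pts = sorted(zip(psnrs, rates))
    xs = [p for p, _ in pts]
    ys = [math.log10(r) for _, r in pts]
    h = (hi - lo) / steps
    # Clamp to hi so rounding never pushes the last sample out of range.
    vals = [_interp(min(lo + i * h, hi), xs, ys) for i in range(steps + 1)]
    return (sum(vals) - 0.5 * (vals[0] + vals[-1])) * h / (hi - lo)


def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Approximate average rate change in percent; negative means bit savings."""
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    if hi <= lo:
        raise ValueError("PSNR ranges do not overlap")
    diff = (_mean_log_rate(rate_test, psnr_test, lo, hi)
            - _mean_log_rate(rate_ref, psnr_ref, lo, hi))
    return (10 ** diff - 1) * 100


# Hypothetical curves: the test codec needs 10% fewer bits at every PSNR.
psnr = [32.0, 35.0, 38.0, 41.0]
ref_rate = [1000.0, 2000.0, 4000.0, 8000.0]
test_rate = [900.0, 1800.0, 3600.0, 7200.0]
saving = bd_rate(ref_rate, psnr, test_rate, psnr)  # approximately -10
```

Under this sign convention a reduction in BD-rate corresponds to a negative value returned by `bd_rate`.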

Equal Bit Rate Control for Low Bit-Rate Coder by Using Frame Statistics (확률 분포를 고려한 저 전송률 비디오 부호기의 균등 비트 할당 기법 연구)

  • 한성욱;서동완;최윤식
    • Proceedings of the IEEK Conference / 2002.06d / pp.29-32 / 2002
  • In typical block-based video coding, the objective of rate control (RC) is to select the quantization parameters so that the encoder produces bits at the channel rate while minimizing the overall distortion. To reduce the huge amount of computation required for offline RC, there have been significant efforts to speed up video encoders. Those efforts have mainly focused on models of bit rate and distortion, for various types of coders, in terms of the quantization parameters. Because previous work on model-based online RC relies on the statistics of the previous frame, it allocates bits unequally, without regard to the statistics of the current frame. In this thesis, an equal bit allocation scheme using current frame statistics is proposed.

Real-Time Implementation of Speech Vocoder For Video Telephony (화상 전화용 음성 보코더의 실시간 구현)

  • Nam, Il-Ryong;Seo, Sung-Dae;Nam, Hyun-Do
    • Proceedings of the KIEE Conference / 1998.07g / pp.2414-2416 / 1998
  • This paper presents a real-time implementation of a speech vocoder for PSTN video telephony using the ITU G.723 16 kbps ADPCM algorithm. The ADPCM encoder accepts 8-bit compressed PCM signals and expands them to 14 bits per sample. The predicted values are subtracted from the encoded signals to produce difference signals. Adaptive quantization is performed on the difference signal to produce a 2-bit output for transmission over the channel. Computer simulations and experiments were performed to evaluate the performance of the speech vocoder.
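The predict, subtract, and adaptively quantize loop described above can be sketched with a toy 2-bit ADPCM codec. This is not the ITU algorithm: the first-order predictor, the step-size adaptation rule, and every name below are simplifications invented for illustration. The 8 kHz sampling rate is also an assumption (it is the usual telephony rate and is not stated in the abstract); with it, 2 bits per sample gives 16 kbit/s.

```python
def _update(pred, step, code):
    """Shared predictor/step update so encoder and decoder stay in sync."""
    sign = -1 if code & 2 else 1
    large = code & 1
    # Reconstruct the difference as 0.5 * step (small) or 1.5 * step (large).
    delta = step + step // 2 if large else step // 2
    pred += sign * delta
    # Adaptive quantization: grow the step after large codes, shrink otherwise.
    step = min(step * 2, 2048) if large else max(step * 3 // 4, 2)
    return pred, step


def adpcm2_encode(samples, step=16):
    """Encode integer samples into 2-bit codes (sign bit, magnitude bit)."""
    pred, codes = 0, []
    for s in samples:
        diff = s - pred  # difference between input and prediction
        code = (2 if diff < 0 else 0) | (1 if abs(diff) >= step else 0)
        codes.append(code)
        pred, step = _update(pred, step, code)
    return codes


def adpcm2_decode(codes, step=16):
    """Rebuild the signal by replaying the same updates the encoder made."""
    pred, out = 0, []
    for code in codes:
        pred, step = _update(pred, step, code)
        out.append(pred)
    return out


SAMPLE_RATE_HZ = 8000  # assumed telephony sampling rate
BITS_PER_SAMPLE = 2
bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # bits per second

codes = adpcm2_encode([100] * 30)
decoded = adpcm2_decode(codes)
```

Because the decoder replays exactly the updates the encoder applied, no step-size information has to be transmitted alongside the 2-bit codes.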

The design of Stream producer for MPEG-4 encoder (MPEG-4 부호화기를 위한 스트림 생성기 설계)

  • 송인근;서기범
    • Journal of the Korea Institute of Information and Communication Engineering / v.7 no.8 / pp.1776-1784 / 2003
  • In this paper, we propose the architecture of a stream producer for MPEG-4 video encoding. This module receives the quantized coefficients from the DCT and quantization module in macroblock units, performs VLC coding according to the encoding mode, and supports the error concealment and data partitioning modes of MPEG-4. We designed the module in VHDL based on this architecture and evaluated it through post-simulation.