• Title/Summary/Keyword: 3D video

Scalable Multi-view Video Coding based on HEVC

  • Lim, Woong;Nam, Junghak;Sim, Donggyu
    • IEIE Transactions on Smart Processing and Computing
    • /
    • v.4 no.6
    • /
    • pp.434-442
    • /
    • 2015
  • In this paper, we propose an integrated spatial and view scalable video codec based on high efficiency video coding (HEVC). The proposed codec builds on the similarities and differences between the scalable extension and the 3D multi-view extension of HEVC. To improve compression efficiency, inter-layer and inter-view predictions are employed jointly through high-level syntax elements defined to identify view and layer information. A decoded picture buffer (DPB) management algorithm is also proposed to support the inter-view and inter-layer predictions. The inter-view and inter-layer motion predictions are integrated into a consolidated prediction that is harmonized with the temporal motion prediction of HEVC. The proposed scalable multi-view codec achieves bitrate reductions of 36.1%, 31.6%, and 15.8% over the ×2 parallel scalable codec, the ×1.5 parallel scalable codec, and the parallel multi-view codec, respectively.
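
As a concrete illustration of how one decoded picture buffer can serve all three prediction types the abstract mentions, here is a minimal Python sketch; the picture fields, the eviction policy, and the reference-selection rules are illustrative assumptions, not the high-level syntax or marking process actually defined in the paper.

```python
# Minimal sketch of a DPB serving temporal, inter-layer, and inter-view
# reference requests. All names and policies are illustrative.
from dataclasses import dataclass

@dataclass
class Picture:
    poc: int        # picture order count (temporal position)
    layer_id: int   # spatial layer (0 = base)
    view_id: int    # camera view index

class DPB:
    def __init__(self, capacity=16):
        self.capacity = capacity
        self.pictures = []

    def insert(self, pic):
        if len(self.pictures) >= self.capacity:
            # evict the oldest, lowest picture first (illustrative policy)
            self.pictures.sort(key=lambda p: (p.poc, p.layer_id, p.view_id))
            self.pictures.pop(0)
        self.pictures.append(pic)

    def temporal_refs(self, poc, layer_id, view_id):
        # same layer and view, different time instant
        return [p for p in self.pictures
                if p.layer_id == layer_id and p.view_id == view_id and p.poc != poc]

    def inter_layer_refs(self, poc, layer_id, view_id):
        # same time instant and view, lower spatial layer
        return [p for p in self.pictures
                if p.poc == poc and p.view_id == view_id and p.layer_id < layer_id]

    def inter_view_refs(self, poc, layer_id, view_id):
        # same time instant and layer, different view
        return [p for p in self.pictures
                if p.poc == poc and p.layer_id == layer_id and p.view_id != view_id]
```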

Standard Technology for Digital Cable Stereoscopic 3DTV Broadcasting (디지털 케이블 양안식 3DTV 방송 표준 기술)

  • You, Woong-Shik;Lee, Bong-Ho;Jung, Joon-Young;Yun, Kug-Jin;Choi, Dong-Joon;Cheong, Won-Sik;Hur, Nam-Ho;Kwon, Oh-Seok
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.36 no.9B
    • /
    • pp.1126-1142
    • /
    • 2011
  • This paper addresses stereoscopic 3D broadcasting technology that delivers 3DTV content over digital cable networks. To convey 3D content via a digital cable TV network, specifications for the 3D video format, compression, multiplexing, signaling, and transport must be developed. Because 3D imposes constraints that 2D does not, the system must be designed with the capacity required for the additional view and with backward/forward compatibility in mind. This paper reviews the latest trends in 3D standardization, requirements, and service scenarios, and then covers the 3D format, compression, multiplexing and signaling, service information, and transport/reception technologies.

Intra Prediction Information Skip using Analysis of Adjacent Pixels for H.264/AVC (인접 화소 성분 분석을 이용한 H.264/AVC에서의 Intra 예측 정보 생략)

  • Kim, Dae-Yeon;Kim, Dong-Kyun;Lee, Yung-Lyul
    • Journal of Broadcast Engineering
    • /
    • v.14 no.3
    • /
    • pp.271-279
    • /
    • 2009
  • The Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) have developed a new standard that promises to outperform the earlier MPEG-4 and H.263 standards. The new standard is called H.264/AVC (Advanced Video Coding) and is published jointly as MPEG-4 Part 10 and ITU-T Recommendation H.264. In particular, H.264/AVC intra prediction coding provides nine directional prediction modes for every 4×4 block in order to reduce spatial redundancy. In this paper, an ABS (Adaptive Bit Skip) mode is proposed: to improve coding efficiency, the method omits the bits that signal the prediction mode by exploiting the similarity of adjacent pixels. Experimental results show that the proposed method achieves a PSNR gain of about 0.2 dB on the R-D curve and reduces the bit rate by about 3.6% compared with H.264/AVC.
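
The core decision in an adjacent-pixel skip scheme of this kind can be sketched as follows; the flatness test, the threshold, and the idea of falling back to an inferable mode such as DC are assumptions for illustration, not the paper's exact ABS criterion.

```python
import numpy as np

# Illustrative "adaptive bit skip" decision: when the reconstructed
# neighbours of a 4x4 block are nearly uniform, encoder and decoder can
# both infer the prediction mode, so the mode bits are omitted.
def can_skip_mode_bits(top_pixels, left_pixels, threshold=2.0):
    neighbours = np.concatenate([top_pixels, left_pixels]).astype(np.float64)
    return neighbours.std() < threshold  # flat neighbourhood -> mode is inferable

top = np.array([128, 128, 129, 128])    # reconstructed row above the block
left = np.array([128, 127, 128, 128])   # reconstructed column left of the block
if can_skip_mode_bits(top, left):
    print("skip mode bits; both sides infer the mode (e.g., DC) from neighbours")
```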

A Study on Recognition of Dangerous Behaviors using Privacy Protection Video in Single-person Household Environments

  • Lim, ChaeHyun;Kim, Myung Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.5
    • /
    • pp.47-54
    • /
    • 2022
  • Recently, with the development of deep learning technology, research on recognizing human behavior has progressed. This paper studies the recognition of dangerous behaviors that may occur in a single-person household environment using deep learning. Because of the nature of single-person households, personal privacy must be protected, so we recognize dangerous human behavior in video to which a Gaussian blur filter has been applied for privacy protection. The recognition method uses the YOLOv5 model to detect and preprocess human objects in the video, and then feeds them to a behavior recognition model. Experiments with ResNet3D, I3D, and SlowFast models show that SlowFast achieved the highest accuracy, 95.7%, on privacy-protected video. This makes it possible to recognize dangerous human behavior in a single-person household environment while protecting individual privacy.
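
The preprocessing path described above (blur for privacy, detect the person, crop for the action model) might look roughly like this in Python with OpenCV; `detect_person` is a placeholder for the YOLOv5 detector, and the clip length, kernel size, and crop size are assumed values.

```python
import cv2
import numpy as np

def detect_person(frame):
    # placeholder: a real pipeline would run YOLOv5 here and return a box
    h, w = frame.shape[:2]
    return (w // 4, h // 4, w // 2, h // 2)  # (x, y, w, h)

def preprocess_clip(frames, ksize=(21, 21), crop_size=(224, 224)):
    clip = []
    for frame in frames:
        blurred = cv2.GaussianBlur(frame, ksize, 0)   # privacy protection
        x, y, w, h = detect_person(blurred)           # detect on the blurred frame
        crop = cv2.resize(blurred[y:y + h, x:x + w], crop_size)
        clip.append(crop)
    return np.stack(clip)  # (T, H, W, C) input for SlowFast / I3D / ResNet3D

frames = [np.full((480, 640, 3), 127, np.uint8) for _ in range(32)]
print(preprocess_clip(frames).shape)  # (32, 224, 224, 3)
```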

A Probabilistic Network for Facial Feature Verification

  • Choi, Kyoung-Ho;Yoo, Jae-Joon;Hwang, Tae-Hyun;Park, Jong-Hyun;Lee, Jong-Hoon
    • ETRI Journal
    • /
    • v.25 no.2
    • /
    • pp.140-143
    • /
    • 2003
  • In this paper, we present a probabilistic approach to determining whether extracted facial features from a video sequence are appropriate for creating a 3D face model. In our approach, the distance between two feature points selected from the MPEG-4 facial object is defined as a random variable for each node of a probability network. To avoid generating an unnatural or non-realistic 3D face model, automatically extracted 2D facial features from a video sequence are fed into the proposed probabilistic network before a corresponding 3D face model is built. Simulation results show that the proposed probabilistic network can be used as a quality control agent to verify the correctness of extracted facial features.
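
A toy version of the verification idea, with each node of the network modeling one inter-feature distance as a Gaussian random variable, could look like the following; the feature pairs, distribution parameters, and acceptance threshold are invented for illustration.

```python
import math

def gaussian_logpdf(x, mean, std):
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

# (feature pair) -> (expected distance, std), e.g. normalized by face width;
# the values below are made-up examples, not learned parameters
distance_model = {
    ("left_eye", "right_eye"): (0.40, 0.05),
    ("eye_line", "mouth"): (0.55, 0.07),
}

def verify(measured, threshold=-5.0):
    # joint log-likelihood of all measured distances under the model
    loglik = sum(gaussian_logpdf(measured[pair], m, s)
                 for pair, (m, s) in distance_model.items())
    return loglik > threshold  # True -> features plausible for a 3D face model

print(verify({("left_eye", "right_eye"): 0.41, ("eye_line", "mouth"): 0.52}))
```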

MPEG-DASH based 3D Point Cloud Content Configuration Method (MPEG-DASH 기반 3차원 포인트 클라우드 콘텐츠 구성 방안)

  • Kim, Doohwan;Im, Jiheon;Kim, Kyuheon
    • Journal of Broadcast Engineering
    • /
    • v.24 no.4
    • /
    • pp.660-669
    • /
    • 2019
  • Recently, with the development of three-dimensional scanning devices and multi-dimensional array cameras, research continues on techniques for handling three-dimensional data in application fields such as AR (Augmented Reality)/VR (Virtual Reality) and autonomous driving. In particular, AR/VR content that represents 3D video as point data has appeared, and it requires far more data than conventional 2D images. Serving 3D point cloud content to users therefore requires advances in highly efficient encoding/decoding, storage, and transfer. In this paper, the V-PCC bitstream produced by the V-PCC encoder of the MPEG-I (MPEG-Immersive) V-PCC (Video based Point Cloud Compression) group is packaged into segments as defined by the MPEG-DASH (Dynamic Adaptive Streaming over HTTP) standard. In addition, a depth information parameter is defined in the signaling message to give the user information about the 3D coordinate system. We then design a verification platform and use it to confirm the algorithm of the proposed technology.
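
A rough sketch of packaging such content for DASH is shown below: a static MPD is built with a segment template for the V-PCC representation, plus one extra property carrying depth-range signaling. The `schemeIdUri` and value format of that property are hypothetical stand-ins for the paper's added parameter, not a standardized descriptor.

```python
import xml.etree.ElementTree as ET

# Build a minimal MPD describing segmented V-PCC content.
mpd = ET.Element("MPD", xmlns="urn:mpeg:dash:schema:mpd:2011",
                 type="static", mediaPresentationDuration="PT10S")
period = ET.SubElement(mpd, "Period")
aset = ET.SubElement(period, "AdaptationSet", mimeType="application/mpeg-vpcc")
ET.SubElement(aset, "SupplementalProperty",
              schemeIdUri="urn:example:vpcc:depthRange",  # hypothetical URN
              value="0,1023")                             # z-axis range (assumed format)
rep = ET.SubElement(aset, "Representation", id="vpcc0", bandwidth="8000000")
ET.SubElement(rep, "SegmentTemplate", media="vpcc_$Number$.m4s",
              initialization="vpcc_init.mp4", duration="2", startNumber="1")

print(ET.tostring(mpd, encoding="unicode"))
```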

Point Cloud Video Codec using 3D DCT based Motion Estimation and Motion Compensation (3D DCT를 활용한 포인트 클라우드의 움직임 예측 및 보상 기법)

  • Lee, Minseok;Kim, Boyeun;Yoon, Sangeun;Hwang, Yonghae;Kim, Junsik;Kim, Kyuheon
    • Journal of Broadcast Engineering
    • /
    • v.26 no.6
    • /
    • pp.680-691
    • /
    • 2021
  • With recent developments in acquiring 3D content using devices such as 3D scanners, the diversity of content used in AR (Augmented Reality)/VR (Virtual Reality) fields is increasing significantly. Among the several ways to represent 3D data, point clouds have the advantage of capturing real 3D data with high precision. However, expressing 3D content requires much more data than 2D images, and the data needed to represent dynamic 3D point cloud objects consisting of multiple frames is especially large, so efficient compression technology for such data must be developed. In this paper, a motion estimation and compensation method for dynamic point cloud objects using the 3D DCT is proposed; it divides the 3D video frames into I-frames and P-frames, which yields a higher compression ratio. We confirm the compression efficiency of the proposed technology by comparing it with an anchor intra-frame-based compression method and with 2D-DCT-based V-PCC.
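
A minimal sketch of the P-frame mechanism, assuming the point clouds have been voxelized into occupancy grids: a block of the current frame is matched against shifted blocks of the previous frame, and the residual is transformed with a 3D DCT. The block size and the ±1 voxel search range are simplifications, not the paper's settings.

```python
import numpy as np
from scipy.fft import dctn

def best_shift(prev, block, pos, search=1):
    # exhaustive 3D block matching over a small search window (SAD cost)
    (x, y, z), best = pos, (np.inf, (0, 0, 0))
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            for dz in range(-search, search + 1):
                ref = prev[x+dx:x+dx+8, y+dy:y+dy+8, z+dz:z+dz+8]
                if ref.shape != block.shape:
                    continue
                sad = np.abs(block - ref).sum()
                if sad < best[0]:
                    best = (sad, (dx, dy, dz))
    return best[1]

prev = np.zeros((32, 32, 32)); prev[10:18, 10:18, 10:18] = 1.0  # previous frame
curr = np.zeros((32, 32, 32)); curr[11:19, 10:18, 10:18] = 1.0  # object moved +1 in x
block = curr[11:19, 10:18, 10:18]
mv = best_shift(prev, block, (11, 10, 10))
ref = prev[11+mv[0]:19+mv[0], 10+mv[1]:18+mv[1], 10+mv[2]:18+mv[2]]
coeffs = dctn(block - ref, norm="ortho")  # 3D DCT of the motion-compensated residual
print(mv, np.count_nonzero(np.round(coeffs, 3)))  # good match -> near-zero residual
```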

Fast Mode Decision using Global Disparity Vector for Multi-view Video Coding (다시점 영상 부호화에서 전역 변이 벡터를 이용한 고속 모드 결정)

  • Han, Dong-Hoon;Cho, Suk-Hee;Hur, Nam-Ho;Lee, Yung-Lyul
    • Journal of Broadcast Engineering
    • /
    • v.13 no.3
    • /
    • pp.328-338
    • /
    • 2008
  • Multi-view video coding (MVC) based on H.264/AVC encodes multiple views efficiently with a prediction scheme that exploits inter-view correlation. However, as the number of views and the use of inter-view prediction grow, the total encoding time increases. In this paper, we propose a fast mode decision that reduces encoding time by combining MB (macroblock)-based region segmentation information for each view with the global disparity vector between views. On joint multi-view video model (JMVM) 4.0, the reference software of the MVC standard, the proposed method reduces total encoding time by 40% on average with an objective quality degradation of only about 0.04 dB in peak signal-to-noise ratio (PSNR).
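
The decision itself can be sketched in a few lines; the region labels, the mode subsets, and the pixel-to-MB mapping of the GDV are illustrative choices rather than the paper's exact rule.

```python
ALL_MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8", "INTRA"]

def candidate_modes(mb_x, mb_y, gdv, seg_map):
    # shift the MB position by the global disparity vector (given in pixels)
    # into the reference view's MB-based segmentation map
    ref_x = mb_x + gdv[0] // 16
    ref_y = mb_y + gdv[1] // 16
    if 0 <= ref_y < len(seg_map) and 0 <= ref_x < len(seg_map[0]):
        if seg_map[ref_y][ref_x] == "flat":  # homogeneous region in the ref view
            return ["SKIP", "16x16"]         # large partitions usually win here
    return ALL_MODES                         # complex/boundary region: full search

seg_map = [["flat", "flat", "edge"],
           ["flat", "edge", "edge"]]
print(candidate_modes(0, 0, gdv=(16, 0), seg_map=seg_map))  # ['SKIP', '16x16']
```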

Adaptive Multi-view Video Service Framework for Mobile Environments (이동 환경을 위한 적응형 다시점 비디오 서비스 프레임워크)

  • Kwon, Jun-Sup;Kim, Man-Bae;Choi, Chang-Yeol
    • Journal of Broadcast Engineering
    • /
    • v.13 no.5
    • /
    • pp.586-595
    • /
    • 2008
  • In this paper, we propose an adaptive multi-view video service framework suitable for mobile environments. The proposed framework generates intermediate views in near real time and overcomes the limitations of mobile services by adapting the multi-view video to the processing capability of the mobile device as well as the user characteristics of the client. By carrying out most of the adaptation on the server side, the load on the client is reduced. H.264/AVC is adopted as the compression scheme, allowing the framework to provide an interactive, efficient video service to mobile clients. To this end, we present a multi-view video DIA (Digital Item Adaptation) that adapts the multi-view video according to the MPEG-21 DIA multimedia framework. Experimental results show that the proposed system supports a frame rate of 13 fps for 320×240 video and reduces the time to generate an intermediate view by 20% compared with a conventional 3D projection method.
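
Server-side adaptation in the spirit of MPEG-21 DIA might be sketched as below; the capability fields and decision thresholds are invented for illustration and do not follow the actual MPEG-21 usage-environment schema.

```python
from dataclasses import dataclass

@dataclass
class Terminal:
    max_width: int
    max_height: int
    max_fps: int
    supports_stereo: bool

def adapt(terminal):
    # choose views, resolution, and frame rate from the declared capability
    views = 2 if terminal.supports_stereo else 1   # stereo pair vs. single view
    width = min(terminal.max_width, 320)
    height = min(terminal.max_height, 240)
    fps = min(terminal.max_fps, 15)
    return {"views": views, "resolution": (width, height), "fps": fps}

print(adapt(Terminal(320, 240, 30, supports_stereo=True)))
```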

An Efficient Video Watermarking Using Re-Estimation and Minimum Modification Technique of Motion Vectors (재예측과 움직임벡터의 변경 최소화 기법을 이용한 효율적인 비디오 워터마킹)

  • Kang Kyung-won;Moon Kwang-seok;Kim Jong-nam
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.6C
    • /
    • pp.497-504
    • /
    • 2005
  • We propose an efficient video watermarking scheme using re-estimation and minimal modification of motion vectors. Conventional motion-vector-based methods embed the watermark by modifying motion vectors, but changing motion vectors degrades video quality. Our scheme therefore minimizes the modification of the original motion vectors and, when a change is needed, replaces the original vector with the optimal adjacent vector found by re-estimation, avoiding quality degradation. The scheme also guarantees the amount of embedded watermark data by using an adaptive threshold. In addition, it is compatible with current video compression standards without changing the bitstream format. Experimental results show that the proposed scheme achieves video quality about 0.6 to 1.3 dB better than previous algorithms.
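
The abstract does not spell out the embedding rule, so the sketch below uses a common parity-based variant as a stand-in: a bit is carried in the parity of a motion vector, vectors below a magnitude threshold are left untouched, and when the parity must change the replacement is the adjacent candidate with the smallest matching cost. The cost function and fixed threshold are illustrative, not the paper's adaptive scheme.

```python
def embed_bit(mv, bit, sad, threshold=4):
    mvx, mvy = mv
    if abs(mvx) + abs(mvy) < threshold:
        return mv                      # small vectors are left untouched
    if (mvx + mvy) % 2 == bit:
        return mv                      # parity already encodes the bit: no change
    # re-estimate: try the nearest vectors whose parity matches the bit
    candidates = [(mvx + dx, mvy + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  if (mvx + dx + mvy + dy) % 2 == bit]
    return min(candidates, key=sad)    # minimal quality loss among valid vectors

def sad(mv):                           # toy block-matching cost near (5, 3)
    return abs(mv[0] - 5) + abs(mv[1] - 3)

print(embed_bit((5, 3), bit=1, sad=sad))  # parity must flip -> nearest odd-sum MV
```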