• Title/Summary/Keyword: Depth Video


Performance Analysis of 3D-HEVC Video Coding (3D-HEVC 비디오 부호화 성능 분석)

  • Park, Daemin;Choi, Haechul
    • Journal of Broadcast Engineering, v.19 no.5, pp.713-725, 2014
  • Multi-view and 3D video technologies for next-generation video services are widely studied. These technologies give users a realistic experience by supporting multiple views. Because acquiring and transmitting a large number of views is costly, the main challenges for multi-view and 3D video include view synthesis, video coding, and depth coding. Recently, JCT-3V (Joint Collaborative Team on 3D Video Coding Extension Development) has been developing a new standard for multi-view and 3D video. In this paper, the major tools adopted in this standard are introduced and evaluated in terms of coding efficiency and complexity. This performance analysis should be helpful for developing a fast 3D video encoder as well as new 3D video coding algorithms.

Temporal Anti-aliasing of a Stereoscopic 3D Video

  • Kim, Wook-Joong;Kim, Seong-Dae;Hur, Nam-Ho;Kim, Jin-Woong
    • ETRI Journal, v.31 no.1, pp.1-9, 2009
  • Frequency domain analysis is a fundamental procedure for understanding the characteristics of visual data. Several studies have been conducted with 2D videos, but analysis of stereoscopic 3D videos is rarely carried out. In this paper, we derive the Fourier transform of a simplified 3D video signal and analyze how a 3D video is influenced by disparity and motion in terms of temporal aliasing. It is already known that object motion affects temporal frequency characteristics of a time-varying image sequence. In our analysis, we show that a 3D video is influenced not only by motion but also by disparity. Based on this conclusion, we present a temporal anti-aliasing filter for a 3D video. Since the human process of depth perception mainly determines the quality of a reproduced 3D image, 2D image processing techniques are not directly applicable to 3D images. The analysis presented in this paper will be useful for reducing undesirable visual artifacts in 3D video as well as for assisting the development of relevant technologies.


Fast Stereoscopic 3D Broadcasting System using x264 and GPU (x264와 GPU를 이용한 고속 양안식 3차원 방송 시스템)

  • Choi, Jung-Ah;Shin, In-Yong;Ho, Yo-Sung
    • Journal of Broadcast Engineering, v.15 no.4, pp.540-546, 2010
  • Stereoscopic 3-dimensional (3D) video, which provides users with a realistic multimedia service, requires twice as much data as 2-dimensional (2D) video, so a fast system is difficult to build. In this paper, we propose a fast stereoscopic 3D broadcasting system based on depth information. Before transmission, we encode the input 2D+depth video using x264, a fast open-source H.264/AVC encoder, to reduce the size of the data. At the receiver, we decode the transmitted bitstream in real time using the compute unified device architecture (CUDA) video decoder API on an NVIDIA graphics processing unit (GPU). Then, we apply a fast view synthesis method that generates the virtual view on the GPU. The proposed system can display the output video on both 2DTV and 3DTV. In our experiments, we verified that the proposed system can deliver stereoscopic 3D content at up to 24 frames per second.
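
To illustrate what view synthesis from 2D+depth input involves, here is a minimal CPU sketch of depth-image-based warping for a single image row. It is not the authors' GPU method: the function name `synthesize_row`, the linear depth-to-disparity mapping, the `max_disparity` parameter, and the neighbour-copy hole filling are all assumptions made for illustration.

```python
def synthesize_row(colors, depths, max_disparity=4):
    """Warp one image row to a virtual view using per-pixel depth (0..255, larger = nearer)."""
    width = len(colors)
    out = [None] * width       # synthesized colors; None marks a hole
    out_depth = [-1] * width   # depth of the pixel currently stored at each target

    for x in range(width):
        # Assumed linear depth-to-disparity mapping; a real system would
        # derive the disparity from camera parameters.
        disparity = (depths[x] * max_disparity) // 255
        tx = x - disparity
        # Z-buffer test: when two pixels land on the same target, the nearer one wins.
        if 0 <= tx < width and depths[x] > out_depth[tx]:
            out[tx] = colors[x]
            out_depth[tx] = depths[x]

    # Holes open to the right of shifted foreground, so copy from the
    # right-hand neighbour (usually background) first.
    fill = None
    for x in range(width - 1, -1, -1):
        if out[x] is None:
            out[x] = fill
        else:
            fill = out[x]

    # Holes remaining at the right edge are filled from the left.
    fill = None
    for x in range(width):
        if out[x] is None:
            out[x] = fill
        else:
            fill = out[x]
    return out
```

Each pixel is processed independently in the warping loop, which is one reason this kind of synthesis maps well to a GPU.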

Enhanced Sign Language Transcription System via Hand Tracking and Pose Estimation

  • Kim, Jung-Ho;Kim, Najoung;Park, Hancheol;Park, Jong C.
    • Journal of Computing Science and Engineering, v.10 no.3, pp.95-101, 2016
  • In this study, we propose a new system for constructing parallel corpora for sign languages, which are generally under-resourced in comparison to spoken languages. In order to achieve scalability and accessibility regarding data collection and corpus construction, our system utilizes deep learning-based techniques and predicts depth information to perform pose estimation on hand information obtainable from video recordings by a single RGB camera. These estimated poses are then transcribed into expressions in SignWriting. We evaluate the accuracy of hand tracking and hand pose estimation modules of our system quantitatively, using the American Sign Language Image Dataset and the American Sign Language Lexicon Video Dataset. The evaluation results show that our transcription system has a high potential to be successfully employed in constructing a sizable sign language corpus using various types of video resources.

Wider Depth Dynamic Range Using Occupancy Map Correction for Immersive Video Coding (몰입형 비디오 부호화를 위한 점유맵 보정을 사용한 깊이의 동적 범위 확장)

  • Lim, Sung-Gyun;Hwang, Hyeon-Jong;Oh, Kwan-Jung;Jeong, Jun Young;Lee, Gwangsoon;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2022.06a, pp.1213-1215, 2022
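  • The MIV (MPEG Immersive Video) standard for immersive video coding efficiently compresses views captured at various positions in a limited 3D space and gives users an immersive experience with six degrees of freedom (6DoF) for arbitrary positions and orientations. TMIV (Test Model for Immersive Video), the reference software for MIV, reduces the number of pixels to be transmitted by removing regions that are redundant across multiple views, so the occupancy information of each pixel must also be transmitted for rendering at the decoder. TMIV compresses and transmits the occupancy map as part of the depth atlas, and allocates part of the dynamic range used to represent depth values as a guard band to prevent loss of occupancy information caused by coding errors. Reducing this guard band and using a wider dynamic range for depth values can improve rendering quality. This paper therefore analyzes the occupancy errors in the current TMIV, presents a method that corrects them, and analyzes the coding performance obtained by extending the depth dynamic range. The proposed method achieves an average BD-rate improvement of 1.3% over the existing TMIV.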
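
The trade-off between guard band and depth dynamic range can be sketched as follows. This is a simplified illustration, not TMIV's actual mapping: the function names, the 10-bit sample depth, the guard-band sizes, and the half-guard decision threshold are all assumed values.

```python
def encode_depth(norm_depth, occupied, bit_depth=10, guard=64):
    """Map a normalized depth in [0, 1] to a code in [guard, max]; code 0 marks an unoccupied pixel."""
    max_code = (1 << bit_depth) - 1
    if not occupied:
        return 0
    return guard + round(norm_depth * (max_code - guard))


def decode_depth(code, bit_depth=10, guard=64):
    """Return (occupied, norm_depth) from a possibly distorted code."""
    max_code = (1 << bit_depth) - 1
    # Assumed decision rule: codes below half the guard band count as unoccupied,
    # so small coding errors around 0 or around `guard` do not flip occupancy.
    if code < guard / 2:
        return False, None
    code = max(code, guard)  # clamp codes that coding noise pushed into the guard band
    return True, (code - guard) / (max_code - guard)


def depth_levels(bit_depth=10, guard=64):
    """Number of distinct codes left for depth once the guard band is reserved."""
    return (1 << bit_depth) - guard
```

With these assumed numbers, shrinking the guard band from 64 to 16 codes raises the available depth levels from 960 to 1008, but an unoccupied pixel then tolerates a smaller coding error before it is misread as occupied, which is why occupancy correction is needed.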


3DTIP: 3D Stereoscopic Tour-Into-Picture of Korean Traditional Paintings (3DTIP: 한국 고전화의 3차원 입체 Tour-Into-Picture)

  • Jo, Cheol-Yong;Kim, Man-Bae
    • Journal of Broadcast Engineering, v.14 no.5, pp.616-624, 2009
  • This paper presents a 3D stereoscopic TIP (Tour Into Picture) for Korean classical paintings composed of people, boats, and landscape. Unlike conventional TIP methods that provide a 2D image or video, the proposed TIP provides users with 3D stereoscopic content. Navigating a picture with stereoscopic viewing can deliver a more realistic and immersive perception. The method first prepares input data consisting of a foreground mask, a background image, and a depth map. The second step is to navigate the picture and obtain rendered images by orthographic or perspective projection. Then, two depth enhancement schemes, depth template and Laws depth, are used to reduce the cardboard effect and thus enhance the perceived 3D depth of the foreground objects. In experiments, the proposed method was tested on 'Danopungjun' and 'Muyigido', two famous paintings from the Chosun Dynasty. The stereoscopic animation was shown to deliver a new 3D perception compared with 2D video.

Hybrid Down-Sampling Method of Depth Map Based on Moving Objects (움직임 객체 기반의 하이브리드 깊이 맵 다운샘플링 기법)

  • Kim, Tae-Woo;Kim, Jung Hun;Park, Myung Woo;Shin, Jitae
    • The Journal of Korean Institute of Communications and Information Sciences, v.37A no.11, pp.918-926, 2012
  • In 3D video transmission, the depth map used for depth image based rendering (DIBR) is generally compressed at reduced resolution for coding efficiency. Errors introduced by the resolution reduction are recovered by an appropriate up-sampling method after decoding. However, most previous work focuses only on up-sampling techniques to reduce these errors. In this paper, we propose a novel depth map down-sampling technique that applies different down-sampling rates to moving objects and background in order to enhance perceptual quality. Experimental results demonstrate that the proposed scheme provides both higher visual quality and higher peak signal-to-noise ratio (PSNR). Our method is also compatible with other up-sampling techniques.
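
The idea of region-dependent down-sampling rates can be sketched for one depth row as below. This only illustrates sampling more densely inside a moving-object mask than in the background; it is not the paper's algorithm, and the function name, the mask input, and the step sizes are hypothetical.

```python
def hybrid_downsample_row(depth_row, moving_mask, fg_step=2, bg_step=4):
    """Keep every fg_step-th sample on moving objects and every bg_step-th sample on background.

    Returns a list of (position, depth) pairs so the decoder knows where each sample came from.
    """
    samples = []
    for x, (depth, moving) in enumerate(zip(depth_row, moving_mask)):
        # Moving regions get the finer sampling grid, background the coarser one.
        step = fg_step if moving else bg_step
        if x % step == 0:
            samples.append((x, depth))
    return samples


row = [10, 10, 10, 10, 200, 210, 220, 230]
mask = [0, 0, 0, 0, 1, 1, 1, 1]
print(hybrid_downsample_row(row, mask))  # [(0, 10), (4, 200), (6, 220)]
```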

RGB-Depth Camera for Dynamic Measurement of Liquid Sloshing (RGB-Depth 카메라를 활용한 유체 표면의 거동 계측분석)

  • Kim, Junhee;Yoo, Sae-Woung;Min, Kyung-Won
    • Journal of the Computational Structural Engineering Institute of Korea, v.32 no.1, pp.29-35, 2019
  • In this paper, a low-cost dynamic measurement system using an RGB-depth camera, the Microsoft Kinect® v2, is proposed for measuring the time-varying free surface motion of liquid dampers used in building vibration mitigation. Several experimental studies are conducted in sequence: performance evaluation and validation of the Kinect® v2, real-time monitoring using the Kinect® v2 SDK (software development kit), point cloud acquisition of the liquid free surface in 3D space, and comparison with existing video sensing technology. Using the proposed Kinect® v2-based measurement system, the dynamic behavior of liquid in a small laboratory-scale tank is experimentally analyzed under a wide frequency range of input excitation.

A comparison of working alliance, session evaluation and participants' experience of university student clients by counseling media -Comparison of face-to-face, phone, video, and video with digital mask counseling- (대학생 내담자를 대상으로 한 상담 작업동맹과 회기 평가 및 내담자 경험 비교 연구 - 전화, 화상 및 디지털가면 화상상담과 대면상담 비교 -)

  • Cho, Eunsuk;Oh, Yoon-Seok;Jang, Eun-Hee
    • The Journal of the Convergence on Culture Technology, v.8 no.6, pp.49-58, 2022
  • The purpose of this study is to find out how online counseling modalities (phone, video, and video counseling using a digital mask) differ from face-to-face counseling in terms of clients' perception of the working alliance, the depth and smoothness of each session, satisfaction, and their qualitative counseling experience. Forty university students participated in the experiment; they were divided into four groups, and each participant received three individual counseling sessions. The quantitative data revealed no significant difference in working alliance among the four counseling groups. The "depth" of the sessions was also similar across the four groups, but the phone and masked-video groups, whose members did not expose their faces, showed higher "smoothness" in the first and second sessions than the face-to-face group, indicating that anonymity helped clients overcome their inhibition. The post-interview data revealed subtle differences in how participants experienced each counseling method. The results are expected to provide primary information for developing and implementing various online counseling modalities in the future.

Frame Rate Up Conversion for Multi-View Video (다시점 영상의 프레임율 변환 기법)

  • Yang, YoonMo;Lee, Dohoon;Oh, Byung Tae
    • Proceedings of the Korean Society of Broadcast Engineers Conference, 2016.06a, pp.228-230, 2016
  • In this paper, we introduce a new frame rate up-conversion (FRUC) method for multi-view video based on depth image based rendering (DIBR). In the proposed method, we divide each block into sub-regions using the depth map. Then, we reconstruct the occlusion region information of each sub-region using other views. With the reconstructed occlusion region information, we estimate and compensate the motion of each sub-region. The proposed method estimates motion in occlusion regions more accurately than conventional methods.
