• Title/Summary/Keyword: 3D video coding

Dense RGB-D Map-Based Human Tracking and Activity Recognition using Skin Joints Features and Self-Organizing Map

  • Farooq, Adnan;Jalal, Ahmad;Kamal, Shaharyar
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.9 no.5
    • /
    • pp.1856-1869
    • /
    • 2015
  • This paper addresses 3D human activity detection, tracking, and recognition from RGB-D video sequences using a feature-structured framework. For human tracking and activity recognition, dense depth images are first captured using a depth camera. To track human silhouettes, we consider spatial/temporal continuity and constraints on human motion information, and compute the centroid of each activity using a chain coding mechanism and centroid point extraction. For the body skin joint features, we estimate human skin color to identify body parts (i.e., head, hands, and feet) and extract joint point information. These joint points are then processed for feature extraction, yielding distance position features and centroid distance features. Lastly, self-organizing maps are used to recognize different activities. Experimental results demonstrate that the proposed method is reliable and efficient in recognizing human poses in different realistic scenes. The proposed system should be applicable to consumer application systems such as healthcare, video surveillance, and indoor monitoring systems that track and recognize different activities of multiple users.

A Perception-based Color Correction Method for Multi-view Images

  • Shao, Feng;Jiang, Gangyi;Yu, Mei;Peng, Zongju
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.5 no.2
    • /
    • pp.390-407
    • /
    • 2011
  • Three-dimensional (3D) video technologies are becoming increasingly popular, as they can provide users with high-quality, immersive experiences. However, color inconsistency between camera views is an urgent problem in multi-view imaging. In this paper, a perception-based color correction method for multi-view images is proposed. In the proposed method, human visual sensitivity (VS) and visual attention (VA) models are incorporated into the correction process. First, the VS property is used to reduce the computational complexity by removing visually insensitive regions. Second, the VA property is used to improve the perceptual quality of local VA regions by performing VA-dependent color correction. Experimental results show that, compared with other color correction methods, the proposed method greatly improves the perceptual quality of local VA regions, reduces the computational complexity, and obtains higher coding performance.

Wider Depth Dynamic Range Using Occupancy Map Correction for Immersive Video Coding (몰입형 비디오 부호화를 위한 점유맵 보정을 사용한 깊이의 동적 범위 확장)

  • Lim, Sung-Gyun;Hwang, Hyeon-Jong;Oh, Kwan-Jung;Jeong, Jun Young;Lee, Gwangsoon;Kim, Jae-Gon
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2022.06a
    • /
    • pp.1213-1215
    • /
    • 2022
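  • The MPEG Immersive Video (MIV) standard for immersive video coding efficiently compresses views captured at various positions within a limited 3D space, providing users with an immersive six-degrees-of-freedom (6DoF) experience for arbitrary positions and orientations. TMIV (Test Model for Immersive Video), the reference software of MIV, reduces the number of pixels to be transmitted by removing regions that are redundant across multiple views, so the occupancy information of each pixel must also be transmitted for rendering at the decoder. TMIV embeds the occupancy map in the depth atlas for compressed transmission, and allocates part of the dynamic range used to represent depth values as a guard band to prevent loss of occupancy information caused by coding errors. Reducing this guard band and using a wider dynamic range for depth values can improve rendering quality. Therefore, this paper presents a technique that corrects occupancy errors, based on an analysis of the occupancy information errors in the current TMIV, and analyzes the coding performance obtained by extending the depth dynamic range. The proposed method achieves an average BD-rate improvement of 1.3% compared with the existing TMIV.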

A Wavefront Array Processor Utilizing a Recursion Equation for ME/MC in the frequency Domain (주파수 영역에서의 움직임 예측 및 보상을 위한 재귀 방정식을 이용한 웨이브프런트 어레이 프로세서)

  • Lee, Joo-Heung;Ryu, Chul
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.31 no.10C
    • /
    • pp.1000-1010
    • /
    • 2006
  • This paper proposes a new architecture for DCT-based motion estimation and compensation. Previous methods do not take sufficient advantage of the sparseness of 2-D DCT coefficients to reduce execution time. We first derive a recursion equation to perform DCT-domain motion estimation more efficiently; we then use it to develop a wavefront array processor (WAP) consisting of processing elements. In addition, we show that the recursion equation enables motion-predicted images with different frequency bands, for example, from images with only low-frequency components to images with both low- and high-frequency components. The wavefront array processor can be reconfigured for different motion estimation algorithms, such as logarithmic search and three-step search, without architectural modifications. These properties can be used effectively to reduce the energy required for video encoding and decoding. The proposed WAP architecture achieves a significant reduction in computational complexity and processing time. It is also shown that the motion estimation algorithm in the transform domain using the SAD (Sum of Absolute Differences) matching criterion maximizes PSNR and the compression ratio for practical video coding applications when compared to the motion estimation algorithm in the spatial domain using either SAD or SSD.
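
The SAD and SSD matching criteria named in this abstract can be illustrated with a minimal spatial-domain sketch. This is not the authors' DCT-domain recursion or their WAP architecture; it is an exhaustive block search over plain Python lists, and the names `sad`, `ssd`, `get_block`, and `full_search`, as well as the block size and search radius, are assumptions made only for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def ssd(a, b):
    """Sum of squared differences between two equally sized 2-D blocks."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(cur, ref, top, left, size=4, radius=2, cost=sad):
    """Exhaustively search ref for the best match of a block taken from cur.

    Returns ((dy, dx), best_cost), where (dy, dx) is the displacement into
    the reference frame. Candidates that fall outside ref are skipped.
    """
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            c = cost(target, get_block(ref, y, x, size))
            if best_cost is None or c < best_cost:
                best_mv, best_cost = (dy, dx), c
    return best_mv, best_cost
```

Passing `cost=ssd` switches the criterion without changing the search loop. Faster patterns such as three-step search would replace only the two nested displacement loops.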

Motion-based Fast Fractional Motion Estimation Scheme for H.264/AVC (움직임 예측을 이용한 고속 부화소 움직임 추정기)

  • Lee, Kwang-Woo;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.3
    • /
    • pp.74-79
    • /
    • 2008
  • In an H.264/AVC video encoder, motion estimation at fractional-pixel accuracy improves coding efficiency and image quality. However, it requires additional computation for fractional search and interpolation, so reducing the computational complexity of fractional search becomes more important. This paper proposes fast fractional search algorithms that combine the SASR (Simplified Adaptive Search Range) and the MSDSP (Mixed Small Diamond Search Pattern) with the predicted fractional motion vector. Compared with the full search and the prediction-based directional fractional pixel search, the proposed algorithms reduce the number of fractional search points by up to 93.2% and 81%, respectively, with a maximum PSNR loss of less than 0.04 dB. Therefore, the proposed fast search algorithms are well suited to mobile applications requiring low power and low complexity.

Joint Rate Control Scheme for Terrestrial Stereoscopic 3DTV Broadcast (스테레오스코픽 3차원 지상파 방송을 위한 합동 비트율 제어 연구)

  • Chang, Yongjun;Kim, Munchurl
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2010.11a
    • /
    • pp.14-17
    • /
    • 2010
  • Following the proliferation of three-dimensional video content and displays, many terrestrial broadcasting companies are preparing to start stereoscopic 3DTV service. In terrestrial stereoscopic broadcast, it is difficult to code and transmit two video sequences while sustaining quality as high as that of 2DTV broadcast, because of the limited bandwidth defined by existing digital TV standards such as ATSC. Thus, a terrestrial 3DTV broadcasting system with heterogeneous video coding is considered, in which the left and right images are coded with MPEG-2 and H.264/AVC, respectively, in order to achieve both high-quality broadcasting service and compatibility for existing 2DTV viewers. Without significant change to current terrestrial broadcasting systems, we propose a joint rate control scheme for stereoscopic 3DTV service. The proposed joint rate control scheme applies to the MPEG-2 encoder the quadratic rate-quantization model adopted in H.264/AVC. The controller is then designed so that the sum of the two bit streams meets the bandwidth requirement of broadcasting standards while the sum of image distortions is minimized by adjusting the quantization parameters computed from the proposed optimization scheme. In addition, we consider a condition on the quality difference between the left and right images in the optimization. Experimental results demonstrate that the proposed bit rate control scheme outperforms the rate control method in which each video coding standard uses its own bit rate control algorithm, in terms of minimizing the mean image distortion as well as the mean value and the variation of absolute image quality differences.
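
As a rough illustration of the quadratic rate-quantization model mentioned in this abstract, the sketch below uses the commonly cited form R = x1·MAD/Q + x2·MAD/Q², where Q is the quantization step and MAD is a complexity measure of the residual. The abstract does not give the exact model or the joint optimization, so the formula, the function names, and all parameter values here are assumptions; only the single-encoder inversion (target bits to quantization step) is shown.

```python
import math


def predicted_bits(qstep, mad, x1, x2):
    """Bits predicted by the assumed quadratic model R = x1*MAD/Q + x2*MAD/Q^2."""
    return x1 * mad / qstep + x2 * mad / (qstep * qstep)


def qstep_for_target(target_bits, mad, x1, x2):
    """Invert the quadratic model: find the step Q that yields target_bits.

    Rearranging gives target_bits*Q^2 - x1*MAD*Q - x2*MAD = 0; the larger
    root is returned, which is the positive one when x1 and x2 are positive.
    """
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    a = target_bits
    b = -x1 * mad
    c = -x2 * mad
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real quantization step satisfies the model")
    return (-b + math.sqrt(disc)) / (2 * a)


# Hypothetical model parameters and frame complexity, for illustration only.
q = qstep_for_target(target_bits=50000, mad=8.0, x1=20000, x2=100000)
```

A joint controller of the kind the abstract describes would run one such model per encoder and choose the two quantization parameters together under a shared bit budget; that optimization is not reproduced here.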

Using a Multi-Faced Technique SPFACS Video Object Design Analysis of The AAM Algorithm Applies Smile Detection (다면기법 SPFACS 영상객체를 이용한 AAM 알고리즘 적용 미소검출 설계 분석)

  • Choi, Byungkwan
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.11 no.3
    • /
    • pp.99-112
    • /
    • 2015
  • Digital imaging technology has advanced beyond the multimedia industry through IT convergence and has developed into a complex industry; in the field of object recognition in particular, face-related application technologies for smartphones are being actively researched. Recently, face recognition technology has been evolving into intelligent object recognition through image recognition and detection technologies, and face recognition based on 3D image object recognition applied to IP cameras is being actively studied. In this paper, we first review the essential human factors, technical factors, and trends in human object recognition technology, and then study a smile detection technique based on SPFACS (Smile Progress Facial Action Coding System) that recognizes multi-faceted objects. Study method: 1) A 3D object imaging system was designed to analyze the necessary human cognitive skills. 2) An optimal measurement method for 3D object recognition and face detection parameter identification using the AAM algorithm is proposed. 3) The results are applied to face recognition to detect the person's teeth area, and expression recognition is demonstrated through feature point extraction.

Macroblock-based Adaptive Interpolation Filter Method for Improving Coding Efficiency in H.264/AVC (H.264/AVC에서 부호화 효율 개선을 위한 매크로 블록 기반 적응 보간 필터 방법)

  • Yoon, Kun-Su;Kim, Jae-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.5
    • /
    • pp.73-83
    • /
    • 2007
  • In this paper, we propose a macroblock (MB)-based adaptive interpolation filter method for improving coding efficiency in H.264/AVC. In the proposed method, nine separable two-dimensional (2D) interpolation filters are applied to compensate precisely for motion in various directions. A cost function that considers the bit rate and distortion for coding the MB is defined, and the filter is adaptively selected per MB to minimize this cost function. In the experimental results, the proposed method achieves higher coding efficiency than conventional methods on various standard QCIF (176×144) and CIF (352×288) video test sequences. It leads to average bit rate reductions of about 6.25% (1 reference frame) and 3.46% (5 reference frames) compared to H.264/AVC.
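
The per-macroblock selection step described in this abstract can be sketched as a minimum search over candidate filters. The abstract only says that the cost considers bit rate and distortion; the Lagrangian form J = D + λ·R used below, the function name `select_filter`, and the sample numbers are assumptions for illustration, not the authors' definition.

```python
def select_filter(candidates, lam):
    """Pick the interpolation filter with the lowest rate-distortion cost.

    candidates maps a filter index to a (distortion, bits) pair measured
    when the macroblock is coded with that filter. The cost is assumed to
    be J = distortion + lam * bits. Ties go to the lowest filter index.
    """
    best_idx, best_cost = None, None
    for idx, (distortion, bits) in sorted(candidates.items()):
        cost = distortion + lam * bits
        if best_cost is None or cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, best_cost


# Hypothetical measurements for three of the candidate filters on one MB.
measurements = {0: (100.0, 10), 1: (90.0, 14), 2: (80.0, 30)}
chosen, cost = select_filter(measurements, lam=1.0)
```

A larger `lam` favors filters that cost fewer bits, while a smaller one favors lower distortion, so the chosen filter can differ from one macroblock to the next.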

A Study of Color Video Coding Using Adaptive Wavelet Transform (적응적 웨이블릿 변환을 이용한 컬러 비디오 영상 코딩에 관한 연구)

  • 김혜경;오해석
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.10b
    • /
    • pp.538-540
    • /
    • 2000
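  • This paper proposes a new algorithm for low-bit-rate video coding based on an adaptive wavelet transform. The approach rests on the observation that the coding procedure becomes more effective when the quantized wavelet coefficients are preprocessed by a mechanism that exploits redundancy within the wavelet subband structure. This paper therefore focuses on optimizing the coding stage: overlapped block motion compensation is used to ensure consistency in the motion-compensated error frame, and an improved cosine window is applied. The wavelet transform is then applied to each consistent motion-compensated error frame to achieve global energy compaction. The horizontal and vertical components of the motion vectors are encoded independently using adaptive arithmetic coding, while the significant wavelet coefficients are encoded in bit-plane order using adaptive arithmetic coding. At 28 Kbits, the proposed coder exceeds H.263 and ZTE in PSNR by approximately 2.07 and 1.38 dB on average, respectively. For full-sequence coding, the 3DWCVC method also outperforms H.263 and ZTE, by 0.35 and 0.71 dB on average, respectively.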

Human Perception of Asymmetrical Three-Dimensional Image (비대칭적 3차원 영상에 대한 인간의 인지 특성)

  • Ha, Chang-Woo;Lee, Wan-Jae;Jin, Soon-Jong;Jeong, Je-Chang
    • Journal of Broadcast Engineering
    • /
    • v.12 no.1 s.34
    • /
    • pp.41-52
    • /
    • 2007
  • 3DTV services can be seen as a general case of multi-view video, which has been receiving significant attention lately. However, the key factors that influence the success of 3DTV are the availability of content, ease of use, quality of content, and reduction of cost. This paper deals primarily with the perceptual improvement of image quality, especially based on human factors. An optimal asymmetrical coding method for binocular and multi-view images is presented. The quantitative value of the asymmetrical rate that maintains optimized subjective image quality is explored. We also analyze how the edges of 2D images affect 3D perception and propose an edge-preserving algorithm for perceptual improvement. Experimental results demonstrate that the proposed algorithm enhances subjective image quality more than conventional methods do.