Search | Korea Science

The Hardware Design of a High throughput CABAC Decoder for HEVC (높은 처리량을 갖는 HEVC CABAC 복호기 하드웨어 설계)

Kim, Hansik;Ryoo, Kwangki
- Journal of the Korea Institute of Information and Communication Engineering / v.17 no.2 / pp.385-390 / 2013
This paper proposes an efficient hardware architecture for the CABAC stage of an HEVC decoder. The architecture processes two bins per cycle while preserving the data dependencies of CABAC. Processing time is further reduced because the operations on Offset and Range are carried out while rLPS is being read from rLPSROM. An analysis of the operating frequency shows a 40% improvement over the previous architecture.
https://doi.org/10.6109/jkiice.2013.17.2.385
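The abstract above does not spell out how the Offset/Range operation overlaps with the rLPS read, so the following is only a minimal sketch of one way it could work. It assumes the usual CABAC decision `offset >= range - rLPS` and rewrites it as `rLPS >= range - offset`, so that the subtraction no longer waits for the table value. The function names are hypothetical, renormalization is omitted, and `r_lps` is passed in directly rather than read from a real table.

```python
def decode_bin_reference(rng, offset, r_lps, val_mps):
    """Textbook order: subtract rLPS from the range first, then compare."""
    r_mps = rng - r_lps
    if offset >= r_mps:
        # LPS path: the bin is the opposite of the most probable symbol.
        return 1 - val_mps, r_lps, offset - r_mps
    # MPS path: the offset is unchanged and the range shrinks to r_mps.
    return val_mps, r_mps, offset


def decode_bin_overlapped(rng, offset, r_lps, val_mps):
    """Same decision, reordered so `diff` does not depend on rLPS.

    In hardware, `diff` could be computed while the rLPS value is still
    being fetched (an assumption about the paper's datapath).
    """
    diff = rng - offset                 # needs no table value
    if r_lps >= diff:                   # equivalent to offset >= rng - r_lps
        # offset - (rng - r_lps) equals r_lps - diff
        return 1 - val_mps, r_lps, r_lps - diff
    return val_mps, rng - r_lps, offset
```

Both functions return the same `(bin, new_range, new_offset)` triple for any valid input; the only difference is which subtraction sits on the path after the table read.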

A Design of Pipelined-parallel CABAC Decoder Adaptive to HEVC Syntax Elements (HEVC 구문요소에 적응적인 파이프라인-병렬 CABAC 복호화기 설계)

Bae, Bong-Hee;Kong, Jin-Hyeung
- Journal of the Institute of Electronics and Information Engineers / v.52 no.5 / pp.155-164 / 2015
This paper describes the design and implementation of a CABAC decoder that handles HEVC syntax elements with adaptive pipelined-parallel computation. Although CABAC offers a high compression rate, its decoding performance is limited by context-based sequential computation, strong data dependencies between context models, and bin-by-bin decoding. To speed up HEVC CABAC decoding, flag-type syntax elements are adaptively pipelined by precomputing consecutive flag-type elements, and multi-bin syntax elements are decoded by processing up to three bins in parallel. In addition, to accelerate the binary arithmetic decoder by shortening the critical path, the context-model update and renormalization are precomputed in parallel for both the LPS and MPS cases, and the result is then selected according to the preceding decoding result. Simulation shows that the new HEVC CABAC architecture achieves a maximum performance of 1.01 bins/cycle, twice as fast as the conventional approach. In an ASIC design with a 65 nm library, the architecture handles 224 Mbins/sec, which is enough to decode QFHD HEVC video in real time.
https://doi.org/10.5573/ieie.2015.52.5.155
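The precompute-then-select idea in the abstract above can be illustrated with a small sketch. The state-update rule here is a toy stand-in (step up on MPS, step down by two on LPS), not the HEVC context state tables, and all names are hypothetical. The point is only the structure: both candidate next states are ready before the bin is known, so the final step is a selection rather than a computation.

```python
MAX_STATE = 62  # toy upper bound, chosen for illustration only


def next_state_mps(state):
    # Toy rule: confidence in the MPS grows by one step.
    return min(state + 1, MAX_STATE)


def next_state_lps(state):
    # Toy rule: confidence in the MPS drops by two steps.
    return max(state - 2, 0)


def update_sequential(state, bin_is_lps):
    """Baseline: wait for the bin, then compute the single update needed."""
    return next_state_lps(state) if bin_is_lps else next_state_mps(state)


def precompute_candidates(state):
    """Speculative step: compute both outcomes before the bin is decoded."""
    return next_state_mps(state), next_state_lps(state)


def select_candidate(candidates, bin_is_lps):
    """Final step: a plain two-way selection once the bin is known."""
    cand_mps, cand_lps = candidates
    return cand_lps if bin_is_lps else cand_mps
```

In hardware terms, `precompute_candidates` would run alongside the arithmetic decoding of the current bin, leaving only the multiplexer in `select_candidate` after the bin value arrives.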

CU-Level Parallelization Method for HEVC Decoder (HEVC 디코더를 위한 CU 레벨 병렬화 기법)

Noh, Gyeong Gi;Choi, Kiho;Kim, Sowon;Jang, Euee S.
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.11a / pp.38-41 / 2011
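Next-generation codec standards for video at HD resolution and above are currently being studied. A characteristic of this codec is that it adopts many complex, time-consuming tools in order to increase compression efficiency. Because this is a burden for real-time broadcasting, the experts establishing the standard are also conducting parallelization research to improve speed. Slice-level parallelization and intra-module parallelization are the methods most commonly discussed, but they suffer from time delay and additional bit allocation, respectively, so a new parallelization method that overcomes these drawbacks is needed. This paper studies a parallelization method that can overcome both time delay and additional bit allocation: it analyzes the structure of the HEVC codec to determine how parallelization should be done to avoid these drawbacks, and uses timing analysis to check whether the resulting method improves speed. We propose a CU-level parallelization method derived from the structural analysis and apply it to the HEVC Test model reference software 2.1 decoder, obtaining an average speed improvement of 19.83% in Lowdelay and 22.63% in Randomaccess for Full HD video.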

The Efficient 32×32 Inverse Transform Design for High Performance HEVC Decoder (고성능 HEVC 복호기를 위한 효율적인 32×32 역변환기 설계)

Han, Geumhee;Ryoo, Kwangki
- Journal of the Korea Institute of Information and Communication Engineering / v.17 no.4 / pp.953-958 / 2013
This paper proposes an efficient hardware architecture for the 32×32 inverse transform of an HEVC decoder. HEVC is a new image compression standard that handles much larger image sizes, such as 4K and 8K, than conventional image codecs. To process this large amount of image data effectively, it adopts various new block structures, with block sizes of 4×4, 8×8, 16×16, and 32×32. This paper presents an effective structure for the 32×32 inverse transform. The structure decomposes the 32×32 matrix into 16×16 matrices and simplifies the operations by implementing multiplication with shifters and adders. In addition, the operating frequency is lowered by using multicycle paths. The structure can also be easily adapted to a multi-size transform or a forward transform block in an HEVC codec.
https://doi.org/10.6109/jkiice.2013.17.4.953
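The two ideas named in the abstract above, splitting a transform into half-size parts and replacing multipliers with shifters and adders, can be shown at a much smaller scale. The sketch below uses the 4-point HEVC core transform coefficients (64, 83, 36) and assumes the split is the usual even/odd butterfly; it is not the paper's 32×32 datapath, it leaves out the rounding and scaling stages, and the function names are hypothetical.

```python
def mul64(x):
    return x << 6                                   # 64 = 2^6


def mul83(x):
    return (x << 6) + (x << 4) + (x << 1) + x       # 83 = 64 + 16 + 2 + 1


def mul36(x):
    return (x << 5) + (x << 2)                      # 36 = 32 + 4


def inverse4_butterfly(c):
    """4-point inverse transform via even/odd halves, shifts and adds only."""
    e0 = mul64(c[0]) + mul64(c[2])                  # even part
    e1 = mul64(c[0]) - mul64(c[2])
    o0 = mul83(c[1]) + mul36(c[3])                  # odd part
    o1 = mul36(c[1]) - mul83(c[3])
    return [e0 + o0, e1 + o1, e1 - o1, e0 - o0]


# Forward 4-point HEVC core transform matrix; the inverse uses its transpose.
M4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def inverse4_matrix(c):
    """Reference result using ordinary multiplications."""
    return [sum(M4[k][n] * c[k] for k in range(4)) for n in range(4)]
```

The butterfly needs four distinct constant products instead of sixteen general multiplications, and each product is a handful of shifted additions. The same even/odd split applied to a 32-point transform yields 16-point halves, which is presumably the decomposition the abstract refers to.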

A Interpolation Hardware Architecture for HEVC Inter-Prediction Decoder Using Parallel Process (병렬처리를 이용한 HEVC 디코더의 화면간 예측 보간 필터 하드웨어 구조)

Choi, Seung-Hwan;Bae, Jong-Woo
- Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.950-953 / 2015
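The aim of this paper is to present a hardware architecture for the interpolation filter of inter prediction in an HEVC decoder and to draw conclusions from the design and analysis results. The proposed architecture identifies the similarities among the individual filters of the interpolation filter and presents a parallel processing method for handling data quickly. It also improves performance by reusing data through registers, which reduces unnecessary accesses to external memory.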
https://doi.org/10.3745/PKIPS.y2015m04a.950

Low Complexity Motion Compensation Method for HEVC Decoder (HEVC 복호화기를 위한 저 복잡도 움직임 보상 방법)

Lee, Hoyoung;Jeon, Byeungwoo
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2013.11a / pp.176-177 / 2013
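HEVC, the latest video coding standard, achieves higher coding efficiency than the earlier H.264/AVC, but its computational complexity is also much greater, which makes real-time reconstruction of high-quality, high-resolution video difficult on mobile devices with limited resources. To address this problem, this paper proposes a low-complexity motion compensation technique that reduces the computational complexity of the HEVC decoder. The proposed method measures the similarity between reference pixels and applies a simplified interpolation filter to prediction units with high similarity. Experimental results show that the method reduces the computational complexity of the HEVC decoder by up to 13.5%, with a resulting quality degradation of about 0.48 dB, which is not large. In addition, by adjusting the threshold, the method demonstrates the feasibility of a decoder with adjustable computational complexity.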

Motion Estimation and Coding Technique using Adaptive Motion Vector Resolution in HEVC (HEVC에서의 적응적 움직임 벡터 해상도를 이용한 움직임 추정 및 부호화 기법)

Lim, Sung-Won;Lee, Ju Ock;Moon, Joo-Hee
- Journal of Broadcast Engineering / v.17 no.6 / pp.1029-1039 / 2012
In this paper, we propose a new motion estimation and coding technique using adaptive motion vector resolution. Currently, HEVC encodes video using 1/4 motion vector resolution. In pictures with highly textured regions, this resolution does not deliver sufficient performance. We therefore insert an additional 1-bit flag into the PU syntax that indicates whether the motion vector resolution is 1/4 or 1/8, so that the decoder can recognize the transmitted motion vector resolution. Experimental results show that the coding efficiency gain of the proposed method is up to 5.3% in luminance and 7.9% in chrominance. Average computational time complexity increases by about 33% in the encoder and by up to 5% in the decoder.
https://doi.org/10.5909/JBE.2012.17.6.1029
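The 1-bit resolution flag described in the abstract above can be illustrated with a minimal sketch. It assumes that flag value 1 means 1/8-pel and 0 means 1/4-pel, and that the encoder works internally in 1/8-pel units; the abstract does not state either convention, and the function names are hypothetical.

```python
from fractions import Fraction


def encode_mv(mv_eighth, use_eighth):
    """Return (flag, coded_value) for one MV component given in 1/8-pel units."""
    if use_eighth:
        return 1, mv_eighth                 # sent as-is at 1/8-pel resolution
    if mv_eighth % 2:
        raise ValueError("odd 1/8-pel value cannot be sent at 1/4-pel resolution")
    return 0, mv_eighth // 2                # sent in 1/4-pel units


def decode_mv(flag, coded_value):
    """Return the displacement in whole pels, using the flag to pick the unit."""
    denominator = 8 if flag else 4
    return Fraction(coded_value, denominator)
```

For example, a displacement of 6/8 pel can be sent at quarter-pel resolution as `(0, 3)`, while 5/8 pel requires the finer resolution and is sent as `(1, 5)`; in both cases the decoder recovers the displacement from the flag and the coded value alone.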

Performance Analysis of HEVC Parallelization Methods for High-Resolution Videos

Ryu, Hochan;Ahn, Yong-Jo;Mok, Jung-Soo;Sim, Donggyu
- IEIE Transactions on Smart Processing and Computing / v.4 no.1 / pp.28-34 / 2015
This paper evaluates several parallelization methods that can be applied to High Efficiency Video Coding (HEVC) decoders. Market demand for high-resolution video, such as Full HD and UHD, has been increasing. To meet this demand, several parallelization methods for HEVC decoders have been studied. Understanding these methods and comparing them objectively is crucial to real-time decoding of high-resolution video. This paper introduces the parallelization methods that can be used in HEVC decoders and evaluates them comparatively. The experimental results show average speed-up factors of up to 4.59, 4.00, 2.20, and 3.16 for tile-level parallelism, wavefront parallel processing (WPP), frame-level parallelism, and 2D-wavefront parallelism, respectively.
https://doi.org/10.5573/IEIESPC.2015.4.1.028

Improving Encoder Complexity and Coding Method of the Split Information in HEVC (HEVC에서 인코더 계산 복잡도 개선 및 분할 정보 부호화 방법)

Lee, Han-Soo;Kim, Kyung-Yong;Kim, Tae-Ryong;Park, Gwang-Hoon;Kim, Hui-Yong;Lim, Sung-Chang;Lee, Jin-Ho
- Journal of Broadcast Engineering / v.17 no.2 / pp.325-343 / 2012
This paper proposes a coding method that predicts the split structure of an LCU in the current frame from the reference frame or the temporally previous frame. The HEVC encoder determines the split structure according to the image characteristics within the LCU, which is the basic unit of the CU. The split structure of the current LCU is very similar to that of the collocated LCU in the reference frame or the temporally previous frame. This paper therefore proposes a method that reduces encoder computational complexity by predicting the split structure of the current LCU from that of the collocated LCU. It also proposes a method that reduces the BD-Bitrate by coding the CU split information after prediction. In simulations that changed only the encoder, mean encoder computational complexity was 21.3% lower, the change in decoder computational complexity was negligible, and the BD-Bitrate increased by at most 0.6%. With the method that changes the encoder, bitstream, and decoder, mean encoder computational complexity was 22% lower, the change in decoder computational complexity was negligible, and the BD-Bitrate improved by up to 0.3%. These results indicate that the proposed method is superior to the conventional method.
https://doi.org/10.5909/JEB.2012.17.2.325

Parallel Method for HEVC Deblocking Filter based on Coding Unit Depth Information (코딩 유닛 깊이 정보를 이용한 HEVC 디블록킹 필터의 병렬화 기법)

Jo, Hyun-Ho;Ryu, Eun-Kyung;Nam, Jung-Hak;Sim, Dong-Gyu;Kim, Doo-Hyun;Song, Joon-Ho
- Journal of Broadcast Engineering / v.17 no.5 / pp.742-755 / 2012
In this paper, we propose a parallel deblocking algorithm that resolves the workload imbalance that arises when the deblocking filter of a high efficiency video coding (HEVC) decoder is parallelized. In HEVC, the deblocking filter, one of the in-loop filters, performs two-step filtering, first on vertical edges and then on horizontal edges. Deblocking can be performed at high speed through data-level parallelism because there is no dependency between adjacent edges in the filtering process. However, workloads can be imbalanced among regions even when each region is allocated the same amount of data, which causes a performance loss in decoder parallelization. We solve the workload imbalance by predicting the complexity of deblocking filtering from coding unit (CU) depth information at each coding tree block (CTB) and allocating the same amount of workload to each core. Experimental results show that the proposed method achieves an average time saving (ATS) of 64.3% compared to single-core deblocking filtering, and an ATS of 6.7% on average and 13.5% at maximum compared to conventional uniform data-level parallelism.
https://doi.org/10.5909/JBE.2012.17.5.742
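The difference between uniform data allocation and workload-based allocation in the abstract above can be sketched as follows. The cost model is an assumption made for illustration: each CU at depth d in a 64×64 CTB contributes 64 >> d, a rough proxy for boundary length, so more deeply split CTBs cost more. The paper's actual complexity predictor is not given in the abstract, and all function names are hypothetical.

```python
def estimate_cost(cu_depths):
    """Hypothetical cost of one CTB: sum of CU side lengths (64 >> depth)."""
    return sum(64 >> d for d in cu_depths)


def uniform_split(costs, cores):
    """Baseline: every core gets the same number of consecutive CTBs."""
    n = len(costs)
    bounds = [i * n // cores for i in range(cores + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(cores)]


def balanced_split(costs, cores):
    """Assign consecutive CTBs so each core gets a similar estimated cost.

    A CTB goes to the region in which the midpoint of its cost interval
    falls on the cumulative-cost axis, which keeps regions contiguous.
    """
    total = sum(costs)
    if total == 0:
        return uniform_split(costs, cores)
    regions = [[] for _ in range(cores)]
    prefix = 0
    for idx, cost in enumerate(costs):
        region = min(cores - 1, (2 * prefix + cost) * cores // (2 * total))
        regions[region].append(idx)
        prefix += cost
    return regions


def max_load(costs, regions):
    """Estimated cost of the busiest core, which bounds the parallel run time."""
    return max(sum(costs[i] for i in region) for region in regions)
```

With six CTBs whose estimated costs are 64, 64, 64, 256, 256, and 64 on two cores, the uniform split puts both expensive CTBs on the same core, while the balanced split moves the boundary so that the busiest core has less estimated work.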
