Complexity-based Sample Adaptive Offset Parallelism

  • Ryu, Eun-Kyung (Department of Computer Engineering, Kwangwoon University) ;
  • Jo, Hyun-Ho (Department of Computer Engineering, Kwangwoon University) ;
  • Seo, Jung-Han (Department of Computer Engineering, Kwangwoon University) ;
  • Sim, Dong-Gyu (Department of Computer Engineering, Kwangwoon University) ;
  • Kim, Doo-Hyun (Samsung Advanced Institute of Technology) ;
  • Song, Joon-Ho (Samsung Advanced Institute of Technology)
  • Received : 2012.02.01
  • Reviewed : 2012.04.12
  • Published : 2012.05.30

Abstract

This paper proposes a complexity-based parallelization method for sample adaptive offset (SAO), one of the in-loop filters of High Efficiency Video Coding (HEVC). SAO partitions a picture into multiple SAO regions with a quad-tree and transmits, for each region, an offset that corrects the reconstruction error of the decoded samples. SAO can be accelerated through data-level parallelization, but parallelizing at the SAO-region level causes workload imbalance on multi-core platforms because the region sizes are not uniform. In addition, whether filtering is applied is decided per region, so workload imbalance can occur even when the SAO regions are distributed evenly across the cores. We therefore use the largest coding unit (LCU), the smallest unit of an SAO region, as the basic unit of SAO processing and predict the complexity of each LCU in advance from its SAO parameters. Based on the predicted complexity, regions are assigned to the cores adaptively so that each core receives a similar workload. The proposed method is 2.38 times faster than sequential SAO and 21% faster than SAO parallelized with an even division of regions.
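The load-balancing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the mode names (`off`, `band`, `edge`), the cost weights in `SAO_COST`, and the rule of cutting the LCU sequence into contiguous runs at equal shares of cumulative predicted cost are all assumptions made for illustration. The abstract only states that per-LCU complexity is predicted from the SAO parameters and used to balance the cores.

```python
from bisect import bisect_left
from itertools import accumulate

# Hypothetical relative cost of filtering one LCU per SAO mode.
# Illustrative assumptions, not measurements from the paper.
SAO_COST = {"off": 0, "band": 1, "edge": 2}


def predict_costs(lcu_modes):
    """Map each LCU's SAO mode to its predicted relative cost."""
    return [SAO_COST[mode] for mode in lcu_modes]


def split_even(num_lcus, num_cores):
    """Baseline: each core gets (almost) the same number of consecutive LCUs."""
    bounds = [i * num_lcus // num_cores for i in range(num_cores + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def split_by_cost(costs, num_cores):
    """Cut the LCU sequence where the cumulative predicted cost first reaches
    each multiple of total / num_cores, so cores get similar predicted work."""
    prefix = [0] + list(accumulate(costs))  # prefix[b] == sum(costs[:b])
    total = prefix[-1]
    bounds = [0]
    for k in range(1, num_cores):
        target = total * k / num_cores
        # First boundary at or after the previous cut whose prefix sum >= target.
        bounds.append(bisect_left(prefix, target, bounds[-1]))
    bounds.append(len(costs))
    return list(zip(bounds[:-1], bounds[1:]))


def core_loads(costs, ranges):
    """Predicted work per core for half-open LCU index ranges [start, end)."""
    return [sum(costs[start:end]) for start, end in ranges]


if __name__ == "__main__":
    # Toy picture of 16 LCUs: SAO is off in the first half.
    modes = ["off"] * 8 + ["edge"] * 4 + ["band"] * 4
    costs = predict_costs(modes)
    for name, ranges in (("even", split_even(len(costs), 4)),
                         ("by cost", split_by_cost(costs, 4))):
        print(name, ranges, core_loads(costs, ranges))
```

In this toy input the even split leaves two cores with no predicted work while one core carries a load of 8, whereas the cost-based split brings the largest per-core load down to 4. Contiguous runs are used here only because they keep each core's memory accesses local; other assignment rules would fit the abstract's description equally well.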
