Parallel Deblocking Filter Based on Modified Order of Accessing the Coding Tree Units for HEVC on Multicore Processor

  • Lei, Haiwei (Key Laboratory of Instrumentation Science & Dynamic Measurement, Ministry of Education, North University of China)
  • Liu, Wenyi (Key Laboratory of Instrumentation Science & Dynamic Measurement, Ministry of Education, North University of China)
  • Wang, Anhong (School of Electronic Information Engineering, Taiyuan University of Science and Technology)
  • Received: 2016.04.11
  • Accepted: 2017.01.15
  • Published: 2017.03.31

Abstract

The deblocking filter (DF) reduces blocking artifacts in encoded video sequences and thereby significantly improves both the subjective and the objective quality of videos. Statistics show that the DF accounts for 5-18% of the total decoding time in High Efficiency Video Coding (HEVC). Speeding up the DF will therefore improve codec performance, especially for the decoder. In view of the rapid development of multicore technology, we analyze the data dependencies between adjacent coding tree units (CTUs) and propose a parallel DF scheme based on a modified order of accessing the CTUs. This order enables the DF to run in parallel, providing accelerated performance, more flexibility in the degree of parallelism, and finer parallel granularity. We additionally solve the problems of variable privatization and thread synchronization that arise when parallelizing the DF. Finally, the DF module of the HM16.1 reference software is parallelized using OpenMP. The acceleration performance is tested experimentally with various numbers of cores, and the results show that the proposed scheme is effective at speeding up the DF.
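
To put the 5-18% figure in perspective, the sketch below applies Amdahl's law to estimate how much the whole decoder could gain when only the DF stage is accelerated. It is an illustration, not a result from the paper: the function name `overall_speedup` is hypothetical, and the assumption that the DF speeds up linearly with the number of cores is an idealization chosen for the example.

```python
def overall_speedup(df_fraction: float, df_speedup: float) -> float:
    """Amdahl's law: speedup of the whole decoder when only the DF is accelerated.

    df_fraction -- share of total decoding time spent in the DF (0.05 to 0.18
                   according to the abstract)
    df_speedup  -- speedup of the DF stage alone (assumed here, not measured)
    """
    if not 0.0 <= df_fraction <= 1.0:
        raise ValueError("df_fraction must lie in [0, 1]")
    if df_speedup <= 0.0:
        raise ValueError("df_speedup must be positive")
    # The non-DF part keeps its original cost; the DF part shrinks by df_speedup.
    return 1.0 / ((1.0 - df_fraction) + df_fraction / df_speedup)


if __name__ == "__main__":
    # Idealized assumption: the DF scales linearly with the core count.
    for fraction in (0.05, 0.18):
        for cores in (2, 4, 8):
            gain = overall_speedup(fraction, float(cores))
            print(f"DF share {fraction:.0%}, {cores} cores: overall speedup {gain:.3f}x")
```

Under this model the overall gain can never exceed 1 / (1 - df_fraction), which is why the DF speedup matters most for streams where the DF share is near the upper end of the reported range.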
