Context-Adaptive Intra Prediction Model Training and Its Coding Performance Analysis

  • Moon, Gihwa (Korea Aerospace University, School of Electronics and Information Engineering) ;
  • Park, Dohyeon (Korea Aerospace University, School of Electronics and Information Engineering) ;
  • Kim, Jae-Gon (Korea Aerospace University, School of Electronics and Information Engineering)
  • Received : 2021.08.23
  • Accepted : 2022.04.25
  • Published : 2022.05.30

Abstract

Advances in deep learning and artificial neural networks have prompted active research on applying neural networks to video coding. In particular, deep learning-based intra prediction is being studied as a way to overcome the performance limits of existing intra prediction techniques. This paper presents a training method for a context-adaptive neural network-based intra prediction model and analyzes its coding performance. Specifically, we implement and train a known convolutional neural network (CNN)-based intra prediction model that predicts the current block from the contextual information of its neighboring reference blocks. We then integrate the trained model into HM16.19, the reference software of High Efficiency Video Coding (HEVC), as an additional intra prediction mode and evaluate its coding performance. Experimental results show that the trained model achieves a 0.28% BD-rate saving over HEVC in the All Intra (AI) configuration. We also report how the coding performance changes when training takes the block partitioning structure into account.
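The BD-rate figure quoted above is conventionally obtained with the Bjøntegaard delta method, which can be sketched as follows. This is a minimal illustration of the common cubic-fit variant, not the authors' evaluation script; the function name `bd_rate` and the sample rate/PSNR points are hypothetical. A negative result means the tested codec needs fewer bits than the anchor at equal PSNR.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (%) of the test codec vs. the anchor.

    Hypothetical helper: fits log-rate as a cubic polynomial of PSNR for
    each codec and averages the gap over the overlapping PSNR range.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Cubic fit of log(rate) as a function of PSNR (four points per curve).
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Only the PSNR interval covered by both curves is compared.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())

    # Definite integrals of both fitted curves over [lo, hi].
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0


# Illustrative points only (not measurements from the paper).
anchor_rate = [1000.0, 2000.0, 4000.0, 8000.0]
anchor_psnr = [32.0, 35.0, 38.0, 41.0]
test_rate = [0.9 * r for r in anchor_rate]
print(bd_rate(anchor_rate, anchor_psnr, test_rate, anchor_psnr))
```

In this made-up example the test curve uses 10% fewer bits at every PSNR point, so the function returns a value close to -10. A result of about -0.28 would correspond to the saving reported in the abstract.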

Acknowledgement

This work was supported by an Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea government (MSIT) (2017-0-00486).
