Accurate Prediction of VVC Intra-coded Block using Convolutional Neural Network

A Technique for Improving Coding Efficiency through Deep Learning-Based Refinement of the Prediction Block in VVC Intra Prediction

  • Jeong, Hye-Sun (Department of Electronic and Electrical Engineering and Graduate Program in Smart Factory, Ewha Womans University) ;
  • Kang, Je-Won (Department of Electronic and Electrical Engineering and Graduate Program in Smart Factory, Ewha Womans University)
  • Received : 2022.04.18
  • Accepted : 2022.06.27
  • Published : 2022.07.30

Abstract

In this paper, we propose a novel intra-prediction method that uses a convolutional neural network (CNN) to improve the quality of the predicted block in VVC. The proposed algorithm follows a two-step procedure. First, an input prediction block is generated using one of the VVC intra-prediction modes. Second, the prediction block is refined by a CNN model that takes as input the prediction block itself and the reconstructed reference samples along its boundary. The refined block reduces the residual signal and thereby improves coding efficiency; a CU-level flag signals whether the refinement is enabled. Experimental results demonstrate that the proposed method achieves improved rate-distortion performance compared with the VVC reference software, i.e., VTM version 10.0.

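In this paper, we propose an intra-prediction method that uses a convolutional neural network to refine the prediction block obtained by VVC intra prediction and thereby further reduce the residual signal. Conventional intra-prediction methods generate the prediction block from neighboring reconstructed reference samples according to a small set of fixed rules, which makes it difficult to produce accurate prediction blocks for complex content. In addition, because less information is available from the reference samples than from temporally neighboring information, intra prediction yields lower coding performance than inter prediction. To address these problems, we apply a CNN to the prediction block generated by intra prediction in the conventional video coding process, reducing the difference signal between the original block and the prediction block. The encoder also encodes a flag indicating whether the proposed algorithm is enabled. The proposed intra-prediction method provides improved compression performance for the luma component compared with VTM version 10.0, the reference model of the latest video compression standard, Versatile Video Coding.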
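The two-step procedure and the CU-level flag described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or from VTM: the function names are hypothetical, a simplified DC fill stands in for the VVC intra-prediction mode, a fixed blending rule stands in for the trained CNN, and the flag is chosen by comparing the sum of squared errors rather than by a full rate-distortion cost.

```python
import numpy as np


def dc_intra_prediction(top, left):
    """Simplified DC-style prediction: fill the block with the mean reference value."""
    dc = int(round((top.sum() + left.sum()) / (top.size + left.size)))
    return np.full((left.size, top.size), dc, dtype=np.int64)


def assemble_network_input(pred, top, left, corner):
    """Stack the prediction block with its reconstructed boundary samples.

    Row 0 and column 0 hold the reference samples; the rest holds the prediction.
    """
    h, w = pred.shape
    canvas = np.zeros((h + 1, w + 1), dtype=np.int64)
    canvas[0, 0] = corner
    canvas[0, 1:] = top
    canvas[1:, 0] = left
    canvas[1:, 1:] = pred
    return canvas


def placeholder_refiner(canvas):
    """Hypothetical stand-in for the CNN: blend each predicted sample with the
    reference samples directly above and to the left of its column and row."""
    top = canvas[0, 1:]
    left = canvas[1:, 0]
    pred = canvas[1:, 1:]
    return (2 * pred + top[None, :] + left[:, None] + 2) // 4


def choose_prediction(original, pred, refined):
    """Encoder-side decision: enable refinement only if it lowers the residual energy.

    Assumption: plain SSE is used here; a real encoder would also weigh the bit cost.
    """
    sse_pred = int(((original - pred) ** 2).sum())
    sse_refined = int(((original - refined) ** 2).sum())
    flag = sse_refined < sse_pred
    return (refined if flag else pred), flag


# Toy 4x4 block with a diagonal gradient and matching reference samples.
top = np.array([10, 20, 30, 40], dtype=np.int64)
left = np.array([10, 20, 30, 40], dtype=np.int64)
rows, cols = np.indices((4, 4))
original = 10 * (rows + cols + 2)

pred = dc_intra_prediction(top, left)                      # step 1: conventional prediction
canvas = assemble_network_input(pred, top, left, corner=0)
refined = placeholder_refiner(canvas)                      # step 2: refinement
best, refine_flag = choose_prediction(original, pred, refined)
print("refinement enabled:", refine_flag)
```

In this sketch the flag is decided per block, which mirrors the role of the CU-level flag: the decoder would read the flag and apply the same refinement only when it is set, so blocks where the refinement does not help fall back to the conventional prediction.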

Acknowledgement

This work was partly supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00920, Development of Ultra High Resolution Unstructured Plenoptic Video Storage/Compression/Streaming Technology for Medium to Large Space, 50%) and the ITRC (Information Technology Research Center) support program (IITP-2022-2020-0-01460, 50%) supervised by the IITP.
