Luma Mapping Function Generation Method Using Attention Map of Convolutional Neural Network in Versatile Video Coding Encoder

  • Received : 2021.05.12
  • Accepted : 2021.06.16
  • Published : 2021.07.30

Abstract

This paper proposes a method for generating the luma mapping function in order to improve the coding efficiency of luma mapping in the LMCS (luma mapping with chroma scaling) tool of VVC. The existing LMCS uses local spatial variance to reflect local features. The proposed method multiplies this local spatial variance by the attention map of a convolutional neural network so that cognitive and perceptual features are reflected as well. To evaluate the proposed method, its BD-rate is compared against VTM-12.0 on classes A1, A2, B, C, and D of the MPEG standard test sequences under the AI (All Intra) condition. In the experiments, the proposed method achieves an average BD-rate of -0.07% for the luma component relative to VTM-12.0, with encoding and decoding times that are almost the same.
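The core idea stated in the abstract, weighting local spatial variance by a CNN attention map, can be illustrated with a minimal sketch. This is not the authors' implementation and not VTM code: the function names `local_variance` and `attention_weighted_variance`, the 3x3 window, the border clamping, and the assumption that the attention map is already resized to the picture and normalized to [0, 1] are all hypothetical choices made for illustration.

```python
def local_variance(img, radius=1):
    """Variance of the (2*radius+1)^2 window around each pixel.

    Windows are clipped at the picture border (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mean = sum(vals) / len(vals)
            out[y][x] = sum((v - mean) ** 2 for v in vals) / len(vals)
    return out


def attention_weighted_variance(img, attention, radius=1):
    """Multiply each pixel's local variance by its attention value.

    `attention` is assumed to have the same size as `img` and values in [0, 1].
    """
    var = local_variance(img, radius)
    return [
        [v * a for v, a in zip(var_row, att_row)]
        for var_row, att_row in zip(var, attention)
    ]


# Toy 4x4 luma picture with one bright sample, and a hypothetical attention map.
luma = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 90, 10],
    [10, 10, 10, 10],
]
attention = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
weighted = attention_weighted_variance(luma, attention)
```

In this toy example, samples with zero attention contribute nothing, while the variance around the bright sample is kept in full where attention is 1.0 and halved where it is 0.5. How the encoder turns such weighted statistics into the final mapping function is outside the scope of this sketch.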

Acknowledgement

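This research was supported by the Ministry of Science and ICT (MSIT) and the Institute of Information & Communications Technology Planning & Evaluation (IITP) through the SW-Centered University support program (2017-0-00096) and the University ICT Research Center support program (IITP-2021-2016-0-00288).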
