http://dx.doi.org/10.9717/kmms.2022.25.6.783

Multi-Layer Perceptron Based Ternary Tree Partitioning Decision Method for Versatile Video Coding  

Lee, Taesik (Department of Computer Engineering, Dong-A University)
Jun, Dongsan (Department of Computer Engineering, Dong-A University)
Abstract
Versatile Video Coding (VVC) is the latest video coding standard, completed in 2020 by the Joint Video Experts Team (JVET) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Although VVC delivers high coding performance, determining the optimal block structure during encoding is computationally very expensive. In this paper, we propose a fast ternary tree decision method that uses two multi-layer perceptron neural networks, named STH-NN and STV-NN, each taking a 7-node input vector. After training, STH-NN and STV-NN achieved accuracies of 85% and 91%, respectively. Experimental results show that the proposed method reduces encoding complexity by up to 25% with negligible coding loss compared to the VVC test model (VTM).
Keywords
Block Structure; Fast Encoding; Multi-Layer Perceptron; Neural Network; Ternary Tree; Versatile Video Coding
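
To make the idea in the abstract concrete, the following is a minimal sketch of how two small multi-layer perceptrons with 7 input nodes could gate the horizontal and vertical ternary tree splits. Only the 7-node input and the network names STH-NN and STV-NN come from the abstract. The hidden layer width, the activation functions, the meaning of the 7 features, the interpretation of the output as a "test this split" probability, the 0.5 threshold, the labels TT_H and TT_V, and the random (untrained) weights are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random

NUM_INPUTS = 7   # from the abstract: each network takes a 7-node input vector
NUM_HIDDEN = 8   # assumption: the abstract does not give the hidden layer width


def init_mlp(seed, num_inputs=NUM_INPUTS, num_hidden=NUM_HIDDEN):
    """Build placeholder (untrained) weights for a one-hidden-layer MLP."""
    rng = random.Random(seed)
    # Assumption: a Xavier-style uniform range, used only as a sensible placeholder.
    limit1 = math.sqrt(6.0 / (num_inputs + num_hidden))
    limit2 = math.sqrt(6.0 / (num_hidden + 1))
    return {
        "w1": [[rng.uniform(-limit1, limit1) for _ in range(num_inputs)]
               for _ in range(num_hidden)],
        "b1": [0.0] * num_hidden,
        "w2": [rng.uniform(-limit2, limit2) for _ in range(num_hidden)],
        "b2": 0.0,
    }


def predict_split_probability(params, features):
    """Forward pass: ReLU hidden layer, sigmoid output in (0, 1)."""
    if len(features) != len(params["w1"][0]):
        raise ValueError("feature vector length does not match the input layer")
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(params["w1"], params["b1"])
    ]
    logit = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


def ternary_splits_to_test(sth_nn, stv_nn, features, threshold=0.5):
    """Return the ternary splits the encoder would still evaluate.

    Assumption: a network output below the threshold means the matching
    ternary split is skipped, which is where the encoding time is saved.
    """
    splits = []
    if predict_split_probability(sth_nn, features) >= threshold:
        splits.append("TT_H")  # hypothetical label for the horizontal ternary split
    if predict_split_probability(stv_nn, features) >= threshold:
        splits.append("TT_V")  # hypothetical label for the vertical ternary split
    return splits


# Hypothetical usage with made-up, normalised feature values.
sth_nn = init_mlp(seed=0)
stv_nn = init_mlp(seed=1)
example_features = [0.5, 0.25, 0.1, 0.8, 0.3, 0.6, 0.9]
print(ternary_splits_to_test(sth_nn, stv_nn, example_features))
```

With trained weights in place of the random ones, the threshold would control the trade-off the abstract reports: a higher threshold skips more ternary splits and saves more encoding time, at the risk of a larger coding loss.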