http://dx.doi.org/10.3837/tiis.2021.12.012

A Multi-category Task for Bitrate Interval Prediction with the Target Perceptual Quality

Yang, Zhenwei (School of Communication and Information Engineering, Shanghai University)
Shen, Liquan (Shanghai Institute for Advanced Communication and Data Science, Shanghai University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.15, no.12, 2021, pp. 4476-4491

Abstract

Video service providers often contend with constrained user networks when transmitting video streams, and they strive to provide users with superior video quality in a limited-bitrate environment. Doing so requires accurately determining the target bitrate range of a video under different quality requirements. Several schemes have recently been proposed to meet this requirement, but they do not take the influence of visual factors into account. In this paper, we propose a new multi-category model that uses machine learning to accurately predict the target bitrate range for a target visual quality. First, a dataset is constructed for training the multi-category models; it defines the quality score ladders and the corresponding bitrate-interval categories. Second, several types of spatial-temporal features related to the VMAF evaluation metric and to visual factors are extracted and statistically processed for classification. Finally, bitrate prediction models trained on the dataset with a random forest classifier are used to predict the target bitrate of input videos for a target video quality. The classification accuracy of the model reaches 0.705, and video encoded at the bitrate predicted by the model can achieve the target perceptual quality.
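As a rough illustration of two steps named in the abstract, the sketch below shows how a bitrate might be mapped to a bitrate-interval category (the class label) and how a per-frame feature sequence might be pooled into a fixed-length vector. It is only a minimal sketch under stated assumptions: the interval edges, the function names, and the choice of pooling statistics are hypothetical, since the abstract does not give the paper's actual boundaries or feature definitions.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Hypothetical interval edges in kbps (assumption: not taken from the paper).
BITRATE_EDGES_KBPS = [500, 1000, 2000, 4000]


def bitrate_to_category(bitrate_kbps, edges=BITRATE_EDGES_KBPS):
    """Return the index of the half-open interval [low, high) containing the bitrate.

    With the default edges: 0 -> [0, 500), 1 -> [500, 1000), ..., 4 -> [4000, inf).
    """
    return bisect_right(edges, bitrate_kbps)


def category_to_interval(category, edges=BITRATE_EDGES_KBPS):
    """Map a predicted category index back to its (low, high) bitrate interval."""
    low = 0 if category == 0 else edges[category - 1]
    high = float("inf") if category == len(edges) else edges[category]
    return low, high


def pool_features(per_frame_values):
    """Summarize a per-frame feature sequence as [mean, std, min, max].

    The choice of statistics is an assumption made for illustration.
    """
    return [
        mean(per_frame_values),
        pstdev(per_frame_values),
        min(per_frame_values),
        max(per_frame_values),
    ]


if __name__ == "__main__":
    label = bitrate_to_category(750)
    print(label, category_to_interval(label))
    print(pool_features([1, 2, 3]))
```

Labels and pooled vectors of this kind could then be passed to any multi-class classifier, for example a random forest, with one model trained per target quality level; that pairing is an assumption about how the pieces fit together rather than a description of the authors' code.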

Keywords

Perceptual coding; Bitrate prediction; Rate control; Machine Learning; Feature Extraction;
