http://dx.doi.org/10.5909/JBE.2021.26.5.599

Reinforcement Learning based Inactive Region Padding Method  

Kim, Dongsin (Korea Aerospace University)
Uddin, Kutub (Korea Aerospace University)
Oh, Byung Tae (Korea Aerospace University)
Publication Information
Journal of Broadcast Engineering, v.26, no.5, 2021, pp. 599-607
Abstract
An inactive region is a region filled with invalid pixel values used to represent a specific image. In general, inactive regions occur when non-rectangular images are converted to a rectangular format, especially when 3D images are represented in 2D. Because these inactive regions significantly degrade compression efficiency, filtering approaches are often applied to the boundaries between active and inactive regions. However, image characteristics are not carefully considered during such filtering. In the proposed method, inactive regions are padded through reinforcement learning, which can take both the compression process and the image characteristics into account. Experimental results show that the proposed method performs an average of 3.4% better than the conventional padding method.
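To make the problem setting concrete, the sketch below fills inactive pixels (marked by a validity mask) with the nearest active pixel along each row. This is only an illustrative baseline in the spirit of the conventional padding the abstract contrasts with, not the paper's reinforcement-learning method; the function name `pad_inactive` and the row-wise fill order are assumptions of this example.

```python
import numpy as np

def pad_inactive(image, mask):
    """Fill inactive pixels (mask == 0) with the value of the nearest
    active pixel in the same row (forward pass, then backward pass).
    Rows with no active pixel are left unchanged. Illustrative only."""
    out = image.astype(float).copy()
    h, w = mask.shape
    for y in range(h):
        # Forward fill: propagate the last active value rightward.
        last = None
        for x in range(w):
            if mask[y, x]:
                last = out[y, x]
            elif last is not None:
                out[y, x] = last
        # Backward fill: handle inactive pixels before the first active one.
        last = None
        for x in range(w - 1, -1, -1):
            if mask[y, x]:
                last = out[y, x]
            elif last is not None:
                out[y, x] = last
    return out

img = np.array([[10, 0, 0],
                [20, 0, 0]])
msk = np.array([[1, 0, 0],
                [1, 0, 0]])
print(pad_inactive(img, msk))  # each row becomes a constant active value
```

A content-aware method such as the proposed one would instead choose the padded values with the downstream codec and image statistics in mind, rather than simply replicating boundary pixels.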
Keywords
Inactive region padding; Reinforcement learning; Deep learning; Immersive video;
References
1 M. Yu, H. Lakshman, and B. Girod, "A framework to evaluate omnidirectional video coding schemes," in IEEE International Symposium on Mixed and Augmented Reality, pp. 31-36, 2015.
2 B. Salahieh, B. Kroon, J. Jung, M. Domanski (Eds.), "Test model 2 for Immersive Video," ISO/IEC JTC1/SC29/WG11, N18577, July 2019.
3 Y. Ye, E. Alshina, and J. Boyce, "Algorithm descriptions of projection format conversion and video quality metrics in 360Lib (Version 5)," Joint Video Exploration Team of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, JVET-H1004, Oct. 2017.
4 G. Sullivan, J. Ohm, W. Han, and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Transactions on Circuits and Systems for Video Technology, Vol.22, No.12, pp.1649-1668, December 2012.
5 A. Eldesokey, M. Felsberg, and F. S. Khan, "Confidence propagation through CNNs for guided sparse depth regression," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.42, No.10, pp.2423-2436, 2019.
6 A. Abbas, "AHG8: An Update on RSP Projection," Joint Video Exploration Team of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, JVET-H0056, Oct. 2017.
7 V. Mnih et al., "Asynchronous methods for deep reinforcement learning," in International Conference on Machine Learning (ICML), PMLR, 2016.
8 Y. Sun, A. Lu, and L. Yu, "AHG8: WS-PSNR for 360 video objective quality evaluation," in Joint Video Exploration Team of ITU-T SG16WP3 and ISO/IEC JTC1/SC29/WG11, JVET-D0040, Chengdu, 2016.
9 Y.-H. Lee, H.-C. Lin, J.-L. Lin, S.-K. Chang, C.-C. Ju, "EE4: ERP/EAP-based segmented sphere projection with different padding sizes," Joint Video Exploration Team of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, JVET-G0097, Jul. 2017.
10 I. Goodfellow et al., "Generative adversarial nets," in Advances in Neural Information Processing Systems 27, 2014.
11 H. Takeda, S. Farsiu, and P. Milanfar, "Kernel regression for image processing and reconstruction," IEEE Transactions on Image Processing, Vol.16, No.2, pp.349-366, 2007.
12 G. Bjontegaard, "Calculation of average PSNR differences between RD-curves," VCEG-M33, 2001.
13 J. Lee, J. Park, H. Choi, J. Byeon, and D. Sim, "Overview of VVC", Broadcasting and Media Magazine, Vol.24, No.4, pp.10-25, October 2019.