Recursive block splitting in feature-driven decoder-side depth estimation

  • Szydelko, Błazej (Institute of Multimedia Telecommunications, Poznan University of Technology) ;
  • Dziembowski, Adrian (Institute of Multimedia Telecommunications, Poznan University of Technology) ;
  • Mieloch, Dawid (Institute of Multimedia Telecommunications, Poznan University of Technology) ;
  • Domanski, Marek (Institute of Multimedia Telecommunications, Poznan University of Technology) ;
  • Lee, Gwangsoon (Immersive Media Research Section, Electronics and Telecommunications Research Institute)
  • Received : 2021.09.01
  • Accepted : 2021.12.17
  • Published : 2022.02.01

Abstract

This paper presents a study on the use of encoder-derived features in decoder-side depth estimation. In the considered multiview video coding scheme, depth maps (which carry the geometry of a three-dimensional scene) do not have to be transmitted: only a set of input views and their parameters are compressed and packed into the bitstream, together with a set of features that facilitate geometry estimation in the decoder. The paper proposes a novel recursive block splitting method for the feature extraction process and evaluates different scenarios of feature-driven decoder-side depth estimation by assessing their influence on the metadata bitrate, the quality of the reconstructed video, and the depth estimation time. As efficient encoding of multiview sequences has become one of the main goals of the video coding community, the experimental results are based on the "geometry absent" profile of the forthcoming MPEG Immersive Video standard. The results show that the quality of views synthesized using the proposed recursive block splitting surpasses that of the state-of-the-art approach.
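To give a concrete picture of what recursive block splitting of an encoder-side depth map could look like, the sketch below implements a simple quadtree-style recursion: a block is emitted as a single per-block feature when it is sufficiently homogeneous, and split into four sub-blocks otherwise. The variance-based split criterion, the thresholds, and the use of the mean depth as the block feature are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of recursive (quadtree-style) block splitting for
# encoder-side feature extraction. The split criterion, thresholds, and
# per-block feature below are assumptions made for illustration only.
import numpy as np


def split_block(depth, x, y, w, h, var_thresh=25.0, min_size=16):
    """Recursively split a block of the depth map until each leaf block is
    homogeneous enough, then emit one feature per leaf block."""
    block = depth[y:y + h, x:x + w]
    # Homogeneous block (or smallest allowed size): emit a single feature,
    # here the mean depth of the block as a representative value.
    if (w <= min_size and h <= min_size) or block.var() <= var_thresh:
        return [(x, y, w, h, float(block.mean()))]

    # Otherwise split into four sub-blocks and recurse.
    features = []
    half_w, half_h = max(w // 2, 1), max(h // 2, 1)
    for sx, sy, sw, sh in (
        (x, y, half_w, half_h),
        (x + half_w, y, w - half_w, half_h),
        (x, y + half_h, half_w, h - half_h),
        (x + half_w, y + half_h, w - half_w, h - half_h),
    ):
        if sw > 0 and sh > 0:
            features.extend(
                split_block(depth, sx, sy, sw, sh, var_thresh, min_size))
    return features


if __name__ == "__main__":
    # Toy depth map: a flat background with one closer rectangular object.
    depth = np.full((128, 128), 100.0)
    depth[40:90, 30:80] = 40.0
    leaves = split_block(depth, 0, 0, 128, 128)
    print(f"{len(leaves)} leaf blocks (fewer blocks where depth is flat)")
```

In such a scheme, flat regions collapse into a few large blocks while depth discontinuities are covered by small blocks, so the metadata bitrate adapts to the local complexity of the scene geometry.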

Keywords

Acknowledgments

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2018-0-00207, Immersive Media Research Laboratory).
