View synthesis with sparse light field for 6DoF immersive video

  • Kwak, Sangwoon (Media Research Division, Electronics and Telecommunications Research Institute) ;
  • Yun, Joungil (Media Research Division, Electronics and Telecommunications Research Institute) ;
  • Jeong, Jun-Young (Media Research Division, Electronics and Telecommunications Research Institute) ;
  • Kim, Youngwook (Department of Computer Science and Engineering, Sogang University) ;
  • Ihm, Insung (Department of Computer Science and Engineering, Sogang University) ;
  • Cheong, Won-Sik (Media Research Division, Electronics and Telecommunications Research Institute) ;
  • Seo, Jeongil (Media Research Division, Electronics and Telecommunications Research Institute)
  • Received : 2021.06.18
  • Accepted : 2021.12.17
  • Published : 2022.02.01

Abstract

Virtual view synthesis, which generates novel views with characteristics close to those of actually captured images, is an essential technical component for delivering immersive video with realistic binocular disparity and smooth motion parallax. It is typically performed by warping the given images to the designated viewing position, blending the warped images, and filling the remaining holes. For 6DoF use cases with large viewpoint motion, patch-based warping is preferable to conventional per-pixel methods. In that setting, the quality of the synthesized image depends heavily on how the warped images are blended. Accordingly, we propose a novel blending architecture that exploits the similarity of ray directions and the distribution of depth values. Experimental results show that the proposed method synthesizes views of higher quality than the well-established synthesizers used in MPEG immersive video (MPEG-I) standardization. We also describe a GPU-based implementation that synthesizes and renders views in real time, demonstrating its applicability to immersive video services.
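To make the blending step described above concrete, the following is a minimal NumPy sketch of per-pixel blending of source views that have already been warped to the target viewpoint, where each view's weight combines a ray-direction similarity term with a depth-consistency term. The function name, array layout, and both weighting terms (the Gaussian depth falloff with `depth_sigma` and the cosine power `angle_power`) are illustrative assumptions; the abstract does not specify the paper's actual patch-based warping or weighting formulas.

```python
# Illustrative sketch only: blends warped source views with weights built from
# ray-direction similarity and depth consistency. Not the paper's exact method.
import numpy as np

def blend_warped_views(warped_colors, warped_depths, ray_cosines,
                       depth_sigma=0.05, angle_power=4.0):
    """Blend source views already warped to the target viewpoint.

    warped_colors : (V, H, W, 3) float array, colors warped per source view
    warped_depths : (V, H, W) float array, warped depths (np.inf where holes)
    ray_cosines   : (V, H, W) float array, cosine of the angle between the
                    target ray and the corresponding source ray
    """
    valid = np.isfinite(warped_depths)                       # hole mask per view
    # Depth-consistency term: favor views whose warped depth lies near the
    # front-most (smallest) depth observed at each target pixel.
    front = np.min(np.where(valid, warped_depths, np.inf), axis=0)       # (H, W)
    depth_term = np.exp(-((warped_depths - front) ** 2) / (2 * depth_sigma ** 2))
    # Ray-direction term: favor source rays nearly parallel to the target ray.
    angle_term = np.clip(ray_cosines, 0.0, 1.0) ** angle_power
    weights = np.where(valid, depth_term * angle_term, 0.0)           # (V, H, W)
    norm = np.sum(weights, axis=0, keepdims=True)
    norm = np.where(norm > 0, norm, 1.0)                     # avoid division by 0
    weights = weights / norm
    blended = np.sum(weights[..., None] * np.nan_to_num(warped_colors), axis=0)
    holes = np.sum(valid, axis=0) == 0          # pixels left for hole filling
    return blended, holes
```

In this sketch, pixels not covered by any warped view are reported in the `holes` mask and would be handled by the subsequent hole-filling stage of the pipeline.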

Acknowledgement

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (no. 2017-0-00072, Development of Audio/Video Coding and Light Field Media Fundamental Technologies for Ultra Realistic Tera-media).
