5D Light Field Synthesis from a Monocular Video

  • Bae, Kyuho (Inha University, Department of Information and Communication Engineering) ;
  • Ivan, Andre (Inha University, Department of Information and Communication Engineering) ;
  • Park, In Kyu (Inha University, Department of Information and Communication Engineering)
  • Received : 2019.07.28
  • Accepted : 2019.09.10
  • Published : 2019.09.30

Abstract

Commercially available light field cameras either capture only still images or are very expensive, which makes it difficult to acquire 5D light field video. To address this problem, we propose a deep learning based method for synthesizing light field video from a monocular video. Because light field video training data are hard to obtain, we use UnrealCV to acquire synthetic light field data through realistic rendering of 3D graphics scenes and use these data for training. The proposed deep learning framework takes a monocular video as input and synthesizes a light field video with $9{\times}9$ sub-aperture images (SAIs) per frame. The proposed network consists of two parts: a network that estimates the appearance flow from the input image after conversion to a luminance image, and a network that estimates the optical flow between adjacent light field video frames obtained from the appearance flow.
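The view-synthesis step described in the abstract can be illustrated with a minimal sketch. It is not the authors' network. It assumes that an appearance flow is a per-pixel sampling offset into the input view, and it replaces the learned flow with a hypothetical placeholder (a per-pixel disparity scaled by the angular offset of each view). The function names, the Rec. 601 luminance weights, and the disparity-based flow are all assumptions made for illustration only.

```python
def rgb_to_luminance(rgb):
    """Convert an H x W image of (r, g, b) tuples to luminance.
    The Rec. 601 weights are an assumption; the source does not name a formula."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def bilinear_sample(img, y, x):
    """Sample img at a fractional (y, x) position, clamping to the image border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_by_flow(img, flow):
    """Backward warping: target pixel (y, x) is read from the source image
    at (y + dy, x + dx), where flow[y][x] = (dy, dx)."""
    h, w = len(img), len(img[0])
    return [[bilinear_sample(img, y + flow[y][x][0], x + flow[y][x][1])
             for x in range(w)] for y in range(h)]


def synthesize_light_field(center, disparity, angular=9):
    """Build an angular x angular grid of sub-aperture images from one view.
    Hypothetical stand-in for a learned appearance flow: the flow of view
    (v, u) is the disparity times that view's offset from the central view."""
    c = angular // 2
    views = {}
    for v in range(angular):
        for u in range(angular):
            flow = [[((v - c) * d, (u - c) * d) for d in row] for row in disparity]
            views[(v, u)] = warp_by_flow(center, flow)
    return views


# Usage: a tiny 3 x 4 grey ramp with a constant disparity of half a pixel.
rgb = [[(float(x), float(x), float(x)) for x in range(4)] for _ in range(3)]
luminance = rgb_to_luminance(rgb)
disparity = [[0.5] * 4 for _ in range(3)]
light_field = synthesize_light_field(luminance, disparity)
print(len(light_field))  # 81 views, one per position in the 9 x 9 angular grid
```

Applying the same procedure to every frame of a monocular video would give a sequence of $9{\times}9$ view grids; the temporal consistency that the second network addresses through optical flow between adjacent frames is not modelled in this sketch.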

Project Information

Supervising institution of the research project : Samsung Electronics
