http://dx.doi.org/10.15701/kcgs.2020.26.3.61

SINGLE PANORAMA DEPTH ESTIMATION USING DOMAIN ADAPTATION  

Lee, Jonghyeop (POSTECH)
Son, Hyeongseok (POSTECH)
Lee, Junyong (POSTECH)
Yoon, Haeun (POSTECH)
Cho, Sunghyun (POSTECH)
Lee, Seungyong (POSTECH)
Abstract
In this paper, we propose a deep learning framework for predicting the depth map of a 360° panorama image. Previous works train their networks on synthetic 360° panorama datasets because realistic datasets are lacking. However, because the training data are synthetic, the features these networks extract differ from those of real 360° panorama images, which leads previous methods to fail at depth prediction on real 360° panorama images. To address this gap, we use domain adaptation to learn features shared by real and synthetic panorama images. Experimental results show that our approach greatly improves the accuracy of depth estimation on real panorama images while achieving state-of-the-art performance on synthetic images.
Keywords
depth estimation; deep learning; domain adaptation; spherical panorama; single image