http://dx.doi.org/10.11003/JPNT.2021.10.4.297

Deep Learning Based Monocular Depth Estimation: Survey  

Lee, Chungkeun (Department of Aerospace Engineering, Seoul National University)
Shim, Dongseok (Department of Aerospace Engineering, Seoul National University)
Kim, H. Jin (Department of Aerospace Engineering, Seoul National University)
Publication Information
Journal of Positioning, Navigation, and Timing / v.10, no.4, 2021, pp. 297-305
Abstract
Monocular depth estimation helps a robot understand its surrounding environment in 3D. In particular, deep-learning-based monocular depth estimation has been widely researched because it may overcome the scale ambiguity problem, which is a main issue in classical methods. These learning-based methods can be divided into three main categories: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning trains the network on dense ground-truth depth information, unsupervised learning trains it on image sequences, and semi-supervised learning trains it on stereo images and sparse ground-truth depth. We describe the basics of each method and then explain recent research efforts to enhance depth estimation performance.
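As a rough illustration of how the three training signals named in the abstract differ, the following minimal sketch computes one loss value for each. It is not taken from the survey: the function names, the use of a mean absolute error, the `weight` parameter, and the assumption that the second view has already been warped into the target frame using the predicted depth are all illustrative choices.

```python
from typing import Sequence

Image = Sequence[Sequence[float]]  # hypothetical 2D grid of depth or intensity values
Mask = Sequence[Sequence[bool]]    # True where a depth measurement exists


def _flatten(grid):
    """Return the values of a 2D grid as a flat list."""
    return [value for row in grid for value in row]


def dense_supervised_loss(pred: Image, gt: Image) -> float:
    """Supervised signal: mean absolute error against dense ground-truth depth."""
    p, g = _flatten(pred), _flatten(gt)
    return sum(abs(a - b) for a, b in zip(p, g)) / len(p)


def sparse_supervised_loss(pred: Image, gt: Image, valid: Mask) -> float:
    """Mean absolute error over only the pixels that have a depth measurement."""
    pairs = [
        (a, b)
        for a, b, ok in zip(_flatten(pred), _flatten(gt), _flatten(valid))
        if ok
    ]
    if not pairs:
        return 0.0
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def photometric_loss(target: Image, warped: Image) -> float:
    """Unsupervised signal: mean absolute intensity difference between the
    target image and another view (next frame or stereo pair) that is assumed
    to have been warped into the target frame with the predicted depth."""
    t, w = _flatten(target), _flatten(warped)
    return sum(abs(a - b) for a, b in zip(t, w)) / len(t)


def semi_supervised_loss(pred: Image, sparse_gt: Image, valid: Mask,
                         target: Image, warped: Image,
                         weight: float = 1.0) -> float:
    """Semi-supervised signal: sparse depth error plus a weighted photometric
    term. The weighting scheme is an assumption made for this sketch."""
    return (sparse_supervised_loss(pred, sparse_gt, valid)
            + weight * photometric_loss(target, warped))
```

In practice the warping step, which this sketch takes as given, is where the predicted depth and camera geometry enter the unsupervised and semi-supervised losses.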
Keywords
deep learning; depth estimation