
Benchmark for Deep Learning based Visual Odometry and Monocular Depth Estimation

  • Choi, Hyukdoo (Department of Electronics and Information Engineering)
  • Received : 2018.12.14
  • Accepted : 2019.01.25
  • Published : 2019.05.31

Abstract

This paper presents a new benchmark system for visual odometry (VO) and monocular depth estimation (MDE). As deep learning has become a key technology in computer vision, many researchers are trying to apply it to VO and MDE. Until just a couple of years ago, the two problems were studied independently in a supervised manner, but they are now coupled and trained together in an unsupervised way. However, before designing sophisticated models and losses, researchers must customize datasets for training and testing; after training, a new model must also be compared with existing models, which is a considerable burden. The benchmark provides a ready-to-use input dataset for VO and MDE research in the 'tfrecords' format, along with an output dataset that includes model checkpoints and inference results of existing models. It also provides various tools for data formatting, training, and evaluation. In the experiments, the existing models were evaluated to verify the performances reported in the corresponding papers, and we found that the evaluation results fall short of the reported performances.
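Since the abstract names the 'tfrecords' input format without detailing its schema, below is a minimal sketch of how such a dataset could be consumed with TensorFlow's tf.data API. The feature keys ('image', 'pose', 'intrinsic'), their dtypes, and the file name are assumptions made for illustration; the benchmark's own formatting tools define the actual record layout.

```python
# Minimal sketch of loading tfrecords input data with tf.data.
# NOTE: the feature names and shapes below are hypothetical -- the
# actual schema is defined by the benchmark's formatting tools.
import tensorflow as tf

# Hypothetical schema: serialized image bytes, camera intrinsics,
# and ground-truth poses stored as raw byte strings.
feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "pose": tf.io.FixedLenFeature([], tf.string),
    "intrinsic": tf.io.FixedLenFeature([], tf.string),
}

def parse_example(serialized):
    # Decode one tf.train.Example into dense tensors.
    features = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_raw(features["image"], tf.uint8)
    pose = tf.io.decode_raw(features["pose"], tf.float32)
    intrinsic = tf.io.decode_raw(features["intrinsic"], tf.float32)
    return image, pose, intrinsic

# Stream training examples from a tfrecords file (path is illustrative).
dataset = (tf.data.TFRecordDataset("kitti_train.tfrecords")
           .map(parse_example)
           .batch(4))
```

If the images were serialized as encoded PNG bytes rather than raw pixels, tf.image.decode_png would replace tf.io.decode_raw; which applies depends on how the formatting tools wrote the records.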

Keywords

  19. J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, "A benchmark for the evaluation of RGB-D SLAM systems," 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, pp. 573-580, 2012.