Improving Detection Range for Short Baseline Stereo Cameras Using Convolutional Neural Networks and Keypoint Matching

  • Byungjae Park (School of Mechanical Engineering, Korea University of Technology and Education)
  • Received : 2024.03.07
  • Accepted : 2024.03.21
  • Published : 2024.03.31

Abstract

This study proposes a method to overcome the limited detection range of short-baseline stereo cameras (SBSCs). The method consists of two steps: (1) predicting an unscaled initial depth using monocular depth estimation (MDE) and (2) multiplying the unscaled initial depth by a scale factor. The scale factor is computed by triangulating sparse visual keypoints extracted from the left and right images of the SBSC. The method can use any pre-trained MDE model without additional training or data collection, which keeps it efficient under the computational constraints of small platforms. The method was evaluated on an open dataset by comparing it with conventional stereo-based depth estimation methods.
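The two steps described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the abstract does not state how the per-keypoint measurements are combined into one scale factor, so the median of depth ratios used here is an assumption, as is the rectified pinhole stereo model (depth Z = f * B / d). All function and parameter names (`triangulate_depth`, `estimate_scale`, `rescale`, `min_disparity_px`) are hypothetical, and keypoint extraction and matching are assumed to have been done already.

```python
from statistics import median


def triangulate_depth(focal_px, baseline_m, disparity_px):
    """Metric depth of one rectified stereo match: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def estimate_scale(unscaled_depth, matches, focal_px, baseline_m,
                   min_disparity_px=1.0):
    """Estimate a single scale factor for an unscaled MDE depth map.

    unscaled_depth: 2D list of MDE depths, indexed [row][col] (left image).
    matches: list of (row, col, disparity_px) for matched keypoints.
    Assumption: the scale is the median of Z_stereo / Z_mde over keypoints.
    """
    ratios = []
    for row, col, disparity_px in matches:
        # Very small disparities give unreliable depth on a short baseline.
        if disparity_px < min_disparity_px:
            continue
        z_mde = unscaled_depth[row][col]
        if z_mde <= 0:
            continue
        z_stereo = triangulate_depth(focal_px, baseline_m, disparity_px)
        ratios.append(z_stereo / z_mde)
    if not ratios:
        raise ValueError("no usable keypoint matches")
    return median(ratios)


def rescale(unscaled_depth, scale):
    """Apply the scale factor to every pixel of the unscaled depth map."""
    return [[scale * z for z in row] for row in unscaled_depth]


if __name__ == "__main__":
    # Toy 2x2 depth map and three matches; the values are illustrative only.
    depth = [[1.0, 2.0], [4.0, 8.0]]
    matches = [(0, 0, 10.0), (0, 1, 5.0), (1, 0, 2.5)]
    scale = estimate_scale(depth, matches, focal_px=400.0, baseline_m=0.0625)
    print(scale, rescale(depth, scale))
```

In this sketch the sparse keypoints only need to be reliable at close range, where the short baseline still yields usable disparity; the dense depth at longer range comes from the rescaled MDE prediction. The median is chosen here only because it tolerates a few bad matches.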

Acknowledgement

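This research was supported by the Korea Science and Engineering Foundation with funding from the Korean government (Ministry of Science and ICT) in 2023 (No. 2021R1F1A1057949). This work is also a result of the Regional Innovation Strategy project based on local government-university cooperation, supported by the National Research Foundation of Korea with funding from the Ministry of Education in 2023 (2021RIS-004).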
