Efficient Self-supervised Learning Techniques for Lightweight Depth Completion

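  • 박재혁 (Autonomous Driving Intelligence Research Section, Electronics and Telecommunications Research Institute) ;
  • 민경욱 (Autonomous Driving Intelligence Research Section, Electronics and Telecommunications Research Institute) ;
  • 최정단 (Intelligent Robotics Research Division, Electronics and Telecommunications Research Institute)
  • Received : 2021.10.27
  • Reviewed : 2021.12.10
  • Published : 2021.12.31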

Abstract

In an autonomous driving system equipped with a camera and lidar, depth completion enables dense depth estimation. With self-supervised learning, the depth completion network can be trained on driving data that has no ground-truth depth. In a real driving stack the output of depth completion feeds other algorithms, so very low latency is required. Rather than increasing accuracy by making the network more elaborate, as previous studies do, this paper uses a depth completion network built to maximize inference speed: a U-Net-style network designed with parallelism in mind, using RegNet encoders optimized for GPU computation. To recover accuracy, the paper presents several techniques applied during self-supervised training. These techniques increase robustness to unreliable lidar inputs and, using semantic information extracted in advance, improve depth quality in edge and sky regions. Experiments confirm that the model, although very lightweight (2.42 ms at 1280x480), is robust to noise and achieves accuracy comparable to recent studies.
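As a rough illustration of the kind of training signal that depth completion without ground truth can build on, the sketch below combines two terms that are common in this literature: an L1 penalty on pixels that have a lidar return, and an edge-aware smoothness penalty that relaxes where the image has strong gradients. This is not the loss used in the paper. The function names, the `lidar > 0` validity convention, and the weight `w_smooth=0.1` are all assumptions made for illustration, and the photometric reprojection term between neighboring frames, which self-supervised training normally relies on, is omitted for brevity.

```python
import numpy as np


def sparse_depth_loss(pred, lidar):
    """Mean absolute error on pixels that have a lidar return (lidar > 0)."""
    valid = lidar > 0  # assumed convention: 0 marks "no measurement"
    if not valid.any():
        return 0.0
    return float(np.abs(pred[valid] - lidar[valid]).mean())


def edge_aware_smoothness(pred, image):
    """Penalize depth gradients, down-weighted where the image has edges."""
    # Horizontal and vertical depth gradients.
    d_dx = np.abs(pred[:, 1:] - pred[:, :-1])
    d_dy = np.abs(pred[1:, :] - pred[:-1, :])
    # Image gradients, averaged over the color channels.
    i_dx = np.abs(image[:, 1:] - image[:, :-1]).mean(axis=-1)
    i_dy = np.abs(image[1:, :] - image[:-1, :]).mean(axis=-1)
    return float((d_dx * np.exp(-i_dx)).mean() + (d_dy * np.exp(-i_dy)).mean())


def total_loss(pred, lidar, image, w_smooth=0.1):
    """Weighted sum of the two terms; w_smooth is an illustrative value."""
    return sparse_depth_loss(pred, lidar) + w_smooth * edge_aware_smoothness(pred, image)
```

Here `pred` and `lidar` are H x W depth maps and `image` is an H x W x 3 array with values in [0, 1]. A depth discontinuity that coincides with an image edge is penalized less than the same discontinuity on a textureless region, which is why semantic or edge cues matter for boundary quality.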

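Funding Information

This research was supported by the Ministry of Land, Infrastructure and Transport / Korea Agency for Infrastructure Technology Advancement (grant number 21AMDP-C161756-01).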
