
Predicting Unseen Object Pose with an Adaptive Depth Estimator


  • Received : 2022.07.21
  • Accepted : 2022.09.18
  • Published : 2022.12.31

Abstract

Accurate pose prediction of objects in 3D space is an important visual recognition technique widely used in many applications, such as scene understanding in indoor and outdoor environments, robotic object manipulation, autonomous driving, and augmented reality. Most previous works on object pose estimation have the limitation that they require an exact 3D CAD model of each target object. Unlike such previous works, this paper proposes a novel neural network model that can predict the poses of unknown objects from their RGB color images alone, without the corresponding 3D CAD models. By using an adaptive depth estimator, AdaBins, the proposed model can effectively estimate, on its own, the depth map of each object required for unknown object pose prediction. We evaluate the usefulness and performance of the proposed model through various experiments using benchmark datasets.



Acknowledgement

This research was supported by the ICT R&D program of the Institute of Information & Communications Technology Planning & Evaluation (IITP) (No. 2020-0-00096, Development of task planning technology for cloud-connected individual robots and robot groups). It was also supported by the Korea Institute for Advancement of Technology (KIAT) with funding from the Korean government (Ministry of Trade, Industry and Energy) (P0008691, HRD Program for Industrial Innovation, 2022).

References

  1. Z. He, W. Feng, X. Zhao, and Y. Lv, "6D pose prediction of objects: Recent technologies and challenges," Applied Sciences, Vol.11, No.1, p.228, 2021.
  2. K. Park, A. Mousavian, Y. Xiang, and D. Fox, "LatentFusion: End-to-end differentiable reconstruction and rendering for unseen object pose prediction," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  3. H. Wang, S. Sridhar, J. Huang, J. Valentin, S. Song, and L. J. Guibas, "Normalized object coordinate space for category-level 6D object pose and size estimation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  4. M. Tian, M. H. Ang Jr, and G. H. Lee, "Shape prior deformation for categorical 6D object pose and size estimation," Proceedings of the European Conference on Computer Vision (ECCV), 2020.
  5. C. Lee, D. Shim, and H. Kim, "Deep learning based monocular depth estimation: Survey," Journal of Positioning, Navigation, and Timing, Vol.10, No.4, pp.297-305, 2021.
  6. S. F. Bhat, I. Alhashim, and P. Wonka, "AdaBins: Depth estimation using adaptive bins," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
  7. B. Tekin, S. N. Sinha, and P. Fua, "Real-time seamless single shot 6D object pose prediction," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  8. J. Tremblay, T. To, B. Sundaralingam, Y. Xiang, D. Fox, and S. Birchfield, "Deep object pose estimation for semantic robotic grasping of household objects," Proceedings of the Conference on Robot Learning (CoRL), 2018.
  9. Y. Li, G. Wang, X. Ji, Y. Xiang, and D. Fox, "DeepIM: Deep iterative matching for 6D pose prediction," Proceedings of the European Conference on Computer Vision (ECCV), 2018.
  10. F. Manhardt, W. Kehl, N. Navab, and F. Tombari, "Deep model-based 6D pose refinement in RGB," Proceedings of the European Conference on Computer Vision (ECCV), 2018.
  11. C. Li, J. Bai, and G. D. Hager, "A unified framework for multi-view multi-class object pose estimation," Proceedings of the European Conference on Computer Vision (ECCV), 2018.
  12. C. Wang, D. Xu, Y. Zhu, R. Martin-Martin, C. Lu, L. Fei-Fei, and S. Savarese, "DenseFusion: 6D object pose estimation by iterative dense fusion," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  13. E. Sucar, K. Wada, and A. Davison, "NodeSLAM: Neural object descriptors for multi-view shape reconstruction," Proceedings of the 2020 International Conference on 3D Vision (3DV), 2020.
  14. X. Chen, Z. Dong, J. Song, A. Geiger, and O. Hilliges, "Category level object pose estimation via neural analysis-by-synthesis," Proceedings of the European Conference on Computer Vision (ECCV), 2020.
  15. D. Eigen, C. Puhrsch, and R. Fergus, "Depth map prediction from a single image using a multi-scale deep network," Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2014.
  16. I. Laina, C. Rupprecht, V. Belagiannis, F. Tombari, and N. Navab, "Deeper depth prediction with fully convolutional residual networks," Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), 2016.
  17. I. Alhashim and P. Wonka, "High quality monocular depth estimation via transfer learning," arXiv preprint arXiv:1812.11941, 2018.
  18. L. Huynh, P. Nguyen-Ha, J. Matas, E. Rahtu, and J. Heikkila, "Guiding monocular depth estimation using depth-attention volume," Proceedings of the European Conference on Computer Vision (ECCV), 2020.
  19. J. Lee, M. Han, D. Ko, and I. Suh, "From big to small: Multi-scale local planar guidance for monocular depth estimation," arXiv preprint arXiv:1907.10326, 2019.
  20. T. Zhou, M. Brown, N. Snavely, and D. Lowe, "Unsupervised learning of depth and ego-motion from video," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  21. Z. Yin and J. Shi, "GeoNet: Unsupervised learning of dense depth, optical flow and camera pose," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  22. B. Li, Y. Huang, Z. Liu, D. Zou, and W. Yu, "StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2021.
  23. W. Yin, Y. Liu, and C. Shen, "Virtual Normal: Enforcing geometric constraints for accurate and robust depth prediction," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.
  24. J. Bian, H. Zhan, N. Wang, T. Chin, C. Shen, and I. Reid, "Auto-Rectify network for unsupervised indoor depth estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.