Segmentation-Based Depth Map Adjustment for Improved Grasping Pose Detection

  • Received : 2023.11.28
  • Accepted : 2024.01.12
  • Published : 2024.02.29

Abstract

Robotic grasping in unstructured environments poses a significant challenge, demanding precise estimation of grasping positions for diverse and unknown objects. The Generative Grasping Convolutional Neural Network (GG-CNN) can estimate the position and orientation at which a robot gripper can grasp an unknown object from a three-dimensional depth map. Since GG-CNN uses only a depth map as input, the precision of the depth map is the most critical factor affecting the result. To address the challenge of depth map precision, we integrate the Segment Anything Model, which is renowned for its robust zero-shot performance across a variety of segmentation tasks. We adjust the depth values corresponding to the segmented regions in the depth map, which is aligned through extrinsic calibration. The proposed method was validated on the Cornell dataset and the SurgicalKit dataset. Quantitative comparison against existing methods showed a 49.8% improvement on the dataset containing surgical instruments. The results highlight the practical importance of our approach, especially in scenarios involving thin and metallic objects.
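
As a rough sketch of the adjustment step described above, the snippet below fills missing depth readings inside each segmented region with the median of the valid depth in that region, so that holes caused by thin or reflective surfaces do not mislead GG-CNN. The function name, the median-filling rule, and the boolean-mask input format (for example, the 'segmentation' field returned by SAM's automatic mask generator) are illustrative assumptions rather than the exact procedure used in the paper.

    import numpy as np

    def adjust_depth_with_masks(depth, masks, invalid_value=0.0):
        """Fill unreliable depth pixels inside each segmented region.

        depth : (H, W) float array from an RGB-D camera, already aligned to
                the color image via extrinsic calibration.
        masks : iterable of (H, W) boolean arrays, one per segmented object
                (e.g., produced by a SAM-style segmenter).
        Pixels equal to invalid_value (missing depth, common on thin or
        metallic objects) are replaced by the median of the valid depth
        inside the same mask; this filling rule is an assumption made for
        illustration.
        """
        adjusted = depth.copy()
        for mask in masks:
            region = adjusted[mask]
            valid = region[region != invalid_value]
            if valid.size == 0:
                continue  # no usable depth in this segment; leave it as-is
            region[region == invalid_value] = np.median(valid)
            adjusted[mask] = region
        return adjusted

    # Toy example: two missing pixels inside a 2x2 segment are filled with
    # the median (0.515 m) of the segment's valid depth before the map is
    # handed to GG-CNN.
    depth = np.array([[0.00, 0.52, 0.10, 0.10],
                      [0.51, 0.00, 0.10, 0.10],
                      [0.10, 0.10, 0.10, 0.10],
                      [0.10, 0.10, 0.10, 0.10]], dtype=np.float32)
    mask = np.zeros((4, 4), dtype=bool)
    mask[:2, :2] = True
    print(adjust_depth_with_masks(depth, [mask]))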

Acknowledgement

This research was supported by the Ministry of Trade, Industry, and Energy (MOTIE), Korea, under the Fostering Global Talents for Innovative Growth Program (P0008745), supervised by the Korea Institute for Advancement of Technology (KIAT).

References

  1. Y. Xiang, T. Schmidt, V. Narayanan, and D. Fox, "PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes," Robotics: Science and Systems, 2018, DOI: 10.15607/RSS.2018.XIV.019.
  2. B. Tekin, S. N. Sinha, and P. Fua, "Real-time seamless single shot 6D object pose prediction," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, pp. 292-301, 2018, DOI: 10.1109/CVPR.2018.00038.
  3. S. Peng, Y. Liu, Q. Huang, H. Bao, and X. Zhou, "PVNet: Pixel-wise voting network for 6DoF pose estimation," 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, pp. 4556-4565, 2019, DOI: 10.1109/CVPR.2019.00469.
  4. H. Han, W. Wang, X. Han, and X. Yang, "6-DoF grasp pose estimation based on instance reconstruction," Intelligent Service Robotics, Nov., 2023, DOI: 10.1007/s11370-023-00489-z. 
  5. J. Mahler, J. Liang, S. Niyaz, M. Laskey, R. Doan, X. Liu, J. Ojea, and K. Goldberg, "Dex-net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics," Robotics: Science and Systems, 2017, DOI: 10.15607/RSS.2017.XIII.058. 
  6. D. Morrison, P. Corke, and J. Leitner, "Closing the loop for robotic grasping: A real-time, generative grasp synthesis approach," Robotics: Science and Systems, 2018, DOI: 10.15607/RSS.2018.XIV.021. 
  7. D. Morrison, P. Corke, and J. Leitner, "Learning robust, real-time, reactive robotic grasping," The International Journal of Robotics Research, vol. 39, no. 2-3, pp. 183-201, Jun., 2019, DOI: 10.1177/0278364919859066.
  8. S. Dodge and L. Karam, "Understanding how image quality affects deep neural networks," 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX), Lisbon, Portugal, pp. 1-6, 2016, DOI: 10.1109/QoMEX.2016.7498955.
  9. J. Park, H. Kim, Y. Tai, M. Brown, and I. Kweon, "High quality depth map upsampling for 3D-TOF cameras," 2011 International Conference on Computer Vision, Barcelona, Spain, pp. 1623-1630, 2011, DOI: 10.1109/ICCV.2011.6126423. 
  10. S. Wang, X. Jiang, J. Zhao, X. Wang, W. Zhou, and Y. Liu, "Efficient fully convolution neural network for generating pixel wise robotic grasps with high resolution images," 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO), Dali, China, pp. 474-480, 2019, DOI: 10.1109/ROBIO49542.2019.8961711.
  11. I. Lenz, H. Lee, and A. Saxena, "Deep learning for detecting robotic grasps," The International Journal of Robotics Research, vol. 34, no. 4-5, pp. 705-724, Mar., 2015, DOI: 10.1177/0278364914549607.
  12. L. Pinto and A. Gupta, "Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours," 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, pp. 3406-3413, 2016, DOI: 10.1109/ICRA.2016.7487517.
  13. J. Redmon and A. Angelova, "Real-Time Grasp Detection Using Convolutional Neural Networks," 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, pp. 1316-1322, 2015, DOI: 10.1109/ICRA.2015.7139361. 
  14. Z. Wang, Z. Li, B. Wang, and H. Liu, "Robot grasp detection using multimodal deep convolutional neural networks," Advances in Mechanical Engineering, vol. 8, Sept., 2016, DOI: 10.1177/1687814016668077.
  15. F. Yu and V. Koltun, "Multi-scale context aggregation by dilated convolutions," arXiv:1511.07122, 2016, DOI: 10.48550/arXiv.1511.07122.
  16. A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su, J. Xiao, L. Yi, and F. Yu, "ShapeNet: An information-rich 3D model repository," arXiv:1512.03012, 2015, DOI: 10.48550/arXiv.1512.03012.
  17. A. Depierre, E. Dellandrea, and L. Chen, "Jacquard: A large scale dataset for robotic grasp detection," 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, pp. 3511-3516, 2018, DOI: 10.1109/IROS.2018.8593950.
  18. K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, pp. 2980-2988, 2017, DOI: 10.1109/ICCV.2017.322.
  19. A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. Berg, W. Lo, P. Dollar, and R. B. Girshick, "Segment anything," arXiv:2304.02643, Apr., 2023, DOI: 10.48550/arXiv.2304.02643.
  20. J. Kim, O. Nocentini, M. Bashir, and F. Cavallo, "Grasping Complex-Shaped and Thin Objects Using a Generative Grasping Convolutional Neural Network," Robotics, vol. 12, no. 2, Mar., 2023, DOI: 10.3390/robotics12020041.