Robust 3D Object Detection through Distance based Adaptive Thresholding

  • Eunho Lee (Interdisciplinary Program in Artificial Intelligence, Seoul National University) ;
  • Minwoo Jung (Mechanical Engineering, Seoul National University) ;
  • Jongho Kim (Mechanical Engineering, Seoul National University) ;
  • Kyongsu Yi (Interdisciplinary Program in Artificial Intelligence, Seoul National University) ;
  • Ayoung Kim (Interdisciplinary Program in Artificial Intelligence, Seoul National University)
  • Received : 2023.10.30
  • Accepted : 2023.12.05
  • Published : 2024.02.29

Abstract

Ensuring robust 3D object detection is a core challenge for autonomous driving systems operating in urban environments. To tackle this issue, various 3D representations, including point clouds, voxels, and pillars, have been widely adopted, making use of LiDAR, camera, and radar sensors. These representations have improved 3D object detection performance, but real-world urban scenarios with unexpected situations can still produce numerous false positives, which remains a challenge for robust 3D models. This paper presents a post-processing algorithm that dynamically adjusts object detection thresholds based on the distance from the ego-vehicle. Conventional perception algorithms typically apply a single threshold in post-processing, even though 3D models detect nearby objects well but may perform suboptimally on distant ones. The proposed algorithm addresses this mismatch by applying adaptive thresholds based on the distance from the ego-vehicle, minimizing false negatives and reducing false positives in the 3D model. The results show performance improvements in the 3D model across a range of scenarios, covering not only typical urban road conditions but also adverse weather conditions.
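The idea of distance-dependent thresholding can be illustrated with a minimal Python sketch. The abstract does not specify how the threshold varies with distance, so the linear schedule below, the `Detection` type, the function names, and all parameter values (`near_thr`, `far_thr`, `max_range`) are assumptions made for illustration only, not the authors' method.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detection record in the ego-vehicle frame."""
    x: float      # box center, metres forward of the ego-vehicle
    y: float      # box center, metres to the left of the ego-vehicle
    score: float  # detector confidence in [0, 1]


def adaptive_threshold(distance, near_thr=0.5, far_thr=0.2, max_range=60.0):
    """Assumed schedule: interpolate linearly from near_thr at 0 m
    to far_thr at max_range, and hold far_thr beyond max_range."""
    ratio = min(max(distance / max_range, 0.0), 1.0)
    return near_thr + (far_thr - near_thr) * ratio


def filter_detections(detections, **schedule):
    """Keep detections whose score meets the threshold for their range."""
    kept = []
    for det in detections:
        distance = math.hypot(det.x, det.y)  # planar distance to ego-vehicle
        if det.score >= adaptive_threshold(distance, **schedule):
            kept.append(det)
    return kept


if __name__ == "__main__":
    detections = [
        Detection(x=5.0, y=0.0, score=0.4),   # near, moderate score
        Detection(x=50.0, y=0.0, score=0.4),  # far, moderate score
        Detection(x=50.0, y=0.0, score=0.1),  # far, low score
    ]
    for det in filter_detections(detections):
        print(det)
```

Under these assumed values, a single fixed threshold of 0.5 would discard all three example detections, whereas the distance-dependent schedule keeps the moderately confident distant one while still rejecting the low-score one. In practice the schedule would likely be tuned per class and per detector on validation data.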

Acknowledgement

This work was supported by the Technology Innovation Program (or Industrial Strategic Technology Development Program - Mobility and Connectivity Platform for Digital Transformation Acceleration in Unmanned Delivery) (20024355, Development of autonomous driving connectivity technology based on sensor-infrastructure cooperation) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea).
