Acknowledgement
This research was supported by the DNA+Drone Technology Development Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (No. NRF-2020M3C1C2A01080819).