Acknowledgement
This work is supported by the Natural Science Foundation Innovation and Development Joint Fund Project of Shandong Province under Grant NO. ZR2023LZH009 and Key R & D Project of Shandong Province under Grant NO. 2020CXGC010501.
References
- Aloysius, Neena, and M. Geetha, "A review on deep convolutional neural networks," in Proc. of 2017 international conference on communication and signal processing (ICCSP), pp.588-592, 2017.
- Cai, Zhaowei, and Nuno Vasconcelos, "Cascade R-CNN: Delving Into High Quality Object Detection," in Proc. of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.6154-6162, 2018.
- Lin, Tsung-Yi, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollar, "Focal Loss for Dense Object Detection," in Proc. of 2017 IEEE International Conference on Computer Vision (ICCV), pp.2999-3007, 2017.
- Tian, Zhi et al., "FCOS: Fully Convolutional One-Stage Object Detection," in Proc. of 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp.9626-9635, Oct. 2019.
- Jiao, Licheng et al., "A Survey of Deep Learning-Based Object Detection," IEEE Access, vol.7, pp.128837-128868, 2019. https://doi.org/10.1109/ACCESS.2019.2939201
- Xu, Shangliang et al., "PP-YOLOE: An evolved version of YOLO," arXiv preprint arXiv:2203.16250, 2022.
- Redmon, Joseph, Santosh Divvala, Ross Girshick, Ali Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," in Proc. of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.779-788, 2016.
- Liu, Wei, Dragomir Anguelov, Dumitru Erhan et al., "SSD: Single Shot MultiBox Detector," in Proc. of Computer Vision-ECCV 2016: 14th European Conference, Lecture Notes in Computer Science, vol.9905, Springer, pp.21-37, 2016.
- Redmon, Joseph, and Ali Farhadi, "YOLO9000: Better, Faster, Stronger," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.7263-7271, 2017.
- Redmon, Joseph, and Ali Farhadi, "YOLOv3: An Incremental Improvement," arXiv preprint arXiv:1804.02767, 2018.
- Bochkovskiy, Alexey, Chien-Yao Wang, and Hong-Yuan Mark Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv preprint arXiv:2004.10934, 2020.
- Jocher, Glenn et al., ultralytics/yolov5: v7.0 - YOLOv5 SOTA Realtime Instance Segmentation, Zenodo, 2022.
- Chuyi, Li, Lulu Li, Hongliang Jiang, Kaiheng Weng, Yifei Geng, Liang Li, Zaidan Ke, Qingyuan Li, Meng Cheng, Weiqiang Nie et al., "YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications," arXiv preprint arXiv:2209.02976, 2022.
- Wang, Chien-Yao, Alexey Bochkovskiy, and Hong-Yuan Mark Liao, "YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors," in Proc. of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.7464-7475, 2023.
- Jocher, G., Chaurasia, A., & Qiu, J, Ultralytics YOLO (Version 8.0.0), 2023. [Computer software] https://github.com/ultralytics/ultralytics
- Wang, Chien-Yao, I-Hau Yeh, Hong-Yuan Mark Liao, "YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information," arXiv preprint arXiv:2402.13616, 2024.
- Wang, Ao, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, Guiguang Ding, "YOLOv10: Real-Time End-to-End Object Detection," arXiv preprint arXiv:2405.14458, 2024.
- Girshick, Ross et al., "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," in Proc. of 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.580-587, 2014.
- He, Kaiming et al., "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," IEEE transactions on pattern analysis and machine intelligence, vol.37, no.9, pp.1904-1916, 2015. https://doi.org/10.1109/TPAMI.2015.2389824
- Girshick, Ross, "Fast R-CNN," in Proc. of 2015 IEEE International Conference on Computer Vision (ICCV), pp.1440-1448, 2015.
- Ren, Shaoqing, Kaiming He, Ross Girshick, Jian Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.39, no.6, pp.1137-1149, 2017. https://doi.org/10.1109/TPAMI.2016.2577031
- Vaswani, Ashish, Noam Shazeer, Niki Parmar et al., "Attention Is All You Need," in Proc. of 31st Conference on Neural Information Processing Systems (NIPS 2017), 2017.
- Lin, Tsung-Yi, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollar, "Focal Loss for Dense Object Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.42, no.2, pp.318-327, 2020. https://doi.org/10.1109/TPAMI.2018.2858826
- Carion, Nicolas, Francisco Massa, Gabriel Synnaeve et al., "End-to-End Object Detection with Transformers," in Proc. of 16th European Conference on Computer Vision - ECCV 2020, Lecture Notes in Computer Science, vol.12346, pp.213-229, Springer, 2020.
- Chen, Qiang, Xiaokang Chen, Jian Wang et al., "Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment," in Proc. of 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp.6610-6619, 2023.
- Liu, Shilong, Feng Li, Hao Zhang, Xiao Yang, Xianbiao Qi, Hang Su, Jun Zhu, and Lei Zhang, "DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR," arXiv preprint arXiv:2201.12329, 2022.
- Li, Feng et al., "DN-DETR: Accelerate DETR Training by Introducing Query DeNoising," in Proc. of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.13609-13617, 2022.
- Sun, Peize et al., "Sparse R-CNN: End-to-End Object Detection with Learnable Proposals," in Proc. of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.14449-14458, 2021.
- Huang, Yueming, and Guowu Yuan, "AD-DETR: DETR with asymmetrical relation and decoupled attention in crowded scenes," Mathematical Biosciences and Engineering, vol.20, no.8, pp.14158-14179, 2023. https://doi.org/10.3934/mbe.2023633
- Zhao, Yian et al., "DETRs Beat YOLOs on Real-time Object Detection," arXiv preprint arXiv:2304.08069, 2023.
- Zhang, Hao, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel Ni, and Heung-Yeung Shum, "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection," in Proc. of ICLR 2023, 2023.
- Qi, Yaolei, Yuting He, Xiaoming Qi, Yuan Zhang, Guanyu Yang, "Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation," in Proc. of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.6070-6079, 2023.
- Zhang, Hao, and Shuaijie Zhang, "Shape-IoU: More Accurate Metric considering Bounding Box Shape and Scale," arXiv preprint arXiv:2312.17663, 2023.
- Lin, Tsung-Yi, Piotr Dollar, Ross Girshick et al., "Feature Pyramid Networks for Object Detection," in Proc. of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.936-944, 2017.
- Ghiasi, Golnaz, Tsung-Yi Lin, and Quoc V. Le, "NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection," in Proc. of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.7036-7045, 2019.
- Tan, Mingxing, Ruoming Pang, and Quoc V. Le, "Efficientdet: Scalable and Efficient Object Detection," in Proc. of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.10781-10790, 2020.
- Liu, Songtao, Di Huang, Yunhong Wang, "Learning Spatial Fusion for Single-Shot Object Detection," arXiv preprint arXiv:1911.09516, 2019.
- Zhao, Qijie, Tao Sheng, Yongtao Wang et al., "M2Det: A Single-Shot Object Detector Based on Multi-Level Feature Pyramid Network," in Proc. of AAAI Conference on Artificial Intelligence, vol.33, no.01, pp.9259-9266, 2019.
- Liu, Shu et al., "Path Aggregation Network for Instance Segmentation," in Proc. of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp.8759-8768, 2018.
- Li, Chunliu,Wang Shuigen, Yantai Arrow Optoelectronic Technology Co., Ltd. in Shandong Province, China. http://openai.raytrontek.com/apply/Sea_shipping.html/