Marine Vessel Target Detection Algorithm Based On Improved YOLOv5

Chen Gao; Jiyong Xu; Ruixia Liu

doi:10.3837/tiis.2024.10.008

KSII Transactions on Internet and Information Systems (TIIS)

Volume 18, Issue 10 / Pages 2966-2983 / 2024 / 1976-7277 (pISSN) / 1976-7277 (eISSN)

Korean Society for Internet Information

Chen Gao (School of Mathematics and Statistics, Qilu University of Technology (Shandong Academy of Sciences))
Jiyong Xu (School of Mathematics and Statistics, Qilu University of Technology (Shandong Academy of Sciences))
Ruixia Liu (School of Mathematics and Statistics, Qilu University of Technology (Shandong Academy of Sciences))

Received: 2024.05.07
Reviewed: 2024.09.23
Published: 2024.10.31

https://doi.org/10.3837/tiis.2024.10.008

Abstract

The marine environment is complex and constantly changing, and ship targets vary widely in size; both factors make specific targets hard to detect. To address this, a marine ship target detection algorithm has been developed based on an improved YOLOv5. First, dynamic snake convolution (DySnakeConv) is integrated into the feature extraction network, and the C3 module is enhanced on that basis. This integration enables dynamic adjustment to the input image size and adaptive fusion of feature sequences, and addresses accuracy and continuity issues during recognition. Second, a novel hybrid encoder (FSI) is devised that uses target scale characteristics to strengthen the extraction of multi-scale information, enabling effective detection and recognition of objects within images. Finally, the Shape-IoU bounding box loss function is adopted to mitigate fixed target frame issues and improve detection accuracy. Experiments were conducted on the Infrared Maritime Ship dataset. The improved model achieved a prediction accuracy of 93.8% and a mean average precision (mAP) of 93.89%, surpassing YOLOv8s by 1.2% and 1.8%, respectively. Recall increased by 2% compared with YOLOv8n, while the parameter count fell from 10,473,392 to 6,549,901. The computational load decreased by 6.3 GFLOPs compared with YOLOv8n, yielding better performance in ocean target detection and recognition.
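
To make the bounding box loss mentioned above more concrete, the following is a minimal sketch of a Shape-IoU style loss for a single pair of boxes, following the general form described in the Shape-IoU preprint (arXiv:2312.17663) as we understand it. It is not the authors' implementation. The function name `shape_iou_loss`, the `(x1, y1, x2, y2)` box layout, and the default values of `scale`, `theta`, and `eps` are assumptions made for illustration; the abstract does not state which settings the paper used.

```python
import math


def shape_iou_loss(pred, gt, scale=0.0, theta=4.0, eps=1e-9):
    """Shape-IoU style loss for one predicted box and one ground-truth box.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1 (assumed layout).
    `scale` is a dataset-dependent hyperparameter; 0.0 here is an assumption.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU of the two boxes.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Weights derived from the ground-truth box shape.
    ww = 2.0 * gw**scale / (gw**scale + gh**scale)
    hh = 2.0 * gh**scale / (gw**scale + gh**scale)

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw**2 + ch**2 + eps

    # Shape-weighted, normalized distance between box centers.
    dx2 = ((gx1 + gx2) - (px1 + px2)) ** 2 / 4.0
    dy2 = ((gy1 + gy2) - (py1 + py2)) ** 2 / 4.0
    distance = (hh * dx2 + ww * dy2) / c2

    # Shape cost: penalizes width and height mismatch.
    omega_w = hh * abs(pw - gw) / max(pw, gw)
    omega_h = ww * abs(ph - gh) / max(ph, gh)
    shape_cost = (1.0 - math.exp(-omega_w)) ** theta + (1.0 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + distance + 0.5 * shape_cost


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 2.0, 2.0)
    print(shape_iou_loss(gt_box, gt_box))                # identical boxes
    print(shape_iou_loss((0.5, 0.0, 2.5, 2.0), gt_box))  # shifted box
    print(shape_iou_loss((3.0, 3.0, 4.0, 4.0), gt_box))  # disjoint box
```

In this sketch the loss is close to zero for identical boxes and grows as the predicted box drifts away from, or differs in shape from, the ground truth. Unlike a plain IoU loss, it still distinguishes between non-overlapping predictions through the distance term.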

Funding

This work was supported by the Natural Science Foundation Innovation and Development Joint Fund Project of Shandong Province under Grant No. ZR2023LZH009 and the Key R&D Project of Shandong Province under Grant No. 2020CXGC010501.

