Trajectory monitoring of inland waterway vessels across multiple cameras based on improved one-stage CNN and inverse projection

  • Yitian Han (National and Local Joint Engineering Research Center for Intelligent Construction and Maintenance, Southeast University) ;
  • Dongming Feng (National and Local Joint Engineering Research Center for Intelligent Construction and Maintenance, Southeast University) ;
  • Ye Xia (School of Civil Engineering, Tongji University) ;
  • Rong Lin (National and Local Joint Engineering Research Center for Intelligent Construction and Maintenance, Southeast University) ;
  • Chan Ghee Koh (Department of Civil and Environmental Engineering, National University of Singapore) ;
  • Gang Wu (National and Local Joint Engineering Research Center for Intelligent Construction and Maintenance, Southeast University)
  • Received : 2023.10.13
  • Accepted : 2024.09.30
  • Published : 2024.09.25

Abstract

Accidents involving inland waterway vessels have raised concerns regarding the monitoring of their navigation tracks. The economical and convenient deployment of video surveillance equipment, combined with computer vision techniques, offers an effective solution for tracking vessel trajectories in narrow inland waterways. However, field applications of video surveillance systems face the challenges of small object detection and the limited field of view of individual cameras. This paper investigates the feasibility of using multiple monocular cameras to monitor long-distance inland vessel trajectories. The one-stage CNN model YOLOv5 is enhanced for small object detection by incorporating a generalized intersection over union loss and a multi-scale fusion attention mechanism. The ByteTrack algorithm is employed to track each detected vessel, ensuring clear distinction in multiple-vessel scenarios. An inverse projection formula is derived and applied to the tracking results from the monocular camera videos to estimate vessel world coordinates under potential water level changes during long-term monitoring. Experimental results demonstrate the effectiveness of the improved detection and tracking methods, with consistent trajectory matching for the same vessel across multiple cameras. After time-aligned merging of the per-camera tracks, a Savitzky-Golay filter mitigates jitter in the final trajectory, yielding a better fit to the dispersed trajectory points.
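As an illustration of the geometry and smoothing steps summarized above, the sketch below back-projects a tracked image point onto the horizontal water plane using a standard pinhole camera model and then smooths the recovered planar trajectory with a Savitzky-Golay filter (scipy.signal.savgol_filter). It is a minimal, generic sketch rather than the paper's exact inverse projection derivation; the intrinsic matrix K, rotation R, camera height, water level z_water, and the tracked pixel list are placeholder assumptions standing in for an actual camera calibration and the ByteTrack output.

```python
import numpy as np
from scipy.signal import savgol_filter


def back_project_to_water_plane(uv, K, R, t, z_water):
    """Back-project a pixel (u, v) onto the horizontal plane Z = z_water.

    Generic pinhole model: a world point X maps to the image as x ~ K (R X + t),
    so the viewing ray through the pixel is intersected with the water plane.
    """
    cam_center = -R.T @ t                                   # camera center in the world frame
    ray_dir = R.T @ np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    s = (z_water - cam_center[2]) / ray_dir[2]              # ray scale that reaches Z = z_water
    return cam_center + s * ray_dir                         # world point (X, Y, z_water)


def smooth_trajectory(points_xy, window=11, polyorder=3):
    """Column-wise Savitzky-Golay smoothing of an (N, 2) planar trajectory."""
    return savgol_filter(points_xy, window_length=window, polyorder=polyorder, axis=0)


if __name__ == "__main__":
    # Placeholder calibration: a 1080p camera 10 m above the water, pitched 15 deg downward.
    # In practice K, R, t come from camera calibration and z_water from the measured water level.
    K = np.array([[1500.0, 0.0, 960.0],
                  [0.0, 1500.0, 540.0],
                  [0.0, 0.0, 1.0]])
    theta = np.deg2rad(15.0)
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0, -np.sin(theta), -np.cos(theta)],
                  [0.0, np.cos(theta), -np.sin(theta)]])    # world -> camera rotation
    C = np.array([0.0, 0.0, 10.0])                          # camera center in the world frame
    t = -R @ C
    z_water = 0.0

    # Hypothetical tracked pixels (e.g., bounding-box bottom centers over consecutive frames).
    rng = np.random.default_rng(0)
    pixels = [(900.0 + 4.0 * i + rng.normal(scale=1.5), 700.0 + rng.normal(scale=1.5))
              for i in range(40)]
    world_xy = np.array([back_project_to_water_plane(p, K, R, t, z_water)[:2] for p in pixels])
    print(smooth_trajectory(world_xy)[:3])                  # first few smoothed (X, Y) points
```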

Acknowledgement

The authors would like to acknowledge the committee of the 3rd International Competition for Structural Health Monitoring (IC-SHM 2022) for organizing the competition and sharing the data. This research was funded by the National Natural Science Foundation of China (52127813) and the Fundamental Research Funds for the Central Universities (2242023K5006).

References

  1. Bochkovskiy, A., Wang, C.-Y. and Liao, H.-Y.M. (2020), "YOLOv4: Optimal speed and accuracy of object detection", arXiv, 2004.10934. https://doi.org/10.48550/arXiv.2004.10934
  2. Chabot, F., Chaouch, M., Rabarisoa, J., Teuliere, C. and Chateau, T. (2017), "Deep MANTA: A coarse-to-fine many-task network for joint 2D and 3D vehicle analysis from monocular image", Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1827-1836. https://doi.org/10.1109/CVPR.2017.198
  3. Clausse, A., Benslimane, S. and de La Fortelle, A. (2019), "Large-scale extraction of accurate vehicle trajectories for driving behavior learning", Proceedings of the 30th IEEE Intelligent Vehicles Symposium (IV19), pp. 2391-2396. https://doi.org/10.1109/IVS.2019.8814095
  4. Ge, Z., Liu, S., Wang, F., Li, Z. and Sun, J. (2021), "YOLOX: Exceeding YOLO series in 2021", arXiv, 2107.08430. https://doi.org/10.48550/arXiv.2107.08430
  5. Gevorgyan, Z. (2022), "SIoU loss: More powerful learning for bounding box regression", arXiv, 2205.12740. https://doi.org/10.48550/arXiv.2205.12740
  6. Girshick, R. (2015), "Fast R-CNN", Proceedings of 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1440-1448. https://doi.org/10.1109/ICCV.2015.169
  7. Girshick, R., Donahue, J., Darrell, T. and Malik, J. (2014), "Rich feature hierarchies for accurate object detection and semantic segmentation", Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 580-587. https://doi.org/10.1109/CVPR.2014.81
  8. He, K.M., Zhang, X.Y., Ren, S.Q. and Sun, J. (2014), "Spatial pyramid pooling in deep convolutional networks for visual recognition", Proceedings of the 13th European Conference on Computer Vision (ECCV), Zurich, Switzerland, September. https://doi.org/10.1007/978-3-319-10578-9_23
  9. Kong, W. and Hu, T. (2019), "A deep neural network method for detection and tracking ship for unmanned surface vehicle", Proceedings of 2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS), May. https://doi.org/10.1109/DDCLS.2019.8908899
  10. Lee, W.J., Roh, M.I., Lee, H.W., Ha, J., Cho, Y.M., Lee, S.J. and Son, N.S. (2021), "Detection and tracking for the awareness of surroundings of a ship based on deep learning", J. Comput. Des. Eng., 8(5), 1407-1430. https://doi.org/10.1093/jcde/qwab053
  11. Li, S.L., Guo, Y.P., Xu, Y. and Li, Z.L. (2019), "Real-time geometry identification of moving ships by computer vision techniques in bridge area", Smart. Struct. Syst., Int. J., 23(4), 359-371. https://doi.org/10.12989/sss.2019.23.4.359
  12. Li, G., Lei, Y., Si, L. and Zheng, C. (2021), "Self-supervised visual representation learning for fine-grained ship detection", Proceedings of 2021 IEEE 4th International Conference on Information Systems and Computer Aided Education (ICISCAE), September. https://doi.org/10.1109/ICISCAE52414.2021.9590709
  13. Liu, S., Qi, L., Qin, H.F., Shi, J.P. and Jia, J.Y. (2018), "Path aggregation network for instance segmentation", Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8759-8768. https://doi.org/10.1109/CVPR.2018.00913
  14. Liu, J., Shi, G. and Zhu, K. (2019), "Vessel trajectory prediction model based on AIS sensor data and adaptive chaos differential evolution support vector regression (ACDE-SVR)", Appl. Sci., 9(15), 2983. https://doi.org/10.3390/app9152983
  15. Mahaur, B. and Mishra, K.K. (2023), "Small-object detection based on YOLOv5 in autonomous driving systems", Pattern Recogn. Lett., 168, 115-122. https://doi.org/10.1016/j.patrec.2023.03.009
  16. Omrani, E., Mousazadeh, H., Omid, M., Masouleh, M.T., Jafarbiglu, H., Salmani-Zakaria, Y., Makhsoos, A., Monhaseri, F. and Kiape, A. (2020), "Dynamic and static object detection and tracking in an autonomous surface vehicle", Ships Offshore Struct., 15(7), 711-721. https://doi.org/10.1080/17445302.2019.1668642
  17. Raj, J.A., Idicula, S.M. and Paul, B. (2022), "Lightweight SAR Ship detection and 16 Class Classification using Novel Deep Learning Algorithm with a Hybrid Preprocessing Technique", Int. J. Remote Sens., 43(15-16), 5820-5847. https://doi.org/10.1080/01431161.2021.2008544
  18. Redmon, J. and Farhadi, A. (2017), "YOLO9000: Better, Faster, Stronger", Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July. https://doi.org/10.1109/CVPR.2017.690
  19. Redmon, J. and Farhadi, A. (2018), "YOLOv3: An incremental improvement", arXiv, 1804.02767. https://doi.org/10.48550/arXiv.1804.02767
  20. Redmon, J., Divvala, S., Girshick, R. and Farhadi, A. (2016), "You Only Look Once: Unified, Real-Time Object Detection", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2016.91
  21. Ren, S.Q., He, K.M., Girshick, R. and Sun, J. (2017), "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Trans. Pattern Anal. Mach. Intell., 39(6), 1137-1149. https://doi.org/10.1109/TPAMI.2016.2577031
  22. Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I. and Savarese, S. (2019), "Generalized intersection over union: A metric and a loss for bounding box regression", Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June. https://doi.org/10.1109/CVPR.2019.00075
  23. Savitzky, A. and Golay, M.J.E. (1964), "Smoothing and differentiation of data by simplified least squares procedures", Anal. Chem., 36(8), 1627-1639. https://doi.org/10.1021/ac60214a047
  24. Szpak, Z.L. and Tapamo, J.R. (2011), "Maritime surveillance: Tracking ships inside a dynamic background using a fast level-set", Expert Syst. Appl., 38(6), 6669-6680. https://doi.org/10.1016/j.eswa.2010.11.068
  25. Tang, W., Mondal, T.G., Wu, R.T., Subedi, A. and Jahanshahi, M.R. (2023), "Deep learning-based post-disaster building inspection with channel-wise attention and semi-supervised learning", Smart. Struct. Syst., Int. J., 31(4), 365-381. https://doi.org/10.12989/sss.2023.31.4.365
  26. The 3rd International Competition for Structural Health Monitoring (IC-SHM 2022). https://shmc.tongji.edu.cn/ICSHM2022/main.htm
  27. Thombre, S., Zhao, Z., Ramm-Schmidt, H., Garcia, J.M.V., Malkamaki, T., Nikolskiy, S., Hammarberg, T., Nuortie, H., Bhuiyan, M.Z.H., Sarkka, S. and Lehtola, V.V. (2022), "Sensors and AI techniques for situational awareness in autonomous ships: a review", IEEE Trans. Intell. Transp. Syst., 23(1), 64-83. https://doi.org/10.1109/TITS.2020.3023957
  28. Tong, Z., Chen, Y., Xu, Z. and Yu, R. (2023), "Wise-IoU: Bounding box regression loss with dynamic focusing mechanism", arXiv, 2301.10051. https://doi.org/10.48550/arXiv.2301.10051
  29. Ultralytics (2020), YOLOv5: Open source neural networks in Python, accessed 9 June 2020. https://github.com/ultralytics/yolov5/
  30. Valsamis, A., Tserpes, K., Zissis, D., Anagnostopoulos, D. and Varvarigou, T. (2017), "Employing traditional machine learning algorithms for big data streams analysis: The case of object trajectory prediction", J. Syst. Softw., 127, 249-257. https://doi.org/10.1016/j.jss.2016.06.016
  31. Yang, L., Zhang, R.-Y., Li, L. and Xie, X. (2021), "SimAM: A simple, parameter-free attention module for convolutional neural networks", Proceedings of the 38th International Conference on Machine Learning, 139, 11863-11874.
  32. Yoon, H., Shin, J. and Spencer Jr., B.F. (2018), "Structural displacement measurement using an unmanned aerial system", Comput.-Aided Civil Infrastr. Eng., 33(3), 183-192. https://doi.org/10.1111/mice.12338
  33. Zhang, B. and Zhang, J. (2021), "A traffic surveillance system for obtaining comprehensive information of the passing vehicles based on instance segmentation", IEEE Trans. Intell. Transp. Syst., 22(11), 7040-7055. https://doi.org/10.1109/TITS.2020.3001154
  34. Zhang, T., Jiang, L., Xiang, D., Ban, Y., Pei, L. and Xiong, H. (2019), "Ship detection from PolSAR imagery using the ambiguity removal polarimetric notch filter", ISPRS J. Photogramm. Remote Sens., 157, 41-58. https://doi.org/10.1016/j.isprsjprs.2019.08.009
  35. Zhang, B., Xu, Z., Zhang, J. and Wu, G. (2022a), "A warning framework for avoiding vessel-bridge and vessel-vessel collisions based on generative adversarial and dual-task networks", Comput.-Aided Civil Infrastr. Eng., 37(5), 629-649. https://doi.org/10.1111/mice.12757
  36. Zhang, Y.F., Ren, W.Q., Zhang, Z., Jia, Z., Wang, L. and Tan, T.N. (2022b), "Focal and efficient IOU loss for accurate bounding box regression", Neurocomputing, 506, 146-157. https://doi.org/10.1016/j.neucom.2022.07.042
  37. Zheng, Z.H., Wang, P., Liu, W., Li, J.Z., Ye, R.G. and Ren, D.W. (2020), "Distance-IoU loss: faster and better learning for bounding box regression", Proceedings of the AAAI Conference on Artificial Intelligence, 34(07), 12993-13000. https://doi.org/10.1609/aaai.v34i07.6999