Sequence Labeling-based Multiple Causal Relations Extraction using Pre-trained Language Model for Maritime Accident Prevention

  • Ki-Yeong Moon (Logistics System Research Team of Korea Railroad Research Institute)
  • Do-Hyun Kim (Logistics System Research Team of Korea Railroad Research Institute)
  • Tae-Hoon Yang (Department of Data Science Engineering, INHA University)
  • Sang-Duck Lee (Logistics System Research Team of Korea Railroad Research Institute)
  • Received : 2023.09.04
  • Accepted : 2023.10.04
  • Published : 2023.10.31


Numerous studies have applied natural language processing techniques to analyze the causal relationships behind maritime accidents. However, extraction becomes less effective when a single accident involves multiple causes and effects. To address this challenge, we compiled a dataset from the verdicts of maritime accident cases, analyzed the causal relations they contain, and labeled the text so that each label also records which causes are associated with which effects. To validate the proposed methodology, we fine-tuned KoELECTRA, a pre-trained Korean language model, on this dataset. The validation results show that the approach can successfully extract multiple causal relationships from maritime accident cases.
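The idea of labeling tokens together with cause-effect association information can be illustrated with a small sketch. The abstract does not specify the tag set, so the format assumed here (BIO tags such as `B-C1`, `I-C1`, `B-E1`, where `C`/`E` mark cause or effect and the trailing number is a relation index linking them) is hypothetical, as are the function name `decode_relations` and the toy English sentence. The sketch only shows how predicted tags of this kind could be decoded into grouped causal relations; it is not the authors' implementation.

```python
from collections import defaultdict


def decode_relations(tokens, tags):
    """Group BIO-tagged spans by relation index (assumed tag format: B-C1, I-E2, O)."""
    spans = []      # each entry: (role, relation_id, [words])
    current = None  # span currently being extended, if any
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # "B-C1" -> role "C", relation id "1"
            current = (tag[2], tag[3:], [token])
            spans.append(current)
        elif (tag.startswith("I-") and current is not None
              and tag[2:] == current[0] + current[1]):
            # Continue the open span only if role and relation id both match.
            current[2].append(token)
        else:
            # "O" or an inconsistent I- tag closes the open span.
            current = None

    relations = defaultdict(lambda: {"causes": [], "effects": []})
    for role, rel_id, words in spans:
        key = "causes" if role == "C" else "effects"
        relations[rel_id][key].append(" ".join(words))
    return dict(relations)


# Toy example (hypothetical sentence and tags, not taken from the dataset).
tokens = ["poor", "lookout", "and", "excessive", "speed", "caused", "collision",
          ";", "engine", "failure", "caused", "grounding"]
tags = ["B-C1", "I-C1", "O", "B-C1", "I-C1", "O", "B-E1",
        "O", "B-C2", "I-C2", "O", "B-E2"]

result = decode_relations(tokens, tags)
```

Because the relation index is carried in the tag itself, two causes sharing index 1 are attached to the same effect, while the second accident keeps its own cause and effect under index 2. A plain cause/effect tag set without the index could not make that distinction.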



This research was supported by the Korea Institute of Marine Science & Technology Promotion (KIMST), funded by the Ministry of Oceans and Fisheries (1525013138).

