http://dx.doi.org/10.12652/Ksce.2022.42.1.0127

Development of Machine Learning-based Construction Accident Prediction Model Using Structured and Unstructured Data of Construction Sites  

Cho, Mingeon (Sungkyunkwan University)
Lee, Donghwan (Sungkyunkwan University)
Park, Jooyoung (Sungkyunkwan University)
Park, Seunghee (Sungkyunkwan University)
Publication Information
KSCE Journal of Civil and Environmental Engineering Research / v.42, no.1, 2022, pp. 127-134
Abstract
Policies and research aimed at preventing the growing number of construction accidents have recently been pursued actively in the domestic (Korean) construction industry. Prediction models developed in previous studies relied mainly on structured data alone, so the varied characteristics of construction sites were not sufficiently reflected. This study therefore developed a machine learning-based construction accident prediction model that uses both structured data and text-type unstructured data, so that site characteristics can be considered more fully. For training, 6,826 construction accident cases were collected from the Construction Safety Management Integrated Information (CSI). A decision forest algorithm was trained on the structured data and the BERT language model on the unstructured data. When both types of data were used, the prediction accuracy was 95.41 %, an improvement of about 20 % over using structured data alone. In conclusion, adding unstructured data effectively improved the performance of the predictive model, and more accurate prediction can be expected to help reduce construction accidents.
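The abstract does not say how the outputs of the decision forest and the BERT model are combined. As a minimal sketch of one common option, the Python below applies late fusion: it takes a weighted average of the class probabilities from a structured-data model and a text model, then scores the result by accuracy. It is not the authors' method. The function names, the class labels `fall` and `struck`, the weight `text_weight`, and all probability values are hypothetical toy values chosen only for illustration.

```python
from typing import Dict, List

Probs = Dict[str, float]  # class label -> predicted probability


def fuse(structured: Probs, text: Probs, text_weight: float = 0.5) -> Probs:
    """Late fusion: weighted average of two class-probability distributions."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    labels = set(structured) | set(text)
    return {
        label: (1.0 - text_weight) * structured.get(label, 0.0)
        + text_weight * text.get(label, 0.0)
        for label in labels
    }


def predict(probs: Probs) -> str:
    """Return the label with the highest probability."""
    return max(probs, key=probs.get)


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical outputs for three accident cases (toy numbers, not study data).
structured_out = [
    {"fall": 0.60, "struck": 0.40},
    {"fall": 0.55, "struck": 0.45},
    {"fall": 0.30, "struck": 0.70},
]
text_out = [
    {"fall": 0.70, "struck": 0.30},
    {"fall": 0.20, "struck": 0.80},
    {"fall": 0.10, "struck": 0.90},
]
y_true = ["fall", "struck", "struck"]

structured_pred = [predict(p) for p in structured_out]
fused_pred = [predict(fuse(s, t)) for s, t in zip(structured_out, text_out)]

print("structured only:", accuracy(y_true, structured_pred))
print("fused:", accuracy(y_true, fused_pred))
```

In this toy data the structured-only model misclassifies the second case and the text model corrects it after fusion. Feature-level fusion, where a text embedding is concatenated with the structured features before a single classifier, is an equally plausible reading of the abstract.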
Keywords
Construction accident; Prediction model; Machine learning; BERT; Decision forest