Falling Accidents Analysis in Construction Sites by Using Topic Modeling

토픽 모델링을 이용한 건설현장 추락재해 분석

  • Ryu, Hanguk (Department of Architecture, Sahm Yook University)
  • Received : 2019.05.13
  • Accepted : 2019.07.20
  • Published : 2019.07.28

Abstract

We use topic modeling, a machine learning technique, to classify fall accidents at construction sites into topics and analyze the causes of the accidents within each topic. To apply topic modeling based on latent Dirichlet allocation (LDA), the text data were preprocessed, and the model was evaluated with the perplexity score to improve its reliability. Across the topics, most fall accidents involved daily workers at small construction sites. Most accidents were attributed to safety equipment that did not function properly because it was missing, poorly arranged, improperly worn, or of low performance. To prevent and reduce fall accidents, it is important to educate daily workers at small construction sites, keep the workplace in order, and check that personal safety equipment and devices are worn correctly.

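This study uses topic modeling, a machine learning technique, to classify fall accidents occurring at construction sites into topics and to analyze the accident factors associated with each topic. To apply topic modeling based on latent Dirichlet allocation, the text data were preprocessed, and the model was evaluated with the perplexity score to improve its reliability. Most of the fall accidents that emerged in common across the topics involved daily workers at small workplaces. Most fall accidents were judged to result from safety equipment not being worn, poor housekeeping at the site, and safety equipment failing to work properly because of its performance or the way it was worn. To prevent and reduce fall accidents, the study finds it important to provide safety education suited to small workplaces, keep the workplace tidy, and check that personal safety equipment is worn properly and performs adequately.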
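
The workflow summarized in the abstract (preprocess the text, fit LDA, compare perplexity across candidate topic counts) can be illustrated with a minimal sketch. This is not the author's implementation. It assumes a collapsed Gibbs sampler written in plain Python, and the reports, stop-word list, hyperparameters (`alpha`, `beta`, `iters`), and candidate topic counts are all hypothetical values chosen for illustration.

```python
import math
import random

# Hypothetical stop words and accident narratives, for illustration only.
STOP_WORDS = {"a", "an", "and", "at", "from", "not", "on", "the",
              "through", "was", "while", "without"}
REPORTS = [
    "Daily worker fell from scaffold while safety belt was not worn.",
    "Worker fell from ladder at small site without helmet.",
    "Worker slipped on cluttered floor and fell through opening.",
    "Guardrail missing at opening; daily worker fell.",
    "Safety net torn, worker fell from scaffold.",
    "Ladder slipped on cluttered floor at small site.",
]


def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words."""
    tokens = (t.strip(".,;:").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (theta, phi, word index)."""
    rng = random.Random(seed)
    index = {w: i for i, w in enumerate(sorted({w for d in docs for w in d}))}
    vocab_size = len(index)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = index[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed point estimates of the document-topic and topic-word distributions.
    theta = [[(n + alpha) / (len(doc) + num_topics * alpha) for n in row]
             for row, doc in zip(doc_topic, docs)]
    phi = [[(n + beta) / (topic_total[k] + vocab_size * beta)
            for n in topic_word[k]] for k in range(num_topics)]
    return theta, phi, index


def perplexity(docs, theta, phi, index):
    """exp of the negative mean per-token log-likelihood (lower is better)."""
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for w in doc:
            p = sum(theta[d][k] * phi[k][index[w]] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


DOCS = [preprocess(r) for r in REPORTS]
scores = {}
for k in (2, 3, 4):  # candidate topic counts (assumed)
    theta, phi, index = fit_lda(DOCS, k)
    scores[k] = perplexity(DOCS, theta, phi, index)
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

For brevity the sketch scores the same documents it was fitted on. Perplexity measured that way tends to keep falling as topics are added, so a real analysis would normally compute it on held-out documents before settling on a topic count.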

Keywords

Fig. 1. Research process using topic modeling

Fig. 2. LDA (latent Dirichlet allocation) model

Fig. 3. Frequency of factors

Fig. 4. Choosing optimal number of topics

Table 1. Summary of frequent factors

Table 2. Topics and number of documents per topic from topic modeling

Table 3. Factors of three topics by topic modeling
