Developing and Evaluating Damage Information Classifier of High Impact Weather by Using News Big Data

재해기상 언론기사 빅데이터를 활용한 피해정보 자동 분류기 개발

  • Su-Ji Cho (School of Business Administration, Dankook University)
  • Ki-Kwang Lee (School of Business Administration, Dankook University)
  • Received : 2023.09.06
  • Accepted : 2023.09.15
  • Published : 2023.09.30

Abstract

The importance of impact-based forecasting has grown as the socio-economic impacts of severe weather have become more apparent. Because news articles contain unstructured information closely tied to people's daily lives, this study developed and evaluated a binary classification algorithm for snowfall damage information using text mining of media articles. We collected news articles published from 2009 to 2021 that contain 'heavy snow' in the body text and labelled whether each article corresponds to a specific damage field, such as car accidents. We propose a probability-based classifier built on the ratio of two conditional probabilities, defined in this study as the I/O Ratio. In constructing the classifier, we also adopted an n-gram approach to capture the contextual meaning of each keyword. The classifier achieved an accuracy of 75%, which supports the feasibility of applying news big data to impact-based forecasting. We expect the classifier's performance to improve in further research as more varied training data accumulate. The results can be readily extended by applying the same methodology to other disasters. Furthermore, the results can help reduce the social and economic damage of high-impact weather by supporting the establishment of an integrated meteorological decision support system.
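
The abstract does not give the exact formula for the I/O Ratio, so the following Python sketch is only one plausible reading, not the authors' implementation. It assumes that the two conditional probabilities are the share of damage ("in") articles and the share of non-damage ("out") articles containing a given n-gram, that add-one smoothing is applied, and that an article is classified by summing the log ratios of its n-grams. The function names, the toy articles, and the zero threshold are all hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all 1..n_max-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def train_io_ratio(docs, labels, n_max=2, alpha=1.0):
    """Estimate log(P(g | in) / P(g | out)) for every n-gram g.

    Assumption: P(g | class) is the smoothed fraction of articles in
    that class that contain g at least once (document frequency).
    """
    n_in = sum(labels)
    n_out = len(labels) - n_in
    df_in, df_out = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = df_in if y else df_out
        target.update(set(ngrams(tokens, n_max)))
    vocab = set(df_in) | set(df_out)
    return {
        g: math.log((df_in[g] + alpha) / (n_in + 2 * alpha))
           - math.log((df_out[g] + alpha) / (n_out + 2 * alpha))
        for g in vocab
    }


def classify(tokens, log_ratio, n_max=2, threshold=0.0):
    """Label an article as damage-related if its summed log ratio exceeds the threshold."""
    score = sum(log_ratio.get(g, 0.0) for g in set(ngrams(tokens, n_max)))
    return score > threshold


# Hypothetical, already-tokenized toy articles (1 = car-accident damage, 0 = not).
docs = [
    ["heavy", "snow", "car", "accident", "highway"],
    ["heavy", "snow", "car", "accident", "injured"],
    ["heavy", "snow", "ski", "resort", "opens"],
    ["heavy", "snow", "forecast", "weekend"],
]
labels = [1, 1, 0, 0]

model = train_io_ratio(docs, labels)
print(classify(["snow", "car", "accident", "reported"], model))  # True
print(classify(["heavy", "snow", "ski", "resort"], model))       # False
```

In this sketch, n-grams that appear equally often in both classes, such as "heavy snow", contribute a log ratio of zero, so only terms that discriminate between the classes move the score. Bigrams such as "car accident" carry some of the context that single keywords lose.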
