Big data mining for natural disaster analysis

  • Kim, Young-Min (Disaster Information Service Lab., Korea Institute of Science and Technology Information)
  • Hwang, Mi-Nyeong (Disaster Information Service Lab., Korea Institute of Science and Technology Information)
  • Kim, Taehong (Disaster Information Service Lab., Korea Institute of Science and Technology Information)
  • Jeong, Chang-Hoo (Disaster Information Service Lab., Korea Institute of Science and Technology Information)
  • Jeong, Do-Heon (Disaster Information Service Lab., Korea Institute of Science and Technology Information)
  • Received: 2015.08.05
  • Reviewed: 2015.09.16
  • Published: 2015.09.30

Abstract

Big data analysis for natural disasters has so far begun mainly with text data such as social media. Such data chiefly supports the last two of the four stages of disaster management: prevention, preparation, response, and recovery. By contrast, big data analysis of meteorological data itself can contribute to the pre-disaster stages of prevention and preparation, so the related research calls for a systematic review. This review paper introduces analysis techniques that handle non-text big data in the natural disaster domain. To this end, we survey the data mining and machine learning techniques used in meteorology-related fields and show how traditional techniques have been adapted to the characteristics of each kind of meteorological data. Section 2 explains the basic concepts of big data, data mining, and machine learning, and Section 3 reviews in detail how data mining and machine learning techniques have been applied in practice. Section 4 introduces cases in which these techniques are applied directly to natural disaster response, and we conclude with several future research directions.
