• Title/Abstract/Keyword: Machine-Learning

Search Results: 5,627

Machine Learning Method for Improving WRF-Hydro Streamflow Prediction

  • Cho, Kyeungwoo;Choi, Suyeon;Chi, Haewon;Kim, Yeonjoo
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.63-63 / 2020
  • Recent advances in machine learning have made it possible to predict nonlinear time series, and such techniques are being used across hydrology, e.g., groundwater and streamflow prediction, as replacements for conventional process-based models. Unlike previous studies, this study uses machine learning in combination with a process-based model to improve the streamflow simulated by that model. We built and evaluated a system in which machine learning predicts the difference between observations and simulations and applies it to the process-based model's results so that observations can be reproduced accurately. As the process-based model, the Weather Research and Forecasting model-Hydrological modeling system (WRF-Hydro) was set up for the Soyang River basin. As the machine learning model, a Long Short-Term Memory (LSTM) network, a type of recurrent neural network, was used to enable long-term time series prediction (WRF-Hydro-LSTM). The machine learning model was trained on meteorological data and inflow residuals from 2013 to 2017, and the expected inflow residuals were simulated using 2018 meteorological data. The simulated residuals were applied to the WRF-Hydro results to correct the final simulated inflow. In addition, to benchmark the proposed methodology, a standalone machine learning model was trained on and used to simulate inflow (LSTM-only). Evaluated with the correlation coefficient and the Nash-Sutcliffe efficiency (NSE), both LSTM-based methods (WRF-Hydro-LSTM and LSTM-only) simulated streamflow more accurately than the process-based model alone (WRF-Hydro-only), and evaluation with the PBIAS index confirmed that coupling LSTM with WRF-Hydro produced simulations closer to the observations than using LSTM alone.

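
The residual-correction scheme described above is simple to sketch: a predictor is trained on historical observation-minus-simulation residuals, and its output is added back onto the raw simulation. In this minimal Python sketch the paper's LSTM is replaced by a trivial stand-in (the mean training residual), and all flow values are hypothetical:

```python
# Hybrid residual correction: a process-based model's simulated flows are
# corrected by a learned prediction of the observation-minus-simulation
# residual. The LSTM of the paper is replaced here by a trivial stand-in
# that predicts the mean training residual; the numbers are illustrative.

def train_residual_predictor(observed, simulated):
    """'Train' on historical residuals; returns a predictor function."""
    residuals = [o - s for o, s in zip(observed, simulated)]
    mean_residual = sum(residuals) / len(residuals)
    return lambda n: [mean_residual] * n  # predict n future residuals

def correct_simulation(simulated_future, residual_predictor):
    """Add predicted residuals back onto the raw simulation."""
    predicted = residual_predictor(len(simulated_future))
    return [s + r for s, r in zip(simulated_future, predicted)]

# Training period: the process model consistently under-simulates by ~2 m^3/s.
obs_train = [10.0, 12.0, 11.0, 13.0]
sim_train = [8.0, 10.0, 9.0, 11.0]
predictor = train_residual_predictor(obs_train, sim_train)

# Evaluation period: raw simulation vs. residual-corrected simulation.
sim_future = [9.0, 10.0]
corrected = correct_simulation(sim_future, predictor)
print(corrected)  # [11.0, 12.0]
```

The same two-step structure holds when the stand-in is swapped for a real sequence model trained on meteorological inputs.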

Evaluation of Rainfall Erosivity Factor Estimation Using Machine and Deep Learning Models

  • Lee, Jimin;Lee, Seoro;Lee, Gwanjae;Kim, Jonggun;Lim, Kyoung Jae
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.450-450 / 2021
  • Climate change reports indicate that the intensity and frequency of heavy rainfall will continue to increase over the coming years. If such downpours occur frequently, rainfall erosivity increases and topsoil becomes more vulnerable to erosion. The rainfall erosivity factor, one of the input parameters of the Universal Soil Loss Equation (USLE), expresses the influence of rainfall intensity when predicting soil loss. Previous studies estimated the rainfall erosivity factor with the USLE method but, because they used 60-minute rainfall data, could not accurately determine the maximum 30-minute rainfall intensity. The purpose of this study is to develop machine learning models that predict the rainfall erosivity factor faster and more accurately than previous approaches, requiring only total monthly rainfall, maximum daily rainfall, and maximum hourly rainfall data. To improve the accuracy of the estimated factor, 1-minute rainfall data were used, and data from 2013-2019 were chosen to reflect recent rainfall patterns. First, monthly rainfall erosivity factors were calculated with the USLE procedure to characterize monthly behavior, and the monthly factors computed for 50 sites in Korea were taken as ground truth for training the machine learning models. The models used were Decision Tree, Random Forest, K-Nearest Neighbors, Gradient Boosting, eXtreme Gradient Boost, and a Deep Neural Network. Cross-validation showed that the Deep Neural Network predicted the rainfall erosivity factor most accurately, with a Nash-Sutcliffe efficiency (NSE) and coefficient of determination (R2) of 0.87; to test the validated model, six sites in Korea were selected at random and their rainfall erosivity factors were analyzed. Using the resulting Deep Neural Network, the monthly rainfall erosivity factor can be predicted at a desired site with far less effort and time, and Korean rainfall patterns can be analyzed efficiently. This is expected to be useful not only for indexing future soil erosion risk but also for establishing soil conservation plans and for identifying and prioritizing at-risk areas.

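
The I30 limitation noted above is concrete in code: with only 60-minute totals, a short intense burst is averaged away, while 1-minute data resolve it. Below is a minimal sketch of the sliding-window step only (not the full USLE R-factor procedure), applied to a hypothetical storm:

```python
def max_30min_intensity(rain_1min_mm):
    """Maximum 30-minute rainfall intensity (mm/h) from 1-minute depths (mm).

    A 30-minute sliding window finds the wettest half hour; the depth in
    that window is doubled to convert mm per 30 min into mm per hour.
    """
    window = 30
    if len(rain_1min_mm) < window:
        raise ValueError("need at least 30 one-minute records")
    best = max(sum(rain_1min_mm[i:i + window])
               for i in range(len(rain_1min_mm) - window + 1))
    return best * 2.0

# Hypothetical 60-minute storm with a 30-minute burst of 0.5 mm/min.
# The hourly total is 18 mm, so 60-minute data would suggest 18 mm/h,
# while the true maximum 30-minute intensity is much higher.
storm = [0.1] * 15 + [0.5] * 30 + [0.1] * 15
print(max_30min_intensity(storm))  # 30.0
```

A uniform drizzle and a bursty storm with the same hourly total thus get very different I30 values, which is exactly what the 1-minute data recover.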

Effective Drought Prediction Based on Machine Learning

  • Kim, Kyosik;Yoo, Jae Hwan;Kim, Byunghyun;Han, Kun-Yeun
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.326-326 / 2021
  • Many technical and academic attempts have been made to predict droughts, which occur over long periods and wide areas. In this study, future droughts were projected for complex drought time series using both scenario-based outlook methods and non-scenario-based methods that predict drought in real time. As a scenario-based method, the Palmer Drought Severity Index (PDSI) for 2009 was computed from 3-month GCM (General Circulation Model) forecasts to produce a short-term prediction of drought severity. Non-scenario-based prediction used statistical methods and deterministic numerical methods based on physical models. Among statistical approaches, to overcome the predictive limitations of the ARIMA (Autoregressive Integrated Moving Average) model, support vector regression (SVR) and a wavelet neural network were used to estimate the SPI. The optimal model structure was selected using RMSE (root mean square error), MAE (mean absolute error), and R (correlation coefficient), and droughts were projected with lead times of 1-6 months. Using the SPI, a Markov chain and a log-linear model were applied to verify the accuracy of SPI-based drought prediction, and a neuro-fuzzy model was applied to the Anatolia region of Turkey to predict drought from monthly precipitation and SPI for 1964-2006. As drought frequency and patterns change irregularly and the regional polarization of precipitation intensifies, the demand for more accurate drought prediction is growing. This study aims to develop an improved prediction model by applying the monthly and daily Standardized Precipitation Evapotranspiration Index (SPEI), which quantifies the degree of meteorological drought, to machine learning models for complex, nonlinear drought patterns.

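
The SPI discussed above standardizes precipitation so that drought severity is comparable across sites and periods. The operational index fits a gamma distribution before standardizing, so the plain z-score below is a simplified stand-in, shown with hypothetical monthly totals and commonly used category thresholds:

```python
import statistics

def simple_spi(precip_series):
    """Simplified SPI: z-score of precipitation (the operational SPI fits a
    gamma distribution before standardizing; this is a rough stand-in)."""
    mean = statistics.fmean(precip_series)
    sd = statistics.pstdev(precip_series)
    return [(p - mean) / sd for p in precip_series]

def drought_class(spi):
    """Commonly used SPI drought categories."""
    if spi <= -2.0:
        return "extreme drought"
    if spi <= -1.5:
        return "severe drought"
    if spi <= -1.0:
        return "moderate drought"
    return "near normal or wet"

# Hypothetical monthly precipitation totals (mm); the last month is very dry.
monthly_precip = [80, 95, 70, 100, 90, 20]
spi_values = simple_spi(monthly_precip)
print(drought_class(spi_values[-1]))  # extreme drought
```

The same thresholding idea applies to the SPEI, which replaces precipitation with the precipitation-minus-evapotranspiration balance.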

Spatio-temporal potential future drought prediction using machine learning for time series data forecast in Abomey-calavi (South of Benin)

  • Agossou, Amos;Kim, Do Yeon;Yang, Jeong-Seok
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.268-268 / 2021
  • Groundwater is the main source of water for domestic, industrial, and agricultural activities in Abomey-calavi (southern Benin). Groundwater abstraction across the region is not fully controlled by any network because of the many private boreholes and traditional wells used by the population. After some decades, this important resource is becoming more and more vulnerable and needs more attention. For better groundwater management in the Abomey-calavi region, the present study attempts to predict probable future groundwater drought using a Recurrent Neural Network (RNN) for groundwater level prediction. The RNN model was implemented in Python in a Jupyter notebook. Six years of monthly groundwater level data were used for model calibration, two years for testing, and the model was finally used to predict two years of future groundwater levels (2020 and 2021). The GRI was calculated for 9 wells across the area from 2012 to 2021. The GRI values for the dry season (end of March) showed groundwater drought for the first time in the study period in 2014, as severe and moderate; from 2015 to 2021 they show only moderate drought. The rainy seasons of 2020 and 2021 are relatively wet and near normal. The GRI showed no drought in the rainy season during the study period, but an important decline in groundwater level between 2012 and 2021. Pearson's correlation coefficient, calculated between the GRI and rainfall from 2005 to 2020 (using only the three wells with long time series records), showed that the groundwater drought observed mostly in the dry season is not mainly caused by rainfall scarcity (correlation values between -0.113 and -0.083) but could instead be the consequence of overexploitation of the resource, which caused the important spatial and temporal decline observed from 2012 to 2021.

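
The attribution argument above rests on Pearson's correlation coefficient between the GRI and rainfall; a self-contained implementation, applied to hypothetical series, looks like this:

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical illustration: a steadily declining groundwater index paired
# with rainfall that shows no matching trend yields an r near zero, the
# kind of weak relationship the study reports (-0.113 to -0.083).
gri = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
rain = [100, 120, 90, 110, 95, 105]
print(round(pearson_r(gri, rain), 3))
```

A near-zero r is what licenses the conclusion that the decline is driven by something other than rainfall, such as overexploitation.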

A Research on the Method of Automatic Metadata Generation of Video Media for Improvement of Video Recommendation Service

  • You, Yeon-Hwi;Park, Hyo-Gyeong;Yong, Sung-Jung;Moon, Il-Young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.281-283 / 2021
  • The representative companies for recommendation services in the domestic OTT (over-the-top media service) market are YouTube and Netflix. Among various methods, YouTube began personalized recommendation in earnest in 2016 by introducing a machine learning algorithm that records and uses users' viewing time. Netflix categorizes users by collecting information such as the videos a user selects, the time of day they watch, and the device they watch on, and groups people with similar viewing patterns together. It records and uses both the information collected from the user and the tag information attached to each video. In this paper, we propose a method to improve video media recommendation by automatically generating the video metadata that previously had to be written by hand.


An optimized ANFIS model for predicting pile pullout resistance

  • Yuwei Zhao;Mesut Gor;Daria K. Voronkova;Hamed Gholizadeh Touchaei;Hossein Moayedi;Binh Nguyen Le
    • Steel and Composite Structures / v.48 no.2 / pp.179-190 / 2023
  • Many recent attempts have sought accurate prediction of pile pullout resistance (Pul) using classical machine learning models. This study offers an improved methodology for this objective. An adaptive neuro-fuzzy inference system (ANFIS), a popular predictor, is trained by a capable metaheuristic strategy, namely the equilibrium optimizer (EO), to predict the Pul. The data used were collected from laboratory investigations in the previous literature. First, two optimal configurations of EO-ANFIS are selected after sensitivity analysis. They are then evaluated and compared with classical ANFIS and two neural-based models using well-accepted accuracy indicators. The results of all five models were in good agreement with laboratory Puls (all correlations > 0.99). However, both EO-ANFISs not only outperformed the neural benchmarks but also achieved higher accuracy than the classical version. Therefore, utilizing the EO is recommended for optimizing this predictive tool. Furthermore, a comparison between the selected EO-ANFISs, one of which employs a larger population, revealed that the model with a population size of 75 is more efficient than the one with 300. The root mean square error and optimization time for EO-ANFIS (75) were 19.6272 and 1715.8 seconds, respectively, versus 23.4038 and 9298.7 seconds for EO-ANFIS (300).
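
The model comparison above comes down to accuracy indicators such as RMSE; a minimal helper, applied to hypothetical pullout-resistance values rather than the study's data, shows the comparison pattern:

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))

# Hypothetical pullout-resistance values: two candidate models against
# laboratory measurements (none of these numbers come from the paper).
lab = [100.0, 150.0, 200.0, 250.0]
model_a = [102.0, 148.0, 205.0, 248.0]   # tighter fit
model_b = [110.0, 140.0, 215.0, 240.0]   # looser fit

print(rmse(lab, model_a) < rmse(lab, model_b))  # True
```

Lower RMSE, weighed against optimization cost (as with the 75- vs. 300-population EO runs), is the selection criterion.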

Infrastructure Anomaly Analysis for Data-center Failure Prevention: Based on RRCF and Prophet Ensemble Analysis

  • Hyun-Jong Kim;Sung-Keun Kim;Byoung-Whan Chun;Kyong-Bog Jin;Seung-Jeong Yang
    • The Journal of Bigdata / v.7 no.1 / pp.113-124 / 2022
  • Various methods using machine learning and big data have been applied to prevent failures in data centers. However, approaches that reference performance indicators of individual pieces of equipment, or that ignore the infrastructure operating environment, have many practical limitations. In this study, the performance indicators of individual infrastructure equipment are monitored in an integrated way, and the indicators of the various equipment are segmented and graded into a single numerical value. Data pre-processing was based on experience in infrastructure operation, and an ensemble of RRCF (Robust Random Cut Forest) analysis and a Prophet analysis model produced reliable results in detecting anomalies. A failure analysis system was implemented to make these results easy for data center operators to use; it can support a preemptive response to data center failures and allow appropriate time for tuning.
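
The ensemble idea above, combining two detectors whose raw scores live on different scales, can be sketched by normalizing each score series and averaging them. The RRCF and Prophet models themselves are replaced here by hypothetical per-timestep score lists:

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1] so different detectors are comparable."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def ensemble_anomaly_scores(scores_a, scores_b, weight_a=0.5):
    """Weighted average of two detectors' normalized anomaly scores."""
    a = min_max_normalize(scores_a)
    b = min_max_normalize(scores_b)
    return [weight_a * x + (1 - weight_a) * y for x, y in zip(a, b)]

# Hypothetical per-timestep scores: detector A stands in for an RRCF-style
# displacement score, detector B for absolute forecast residuals from a
# Prophet-style model. Timestep 3 is anomalous in both.
rrcf_like = [0.1, 0.2, 0.15, 0.9, 0.2]
residual_like = [1.0, 2.0, 1.5, 8.0, 1.8]
combined = ensemble_anomaly_scores(rrcf_like, residual_like)
print(combined.index(max(combined)))  # 3
```

Requiring agreement between a density-style detector and a forecast-residual detector is one way to cut false alarms from either model alone.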

Design and development of non-contact locks including face recognition function based on machine learning

  • Yeo Hoon Yoon;Ki Chang Kim;Whi Jin Jo;Hongjun Kim
    • Convergence Security Journal / v.22 no.1 / pp.29-38 / 2022
  • The importance of epidemic prevention is increasing due to the serious spread of infectious diseases, and prevention efforts need to focus on non-contact technology. Therefore, in this paper, a face recognition door lock that controls access without contact is designed and developed. First, very simple features are combined to detect objects, and face detection is performed using a Haar-based cascade algorithm. Then the image texture is binarized to extract features using LBPH (Local Binary Pattern Histograms). A non-contact door lock system composed of a Raspberry Pi 3B+ board, an ultrasonic sensor, a camera module, a motor, and other components is proposed. To verify actual performance and ascertain the impact of light sources, various experiments were conducted. The maximum recognition rate was about 85.7%.
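
LBPH, used above for feature extraction, histograms local binary patterns over image regions. The core LBP step, thresholding a 3x3 neighborhood against its center pixel to form an 8-bit code, can be sketched as follows (the patch values are hypothetical):

```python
def lbp_code(patch):
    """8-bit local binary pattern of a 3x3 patch (list of 3 rows of 3 ints).

    Each of the 8 neighbors, visited clockwise from the top-left,
    contributes a 1-bit if it is >= the center pixel. LBPH then builds
    histograms of these codes over image regions and compares them.
    """
    center = patch[1][1]
    # Clockwise neighbor coordinates starting at the top-left corner.
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(coords):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code

patch = [[90, 200, 90],
         [90, 100, 200],
         [200, 90, 90]]
print(lbp_code(patch))  # 74
```

Because the code depends only on whether neighbors are brighter than the center, it is fairly robust to uniform brightness changes, which is one reason LBPH is tested under varied light sources.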

A Comparative Study of Predictive Factors for Hypertension using Logistic Regression Analysis and Decision Tree Analysis

  • SoHyun Kim;SungHyoun Cho
    • Physical Therapy Rehabilitation Science / v.12 no.2 / pp.80-91 / 2023
  • Objective: The purpose of this study is to identify factors that affect the incidence of hypertension using logistic regression and decision tree analysis, and to build and compare predictive models. Design: Secondary data analysis study. Methods: We analyzed 9,859 subjects from the Korean Health Panel annual 2019 data provided by the Korea Institute for Health and Social Affairs and the National Health Insurance Service. Frequency analysis, the chi-square test, binary logistic regression, and decision tree analysis were performed on the data. Results: In the logistic regression analysis, those who were 60 years of age or older (odds ratio, OR=68.801, p<0.001), divorced/widowed/separated (OR=1.377, p<0.001), educated to middle school or below (OR=1, reference), not walking at all (OR=1, reference), obese (OR=5.109, p<0.001), or reporting poor subjective health status (OR=2.163, p<0.001) were more likely to develop hypertension. In the decision tree, those over 60 years of age, overweight or obese, and educated to middle school or below had the highest probability of developing hypertension, at 83.3%. Logistic regression analysis showed a specificity of 85.3% and sensitivity of 47.9%, while decision tree analysis showed a specificity of 81.9% and sensitivity of 52.9%. In classification accuracy, logistic regression and decision tree analysis achieved 73.6% and 72.6%, respectively. Conclusions: Both logistic regression and decision tree analysis were adequate to explain the predictive model. Both analysis methods can serve as useful tools for constructing a predictive model for hypertension.
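
The sensitivity, specificity, and classification accuracy reported above all derive from a confusion matrix; a small helper makes the definitions explicit, using hypothetical counts rather than the study's data:

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts for a hypertension classifier (not the study's data):
# 48 true positives, 15 false positives, 85 true negatives, 52 false negatives.
sens, spec, acc = classification_metrics(tp=48, fp=15, tn=85, fn=52)
print(round(sens, 3), round(spec, 3), round(acc, 3))  # 0.48 0.85 0.665
```

As in the paper's comparison, a model can trade sensitivity against specificity while overall accuracy barely moves, which is why all three numbers are reported.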

Development of a Social Data Collection and Loading Engine-based Reliability Analysis System Against Infectious Disease Pandemics

  • Doo Young Jung;Sang-Jun Lee;MIN KYUNG IL;Seogsong Jeong;HyunWook Han
    • The Journal of Bigdata / v.7 no.2 / pp.103-111 / 2022
  • There are many institutions, organizations, and sites involved in responding to infectious diseases, but as a pandemic such as COVID-19 continues for years, the initial and current situations differ in many ways, and policies and response systems evolve accordingly. As a result, regional gaps arise, and various problems accumulate around trust in, distrust of, and compliance with policies. Therefore, this study collects and analyzes social data that spreads information, focusing on Twitter, one of the major social media platforms and a carrier of inaccurate information from unknown sources, so that facts can be verified in advance. Based on this unstructured social data, an algorithm that can automatically detect infectious disease threats is developed, creating an objective basis for responding to infectious disease crises and solidifying international competitiveness in related fields.