• Title/Summary/Keyword: Model Ensemble

Search Result 638

Decentralized Structural Diagnosis and Monitoring System for Ensemble Learning on Dynamic Characteristics (동특성 앙상블 학습 기반 구조물 진단 모니터링 분산처리 시스템)

  • Shin, Yoon-Soo;Min, Kyung-Won
    • Journal of the Computational Structural Engineering Institute of Korea
    • /
    • v.34 no.4
    • /
    • pp.183-189
    • /
    • 2021
  • In recent years, active research has been devoted to developing monitoring systems that use ambient vibration data to quantify the deterioration that occurs in a structure over a long period. This study developed a low-cost edge computing system that detects abnormalities in structures by applying ensemble learning to dynamic characteristics acquired from the structure over the long term. The system hardware consists of a Raspberry Pi, an accelerometer, an inclinometer, a GPS RTK module, and a LoRa communication module. Structural abnormality detection by ensemble learning on dynamic characteristics was verified with a vibration experiment on a laboratory-scale structural model. A real-time distributed processing algorithm with dynamic feature extraction, based on the experiment, was installed on the Raspberry Pi. The validity of the developed system was verified on-site through the stable operation of systems installed at the Community Service Center, Pohang-si, Korea.

Comparative assessment and uncertainty analysis of ensemble-based hydrologic data assimilation using airGRdatassim (airGRdatassim을 이용한 앙상블 기반 수문자료동화 기법의 비교 및 불확실성 평가)

  • Lee, Garim;Lee, Songhee;Kim, Bomi;Woo, Dong Kook;Noh, Seong Jin
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.10
    • /
    • pp.761-774
    • /
    • 2022
  • Accurate hydrologic prediction is essential for analyzing the effects of drought, flood, and climate change on flow rates, water quality, and ecosystems. Disentangling the uncertainty of hydrological models is an important issue in hydrology and water resources research. Hydrologic data assimilation (DA), a technique that updates the states or parameters of a hydrological model to produce the most likely estimates of the model's initial conditions, is one way to reduce uncertainty in hydrological simulations and improve predictive accuracy. In this study, two ensemble-based sequential DA techniques, the ensemble Kalman filter and the particle filter, are compared for daily discharge simulation at the Yongdam catchment using airGRdatassim. The Kling-Gupta efficiency (KGE) improved from 0.799 in the open-loop simulation to 0.826 with the ensemble Kalman filter and to 0.933 with the particle filter. In addition, we analyzed the effects of hyper-parameters related to the data assimilation methods, such as the precipitation and potential evaporation forcing error parameters and the selection of perturbed and updated states. Under the forcing error conditions, the particle filter was superior to the ensemble Kalman filter in terms of the KGE index. The optimal forcing noise was smaller for the particle filter than for the ensemble Kalman filter. Moreover, data assimilation performance improved as more state variables were included in the updating step, implying that the selection of updating states can be treated as a hyper-parameter. The simulation experiments in this study imply that DA hyper-parameters need to be carefully optimized to exploit the potential of DA methods.
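
The KGE scores quoted in this abstract can be read against the commonly used 2009 definition of the Kling-Gupta efficiency, which combines correlation, a variability ratio, and a bias ratio. The sketch below assumes that definition; the abstract does not say which KGE variant was used, and the function name and sample series are hypothetical.

```python
import math

def kge(sim, obs):
    """Kling-Gupta efficiency, 2009 form (assumed); 1.0 is a perfect match."""
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    # Population standard deviations of the simulated and observed series
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    # Pearson correlation between simulation and observation
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)
    alpha = sd_s / sd_o      # variability ratio
    beta = mean_s / mean_o   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Hypothetical daily discharge values, for illustration only
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.2, 1.9, 3.3, 3.8, 5.1]
score = kge(simulated, observed)
```

A simulation that reproduces the observations exactly scores 1, and any loss of correlation, variability, or mean lowers the score.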

Effect of input variable characteristics on the performance of an ensemble machine learning model for algal bloom prediction (앙상블 머신러닝 모형을 이용한 하천 녹조발생 예측모형의 입력변수 특성에 따른 성능 영향)

  • Kang, Byeong-Koo;Park, Jungsu
    • Journal of Korean Society of Water and Wastewater
    • /
    • v.35 no.6
    • /
    • pp.417-424
    • /
    • 2021
  • Algal bloom is an ongoing issue in the management of freshwater systems for drinking water supply, and the chlorophyll-a concentration is commonly used to represent the status of algal bloom. Thus, the prediction of chlorophyll-a concentration is essential for the proper management of water quality. However, the chlorophyll-a concentration is affected by various water quality and environmental factors, so predicting it is not an easy task. In recent years, advanced machine learning algorithms have increasingly been used to develop surrogate models that predict the chlorophyll-a concentration in freshwater systems such as rivers or reservoirs. This study used a light gradient boosting machine (LightGBM), a gradient boosting decision tree algorithm, to develop an ensemble machine learning model to predict chlorophyll-a concentration. The field water quality data observed at Daecheong Lake, obtained from the real-time water information system in Korea, were used for the development of the model. The data include temperature, pH, electric conductivity, dissolved oxygen, total organic carbon, total nitrogen, total phosphorus, and chlorophyll-a. First, a LightGBM model was developed to predict the chlorophyll-a concentration by using the other seven items as independent input variables. Second, the time-lagged values of all the input variables were added as input variables to understand the effect of time lag of input variables on model performance. The time lag (i) ranges from 1 to 50 days. The model performance was evaluated using three indices: root mean squared error-observation standard deviation ratio (RSR), Nash-Sutcliffe coefficient of efficiency (NSE), and mean absolute error (MAE). The model showed the best performance when a dataset with a one-day time lag (i=1) was added, where RSR, NSE, and MAE were 0.359, 0.871, and 1.510, respectively. An improvement in model performance was observed when a dataset with a time lag of up to about 15 days (i=15) was added.
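
The three indices named in this abstract can be sketched with their commonly used definitions. This is an assumption, since the abstract does not give the formulas, and the function name and sample data are hypothetical. Under these definitions RSR squared equals 1 minus NSE, which agrees with the reported pair (0.359 squared is about 0.129, and 1 - 0.871 = 0.129).

```python
import math

def evaluate(obs, sim):
    """Return RSR, NSE, and MAE for paired observed and simulated values."""
    n = len(obs)
    mean_o = sum(obs) / n
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))  # squared model error
    sst = sum((o - mean_o) ** 2 for o in obs)          # spread of the observations
    return {
        "RSR": math.sqrt(sse) / math.sqrt(sst),  # RMSE / std. dev. of observations
        "NSE": 1.0 - sse / sst,                  # Nash-Sutcliffe efficiency
        "MAE": sum(abs(o - s) for o, s in zip(obs, sim)) / n,
    }

# Hypothetical chlorophyll-a values, for illustration only
metrics = evaluate([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

Lower RSR and MAE and higher NSE indicate a better fit; an NSE of 1 and an RSR of 0 correspond to a perfect prediction.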

Generation of radar rainfall ensemble using probabilistic approach (확률론적 방법론을 이용한 레이더 강우 앙상블 생성)

  • Kang, Narae;Joo, Hongjun;Lee, Myungjin;Kim, Hung Soo
    • Journal of Korea Water Resources Association
    • /
    • v.50 no.3
    • /
    • pp.155-167
    • /
    • 2017
  • Accurate QPE (Quantitative Precipitation Estimation) and the quality of rainfall data are very important factors in hydrological analysis. In particular, data quality has a great influence on flood runoff results. Reliable flood analysis therefore requires an understanding of the characteristics of the uncertainties in radar QPE. The purpose of this study is to present a probabilistic approach that accounts for the uncertainties in radar QPE by defining a range of possible values or a probability distribution rather than a single value, and to evaluate its applicability by applying it to radar rainfall. This study generated a radar rainfall ensemble for the storms caused by typhoon 'Sanba' over the Namgang dam basin, Korea. The rainfall ensemble simulated the pattern of the rain-gauge rainfall well and corrected the overall bias of the radar rainfall. The suggested ensemble technique represented the uncertainties of radar QPE well. As a result, the rainfall ensemble model based on a probabilistic approach can provide various rainfall scenarios, which is useful information for decision making such as flood forecasting and warning.

Evaluation of Multi-classification Model Performance for Algal Bloom Prediction Using CatBoost (머신러닝 CatBoost 다중 분류 알고리즘을 이용한 조류 발생 예측 모형 성능 평가 연구)

  • Juneoh Kim;Jungsu Park
    • Journal of Korean Society on Water Environment
    • /
    • v.39 no.1
    • /
    • pp.1-8
    • /
    • 2023
  • Monitoring and prediction of water quality are essential for effective river pollution prevention and water quality management. In this study, a multi-classification model was developed to predict the chlorophyll-a (Chl-a) level in rivers. The model was developed using CatBoost, a novel ensemble machine learning algorithm, and hourly field monitoring data collected from January 1 to December 31, 2015. For model development, Chl-a was classified into class 1 (Chl-a≤10 ㎍/L), class 2 (10<Chl-a≤50 ㎍/L), and class 3 (Chl-a>50 ㎍/L), for which the numbers of data used for model training were 27,192, 11,031, and 511, respectively. The macro averages of precision, recall, and F1-score for the three classes were 0.58, 0.58, and 0.58, respectively, while the weighted averages were 0.89, 0.90, and 0.89, respectively. The model showed relatively poor performance for class 3, where the number of observations was much smaller than in the other two classes. The imbalance of data distribution among the three classes was resolved by using the synthetic minority over-sampling technique (SMOTE) algorithm, after which the number of data used for model training was evenly distributed at 26,868 for each class. After SMOTE application, the macro averages of precision, recall, and F1-score for the three classes were 0.58, 0.70, and 0.59, respectively, while the weighted averages were 0.88, 0.84, and 0.86.
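
The gap between the macro and weighted averages reported in this abstract follows from how each average treats class size. The sketch below is a minimal illustration: the per-class F1 scores are hypothetical, and the class counts from the abstract are reused only as illustrative weights (reporting tools normally weight by the counts in the evaluation set, which the abstract does not give).

```python
def macro_average(scores):
    """Unweighted mean: every class counts equally, however rare it is."""
    return sum(scores) / len(scores)

def weighted_average(scores, support):
    """Mean weighted by the number of samples in each class."""
    total = sum(support)
    return sum(s * n for s, n in zip(scores, support)) / total

# Hypothetical per-class F1 scores for classes 1, 2, and 3
f1_per_class = [0.95, 0.80, 0.10]
# Class counts quoted in the abstract, used here only as example weights
support = [27192, 11031, 511]

macro_f1 = macro_average(f1_per_class)
weighted_f1 = weighted_average(f1_per_class, support)
```

Because the rare class carries almost no weight, a poor score on it barely moves the weighted average but pulls the macro average down by a third of its shortfall.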

Study on Control Model Based on Signal Processing In End-Milling Process (엔드밀 공정에서의 신호처리에 따른 제어모델에 관한 연구)

  • 양우석;이건복
    • Proceedings of the Korean Society of Machine Tool Engineers Conference
    • /
    • 2001.04a
    • /
    • pp.192-196
    • /
    • 2001
  • This work describes the modeling of the cutting process for feedback control based on signal processing in end-milling. The cutting force is used to design control models under several signal processing schemes: moving average, ensemble average, peak value, root mean square, and analog low-pass filtering. Each model is expected to offer its own particular advantage in subsequent cutting force control.
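
Four of the signal processing schemes listed in this abstract can be sketched on a sampled force signal as follows. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, and the ensemble average is assumed to mean averaging samples at the same phase across consecutive periods (for example, spindle revolutions).

```python
import math

def moving_average(x, window):
    """Mean of each run of `window` consecutive samples."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]

def ensemble_average(x, period):
    """Average samples at the same phase across complete periods."""
    cycles = len(x) // period
    return [sum(x[c * period + k] for c in range(cycles)) / cycles
            for k in range(period)]

def peak_value(x):
    """Largest absolute sample."""
    return max(abs(v) for v in x)

def rms(x):
    """Root mean square of the samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

# Hypothetical cutting force samples covering two periods of three samples
force = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
smoothed = moving_average(force, 2)
per_phase = ensemble_average(force, 3)
```

Each scheme reduces the raw force signal to a different feature, which is why a control model built on one can behave differently from a model built on another.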


Response of Terrestrial Carbon Cycle: Climate Variability in CarbonTracker and CMIP5 Earth System Models (기후 인자와 관련된 육상 탄소 순환 변동: 탄소추적시스템과 CMIP5 모델 결과 비교)

  • Sun, Minah;Kim, Youngmi;Lee, Johan;Boo, Kyoung-On;Byun, Young-Hwa;Cho, Chun-Ho
    • Atmosphere
    • /
    • v.27 no.3
    • /
    • pp.301-316
    • /
    • 2017
  • This study analyzes the spatio-temporal variability of terrestrial carbon flux and the response of the land carbon sink to climate factors in order to improve understanding of the variability of land-atmosphere carbon exchanges. The coupled carbon-climate models of CMIP5 (the fifth phase of the Coupled Model Intercomparison Project) and CT (CarbonTracker) are used. The CMIP5 multi-model ensemble mean overestimated the NEP (Net Ecosystem Production) compared to the CT and GCP (Global Carbon Project) estimates over the period 2001~2012. The variation of NEP in the CMIP5 ensemble mean is similar to that of CT, but a couple of models that have a fire module without a nitrogen cycle module simulate a strong carbon sink in Africa, Southeast Asia, South America, and some areas of the United States. In the comparison with climate factors, the NEP is highly affected by temperature and solar radiation in both CT and CMIP5. The partial correlation between temperature and NEP indicates that temperature affects NEP positively at mid-latitudes and higher in the Northern Hemisphere, but the opposite correlation appears at other latitudes in CT and most CMIP5 models. All but a few CMIP5 models show a positive correlation with precipitation at $30^{\circ}N{\sim}90^{\circ}N$, but a higher percentage of negative correlation appears at $60^{\circ}S{\sim}30^{\circ}N$ compared to CT. For each season, the correlation between temperature (solar radiation) and NEP in the CMIP5 ensemble mean is similar to that of CT, but overestimated.

Variability of Wind Energy in Korea Using Regional Climate Model Ensemble Projection (지역 기후 앙상블 예측을 활용한 한반도 풍력 에너지의 시·공간적 변동성 연구)

  • Kim, Yumi;Kim, Yeon-Hee;Kim, Nayun;Lim, Yoon-Jin;Kim, Baek-Jo
    • Atmosphere
    • /
    • v.26 no.3
    • /
    • pp.373-386
    • /
    • 2016
  • The future variability of Wind Energy Density (WED) over the Korean Peninsula under RCP climate change scenarios is projected using ensemble analysis. Changes between the historical period (1981~2005) and the future projection (2021~2050) are examined by analyzing the annual and seasonal means and the Coefficient of Variation (CV) of WED. The annual mean WED is expected to decrease relative to the historical period in both RCP 4.5 and RCP 8.5. However, the CV is expected to increase in RCP 8.5. WED in spring and summer is expected to increase in both RCP 4.5 and RCP 8.5. In particular, the change in the CV of WED is predicted to be larger in winter than in other seasons. The time series of WED for three major wind farms in Korea exhibit a decreasing trend over the future period (2021~2050): in Gochang for autumn, in Daegwanryeong for spring, and in Jeju for autumn. Analysis of the relationship between changes in wind energy and pressure gradients shows that changes in pressure gradients would affect changes in WED. Our results can be used as background data for planning the development and operation of wind farms over the Korean Peninsula.
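
The Coefficient of Variation used in this abstract is conventionally the standard deviation divided by the mean, so it measures variability relative to the average level. The sketch below assumes that definition with a population standard deviation (the abstract does not specify either), and the sample values are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumed definition)."""
    return statistics.pstdev(values) / statistics.fmean(values)

# Hypothetical wind energy density values for two periods, illustration only
historical = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
future = [1.0, 3.0, 4.0, 4.0, 5.0, 5.0, 7.0, 10.0]

cv_change = coefficient_of_variation(future) - coefficient_of_variation(historical)
```

A mean that falls while the CV rises, as the abstract reports for RCP 8.5, indicates a resource that is both weaker on average and less steady.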

Analysis and Application of Power Consumption Patterns for Changing the Power Consumption Behaviors (전력소비행위 변화를 위한 전력소비패턴 분석 및 적용)

  • Jang, MinSeok;Nam, KwangWoo;Lee, YonSik
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.25 no.4
    • /
    • pp.603-610
    • /
    • 2021
  • In this paper, we extract the user's power consumption patterns and model the optimal consumption patterns by applying the user's environment and emotion. Based on a comparative analysis of these two patterns, we present an efficient power consumption method through changes in the user's power consumption behavior. To extract significant consumption patterns, vector standardization and binary data transformation methods are used, and ensemble learning with k-means clustering is applied, with a support factor applied according to the value of k. The optimal power consumption pattern model is generated by applying forced and emotion-based control based on the learning results for ensemble aggregates with relatively low average consumption. Through experiments, we identify the correlation between the number of clusters and the consistency ratios, and validate that the method can be applied to a variety of windows by adjusting the number or size of clusters, enabling forced and emotion-based control according to the user's intentions.

Ensemble Machine Learning Model Based YouTube Spam Comment Detection (앙상블 머신러닝 모델 기반 유튜브 스팸 댓글 탐지)

  • Jeong, Min Chul;Lee, Jihyeon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.5
    • /
    • pp.576-583
    • /
    • 2020
  • This paper proposes a technique for identifying spam comments on YouTube, a platform that has recently seen tremendous growth. Because YouTube content can be monetized through advertising, spammers promote their own channels or videos in the comments of popular videos or leave comments unrelated to the video. YouTube operates its own spam blocking system, but it still fails to block such comments properly and efficiently. Therefore, we examined related studies on YouTube spam comment screening and conducted classification experiments with six machine learning techniques (decision tree, logistic regression, Bernoulli naive Bayes, random forest, support vector machine with linear kernel, and support vector machine with Gaussian kernel) and an ensemble model combining these techniques, using comment data from popular music videos by Psy, Katy Perry, LMFAO, Eminem, and Shakira.
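
The abstract does not say how the six classifiers are combined. One common option is hard (majority) voting, sketched below as an assumption rather than the authors' method; the function name and the sample predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists by taking the most common label per comment.

    `predictions` holds one list of labels per model, all of equal length.
    On a tie, Counter.most_common keeps the label that was encountered first.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted

# Hypothetical outputs of three models for three comments (1 = spam, 0 = not spam)
model_outputs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]
ensemble_labels = majority_vote(model_outputs)
```

With an even number of models, such as the six listed in the abstract, ties are possible, so a real ensemble would need an explicit tie-breaking rule or a soft vote over predicted probabilities.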