• Title/Summary/Keyword: Model Ensemble

Improved Estimation of Hourly Surface Ozone Concentrations using Stacking Ensemble-based Spatial Interpolation (스태킹 앙상블 모델을 이용한 시간별 지상 오존 공간내삽 정확도 향상)

  • KIM, Ye-Jin;KANG, Eun-Jin;CHO, Dong-Jin;LEE, Si-Woo;IM, Jung-Ho
    • Journal of the Korean Association of Geographic Information Studies / v.25 no.3 / pp.74-99 / 2022
  • Surface ozone is produced by photochemical reactions of nitrogen oxides (NOx) and volatile organic compounds (VOCs) emitted from vehicles and industrial sites, and it adversely affects vegetation and human health. In South Korea, ozone is monitored in real time at stations (i.e., point measurements), which makes it difficult to monitor and analyze its continuous spatial distribution. In this study, hourly surface ozone concentrations were interpolated to a spatial resolution of 1.5 km using a stacking ensemble technique and evaluated with 5-fold cross-validation. The base models for the stacking ensemble were cokriging, multi-linear regression (MLR), random forest (RF), and support vector regression (SVR); MLR was used as the meta model, taking all base model results as additional input variables. The stacking ensemble model performed better than the individual base models, with an average R of 0.76 and RMSE of 0.0065 ppm over the 2020 study period. Compared with the base models, the surface ozone distribution generated by the stacking ensemble model had a wider range and a spatial pattern similar to that of the terrain and urbanization variables. The proposed model is expected not only to produce the hourly spatial distribution of ozone but also to be highly applicable to calculating daily maximum 8-hour ozone concentrations.
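
The daily maximum 8-hour concentration mentioned at the end of this abstract can be illustrated with a minimal sketch. It assumes the simplest definition (the largest running 8-hour mean within a single day of complete hourly values); regulatory definitions differ in how they treat windows that cross midnight and missing hours. The function name `daily_max_8h` and the sample values are hypothetical and do not come from the paper.

```python
def daily_max_8h(hourly, window=8):
    """Return the largest running `window`-hour mean of one day's hourly values.

    Assumption: `hourly` holds complete, gap-free hourly values for one day.
    """
    if len(hourly) < window:
        raise ValueError("need at least `window` hourly values")
    running_means = [
        sum(hourly[start:start + window]) / window
        for start in range(len(hourly) - window + 1)
    ]
    return max(running_means)


# Hypothetical hourly ozone values (ppm) at one grid cell for one day:
# a low background with an 8-hour afternoon plateau.
hourly_ppm = [0.010] * 10 + [0.050] * 8 + [0.010] * 6
print(round(daily_max_8h(hourly_ppm), 4))
```

Applied to every grid cell of an hourly interpolated field, the same function would give a daily map of this metric.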

A Comparison Study of Ensemble Approach Using WRF/CMAQ Model - The High PM10 Episode in Busan (앙상블 방법에 따른 WRF/CMAQ 수치 모의 결과 비교 연구 - 2013년 부산지역 고농도 PM10 사례)

  • Kim, Taehee;Kim, Yoo-Keun;Shon, Zang-Ho;Jeong, Ju-Hee
    • Journal of Korean Society for Atmospheric Environment / v.32 no.5 / pp.513-525 / 2016
  • To propose an effective ensemble method for predicting $PM_{10}$ concentration, six experiments were designed using different ensemble averaging methods (non-weighted, single weighted, and cluster weighted). The single weighted method calculated the weights using both multiple regression analysis and singular value decomposition, while the cluster weighted method estimated the weights based on temperature, relative humidity, and wind component using multiple regression analysis. The weighted averages performed significantly better than the non-weighted average. The results of the weighted ensemble experiments differed according to the method used to calculate the weights. The single weighted average method using multiple regression analysis showed the highest accuracy for hourly $PM_{10}$ concentration, and the cluster weighted average method based on relative humidity showed the highest accuracy for daily mean $PM_{10}$ concentration. However, ensemble spread analysis showed better reliability for the single weighted average method than for the cluster weighted average method based on relative humidity. Thus, the single weighted average method was the most effective method for this case study.
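
The difference between a non-weighted and a regression-weighted ensemble average can be sketched as follows. This is only an illustration under stated assumptions: weights are fitted by ordinary least squares without an intercept, and the helper names `fit_weights` and `combine` and the sample series are hypothetical. The paper's exact regression setup, its singular value decomposition variant, and its clustering step are not reproduced here.

```python
def fit_weights(members, observed):
    """Least-squares weights (no intercept) for combining ensemble members.

    members:  list of k forecast series, each of length n
    observed: list of n observations
    """
    k = len(members)
    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    aug = [
        [sum(x * z for x, z in zip(members[i], members[j])) for j in range(k)]
        + [sum(x * y for x, y in zip(members[i], observed))]
        for i in range(k)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("ensemble members are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    weights = [0.0] * k
    for i in range(k - 1, -1, -1):
        known = sum(aug[i][j] * weights[j] for j in range(i + 1, k))
        weights[i] = (aug[i][k] - known) / aug[i][i]
    return weights


def combine(members, weights):
    """Weighted sum of the member series at each time step."""
    return [
        sum(w * m[t] for w, m in zip(weights, members))
        for t in range(len(members[0]))
    ]


# Hypothetical two-member ensemble and matching observations.
members = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]]
observed = [1.3, 1.7, 3.3, 3.7]

equal = combine(members, [1.0 / len(members)] * len(members))  # non-weighted
fitted = combine(members, fit_weights(members, observed))      # weighted
```

In practice the weights would be fitted on a training period and applied to a separate forecast period, since weights fitted and scored on the same data overstate the benefit.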

On successive machine learning process for predicting strength and displacement of rectangular reinforced concrete columns subjected to cyclic loading

  • Bu-seog Ju;Shinyoung Kwag;Sangwoo Lee
    • Computers and Concrete / v.32 no.5 / pp.513-525 / 2023
  • Recently, research on predicting the behavior of reinforced concrete (RC) columns using machine learning methods has been actively conducted. However, most studies have focused on predicting the ultimate strength of RC columns using a regression algorithm. Therefore, this study develops a successive machine learning process for predicting multiple nonlinear behaviors of rectangular RC columns. This process consists of three stages: single machine learning, bagging ensemble, and stacking ensemble. In the case of strength prediction, sufficient prediction accuracy is confirmed even in the first stage. In the case of displacement, although sufficient accuracy is not achieved in the first and second stages, the stacking ensemble model in the third stage performs better than the machine learning models in the first and second stages. In addition, the performance of the final prediction models is verified by comparing the backbone curves and hysteresis loops obtained from predicted outputs with actual experimental data.

A Study on the Improvement of Submarine Detection Based on Mast Images Using An Ensemble Model of Convolutional Neural Networks (컨볼루션 신경망의 앙상블 모델을 활용한 마스트 영상 기반 잠수함 탐지율 향상에 관한 연구)

  • Jeong, Miae;Ma, Jungmok
    • Journal of the Korea Institute of Military Science and Technology / v.23 no.2 / pp.115-124 / 2020
  • Due to the increasing threat of submarines from North Korea and other countries, the ROK Navy should improve its submarine detection capability. There are two ways to detect submarines: acoustic detection and non-acoustic detection. Since acoustic detection has limitations despite its usefulness, it needs a complementary method. Non-acoustic detection uses non-acoustic sensors to detect submarines operating mast sets such as periscopes and snorkels. This paper therefore proposes a new non-acoustic submarine detection model that uses an ensemble of Convolutional Neural Network models to automate non-acoustic detection. The proposed model is trained to classify targets into 4 classes: submarines, flag buoys, lighted buoys, and small boats. Based on a numerical study with 10,287 images, we confirm that the proposed model can achieve 91.5% test accuracy for the non-acoustic detection of submarines.

Reducing Uncertainties in Climate Change Assessment (기후변화 영향평가의 불확실성 저감연구)

  • Lee, Jae-Kyoung;Kim, Young-Oh
    • Proceedings of the Korea Water Resources Association Conference / 2008.05a / pp.345-351 / 2008
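  • In assessing the impacts of future climate change, the General Circulation Model (GCM) is one of the most important sources of data. That is, information on future water resources can be obtained from GCM simulations based on greenhouse gas emission scenarios. However, future water resources can vary greatly depending on the emission scenario, the downscaling technique, the rainfall-runoff model, and the choice of GCM, and therefore carry very large uncertainty. As one way to reduce this uncertainty, multi-model ensemble techniques, which assign weights to GCMs according to their simulation skill and combine them, are being actively studied, mainly in developed countries. This study first surveyed the GCMs available in Korea for climate change impact assessment and selected CCSM3, CSIRO, ECHAM4, GFDL, and MIROC among them. For the Chungju Dam basin of the Han River, GCM climate information for the past (1980-1999) and future (2030-2049) periods was downscaled using simple linear interpolation. Next, multi-model ensemble techniques were surveyed. The Reliability Ensemble Average (REA) technique proposed by Giorgi et al. (2002) was applied to weight the GCM simulations downscaled by linear interpolation and thereby reduce uncertainty. In particular, a Modified-REA is proposed that improves on REA by considering not only model bias but also variance among the terms that make up REA. The GCM simulation results combined with the proposed method reduced the uncertainty of the climate information more than the results of the original REA.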

Water Quality Forecasting of the River Applying Ensemble Streamflow Prediction (앙상블 유출 예측기법을 적용한 하천 수질 예측)

  • Ahn, Jung Min;Ryoo, Kyong Sik;Lyu, Siwan;Lee, Sang Jin
    • Journal of Korean Society on Water Environment / v.28 no.3 / pp.359-366 / 2012
  • Accurate prediction of river water quality is very important for identifying in-stream flow and water supply requirements and for solving related environmental problems. In this study, the effect of water release from an upstream dam on downstream water quality was investigated by applying a hydrological model combined with QUAL2E to the Geum River basin. The ESP (Ensemble Streamflow Prediction) method, which has been validated and verified by many researchers, was used to predict reservoir and tributary inflow. The input parameters of the combined model for predicting both hydrological characteristics and water quality were identified and optimized. To verify model performance, the simulated results at Gongju station, located downstream of Daecheong Dam, were compared with data measured in 2008. The proposed model was found to simulate BOD, T-N, and T-P with acceptable reliability.

Development of ensemble machine learning model considering the characteristics of input variables and the interpretation of model performance using explainable artificial intelligence (수질자료의 특성을 고려한 앙상블 머신러닝 모형 구축 및 설명가능한 인공지능을 이용한 모형결과 해석에 대한 연구)

  • Park, Jungsu
    • Journal of Korean Society of Water and Wastewater / v.36 no.4 / pp.239-248 / 2022
  • Predicting algal blooms is an important part of algal bloom management, and chlorophyll-a concentration (Chl-a) is commonly used to represent algal bloom status. In recent years, advanced machine learning algorithms have been increasingly used to predict algal blooms. In this study, XGBoost (XGB), an ensemble machine learning algorithm, was used to develop a model to predict Chl-a in a reservoir. Daily observations of water quality and climate data were used to train and test the model. In the first step of the study, the input variables were clustered into two groups (low and high value groups) based on the observed values of water temperature (TEMP), total organic carbon concentration (TOC), total nitrogen concentration (TN), and total phosphorus concentration (TP). For each of the four water quality items, two XGB models were developed using only the data in each clustered group (Model 1). The results were compared with the predictions of an XGB model developed using the entire data set before clustering (Model 2). Model performance was evaluated using three indices, including the root mean squared error-observation standard deviation ratio (RSR). Model 1 improved performance for TEMP, TN, and TP, with RSR values of 0.503, 0.477, and 0.493, respectively, while the RSR of Model 2 was 0.521. On the other hand, Model 2 performed better than Model 1 for TOC, where the RSR was 0.532. Explainable artificial intelligence (XAI) is an ongoing field of research in machine learning. Shapley value analysis, a novel XAI algorithm, was also used for the quantitative interpretation of the performance of the XGB model developed in this study.
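
The RSR index named in this abstract is the root mean squared error divided by the standard deviation of the observations, so 0 indicates a perfect fit and 1 indicates a model no better than predicting the observed mean. The sketch below assumes the common form in which both terms use the same sample size, so the ratio reduces to a ratio of square-rooted sums. The function name `rsr` and the sample values are hypothetical.

```python
import math


def rsr(observed, simulated):
    """RMSE-observation standard deviation ratio (lower is better)."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    squared_deviation = sum((o - mean_obs) ** 2 for o in observed)
    if squared_deviation == 0:
        raise ValueError("observations have zero variance")
    return math.sqrt(squared_error) / math.sqrt(squared_deviation)


# Hypothetical Chl-a observations and model predictions.
obs = [1.0, 2.0, 3.0, 4.0]
print(rsr(obs, obs))          # perfect prediction
print(rsr(obs, [2.5] * 4))    # always predicting the observed mean
```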

Chest CT Image Patch-Based CNN Classification and Visualization for Predicting Recurrence of Non-Small Cell Lung Cancer Patients (비소세포폐암 환자의 재발 예측을 위한 흉부 CT 영상 패치 기반 CNN 분류 및 시각화)

  • Ma, Serie;Ahn, Gahee;Hong, Helen
    • Journal of the Korea Computer Graphics Society / v.28 no.1 / pp.1-9 / 2022
  • Non-small cell lung cancer (NSCLC) accounts for a high proportion (85%) of all lung cancers and has a significantly higher mortality rate (22.7%) than other cancers. Therefore, it is very important to predict the postoperative prognosis of NSCLC patients. In this study, preoperative chest CT image patches of NSCLC patients, with the tumor as the region of interest, are diversified into five types according to tumor-related information. Using pre-trained ResNet and EfficientNet CNNs, the performance of a single classifier model, an ensemble classifier model with soft voting, and an ensemble classifier model that places three different patches in the 3 input channels is analyzed through misclassification cases and Grad-CAM visualization. In the experiments, the ResNet152 and EfficientNet-b7 single models trained on the peritumoral patch showed accuracies of 87.93% and 81.03%, respectively. In addition, the ResNet152 ensemble model with the image, peritumoral, and shape-focused intratumoral patches placed in the input channels showed stable performance with an accuracy of 87.93%. The EfficientNet-b7 ensemble classifier model with soft voting on the image and peritumoral patches showed an accuracy of 84.48%.
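
Soft voting, as used by one of the ensemble classifiers in this abstract, averages the class probabilities produced by each model and picks the class with the highest mean. A minimal sketch follows; it assumes equal model weights and two hypothetical classes, and the function name `soft_vote` and the probability values are illustrative only, not taken from the paper.

```python
def soft_vote(prob_sets):
    """Average per-class probabilities across models and pick the best class.

    prob_sets: one probability list per model, all with the same class order.
    Returns (predicted_class_index, mean_probabilities).
    """
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all models must score the same classes")
    mean_probs = [
        sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=mean_probs.__getitem__)
    return predicted, mean_probs


# Hypothetical outputs for one patient, classes = [no recurrence, recurrence]:
# one model trained on the image patch, one on the peritumoral patch.
image_model = [0.6, 0.4]
peritumoral_model = [0.3, 0.7]
label, probs = soft_vote([image_model, peritumoral_model])
```

Here the two models disagree when taken individually, and the averaged probabilities decide the final label. Hard (majority) voting would instead need a tie-break rule with two models.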

An Ensemble Cascading Extremely Randomized Trees Framework for Short-Term Traffic Flow Prediction

  • Zhang, Fan;Bai, Jing;Li, Xiaoyu;Pei, Changxing;Havyarimana, Vincent
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.1975-1988 / 2019
  • Short-term traffic flow prediction plays an important role in intelligent transportation systems (ITS) in areas such as transportation management, traffic control, and guidance. For short-term traffic flow regression, the main challenge stems from the non-stationary nature of traffic flow data. In this paper, we design an ensemble cascading prediction framework, called EET, based on extremely randomized trees (extra-trees) and a boosting technique, to predict short-term traffic flow in non-stationary environments. Extra-trees is a tree-based ensemble method that strongly randomizes both the attribute and cut-point choices when splitting a tree node. This mechanism reduces the variance of the model and is therefore more suitable for traffic flow regression in non-stationary environments. Moreover, the extra-trees algorithm uses boosting ensemble averaging to improve predictive accuracy and control overfitting. To the best of our knowledge, this is the first time that extra-trees have been used as fundamental building blocks in boosting committee machines. The proposed approach predicts traffic flow 5 min in advance from real-time traffic flow data while inherently considering temporal and spatial correlations. Experiments demonstrate that the proposed method achieves higher accuracy, lower variance, and lower computational complexity than existing methods.

Comparison of AT1- and Kalman Filter-Based Ensemble Time Scale Algorithms

  • Lee, Ho Seong;Kwon, Taeg Yong;Lee, Young Kyu;Yang, Sung-hoon;Yu, Dai-Hyuk;Park, Sang Eon;Heo, Myoung-Sun
    • Journal of Positioning, Navigation, and Timing / v.10 no.3 / pp.197-206 / 2021
  • We compared two typical ensemble time scale algorithms: AT1 and the Kalman filter. Four commercial atomic clocks (two hydrogen masers and two cesium atomic clocks) provided measurement data to the algorithms. The allocation of relative weights to the clocks is important for generating a stable ensemble time. A 30-day average weight model, obtained from the average Allan variance of each clock, was applied to the AT1 algorithm. For the reduced Kalman filter (Kred) algorithm, we gave the same weights to the two hydrogen masers. We also compared the frequency stabilities of the algorithm outputs with and without correction of the estimated frequency offsets and/or frequency drift offsets by the KRISS-made primary frequency standard, KRISS-F1. We found that the Kred algorithm is more effective at generating a stable ensemble time scale in the long term, and that it also yields much better short-term stability when the frequency offset, instead of the phase offset, is used to calculate the Allan deviation.
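
How clock weights might be derived from Allan variance can be sketched as below. This is a simplified illustration and not the AT1 weighting procedure itself: it computes the non-overlapping Allan variance of fractional-frequency samples at the basic sampling interval, then assumes weights inversely proportional to that variance and normalized to sum to 1. The function names and sample values are hypothetical, and the inverse-variance rule is an assumption that the abstract does not state.

```python
def allan_variance(freq):
    """Non-overlapping Allan variance at the basic sampling interval.

    freq: equally spaced fractional-frequency samples of one clock.
    """
    if len(freq) < 2:
        raise ValueError("need at least 2 frequency samples")
    diffs = [later - earlier for earlier, later in zip(freq, freq[1:])]
    return sum(d * d for d in diffs) / (2 * len(diffs))


def inverse_variance_weights(variances):
    """Normalized weights, assuming weight is proportional to 1 / variance."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]


# Hypothetical frequency samples (arbitrary units) for two clocks:
# the second clock is noisier and should receive less weight.
quiet_clock = [1.0, 2.0, 1.0, 2.0]
noisy_clock = [1.0, 3.0, 1.0, 3.0]
weights = inverse_variance_weights(
    [allan_variance(quiet_clock), allan_variance(noisy_clock)]
)
```

With these samples the quiet clock's Allan variance is a quarter of the noisy clock's, so it receives four times the weight.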