• Title/Summary/Keyword: ensemble prediction

Prediction of Residual Resistance Coefficient of Low-Speed Full Ships Using Hull Form Variables and Machine Learning Approaches (선형변수 기계학습 기법을 활용한 저속비대선의 잉여저항계수 추정)

  • Kim, Yoo-Chul;Yang, Kyung-Kyu;Kim, Myung-Soo;Lee, Young-Yeon;Kim, Kwang-Soo
    • Journal of the Society of Naval Architects of Korea / v.57 no.6 / pp.312-321 / 2020
  • In this study, machine learning techniques were applied to predict the residual resistance coefficient (Cr) of low-speed full ships. The methods used were ridge regression, support vector regression, random forest, a neural network, and an ensemble of these models. Nineteen hull form variables served as inputs. The hull form variables and Cr data were obtained from 139 hull forms in the KRISO database. Of the total data, 80% were used for training and the rest for validation. Some nonlinear models overfitted, and the ensemble model performed better than the others.
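
The 80/20 split and the ensemble step described in this abstract can be sketched as follows. The abstract does not say how the individual models were combined, so simple averaging is an assumption here, and the function names are hypothetical.

```python
import random
from statistics import mean

def train_validation_split(rows, train_fraction=0.8, seed=0):
    """Shuffle rows and split them into training and validation parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def ensemble_predict(models, x):
    """Average the predictions of several already-fitted models (assumed rule)."""
    return mean(model(x) for model in models)

def rmse(models, rows):
    """Root-mean-square error of the averaged ensemble on (x, y) rows."""
    return mean((ensemble_predict(models, x) - y) ** 2 for x, y in rows) ** 0.5
```

With 139 hull forms, this split yields 111 training cases and 28 validation cases. Each entry of `models` is any callable that maps an input to a predicted Cr.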

Relevancy contemplation in medical data analytics and ranking of feature selection algorithms

  • P. Antony Seba;J. V. Bibal Benifa
    • ETRI Journal / v.45 no.3 / pp.448-461 / 2023
  • This article performs a detailed scrutiny of a chronic kidney disease (CKD) dataset to select efficient instances and relevant features. Data relevancy is investigated using feature extraction, hybrid outlier detection, and handling of missing values. Data instances that do not influence the target are removed using data envelopment analysis to enable reduction of rows. Column reduction is achieved by ranking the attributes through feature selection methodologies, namely, extra-trees classifier, recursive feature elimination, chi-squared test, analysis of variance, and mutual information. These methodologies are ranked via Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) using weight optimization to identify the optimal features for model building from the CKD dataset to facilitate better prediction while diagnosing the severity of the disease. An efficient hybrid ensemble and novel similarity-based classifiers are built using the pruned dataset, and the results are thereafter compared with random forest, AdaBoost, naive Bayes, k-nearest neighbors, and support vector machines. The hybrid ensemble classifier yields a better prediction accuracy of 98.31% for the features selected by the extra-trees classifier (ETC), which is ranked as the best by TOPSIS.
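
A minimal sketch of the standard TOPSIS ranking step may help here. It uses vector normalization and fixed weights; the weight optimization mentioned in the abstract is not reproduced, and the criteria in the usage note are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return closeness scores (higher is better) for each alternative.

    matrix  : rows are alternatives, columns are criteria
    weights : one weight per criterion
    benefit : True where larger values are better, False where smaller are
    """
    n_cols = len(weights)
    # Vector-normalize each column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
         for row in matrix]
    # Ideal and anti-ideal points per criterion.
    best = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
            for j in range(n_cols)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(n_cols)]
    scores = []
    for r in v:
        d_best = math.dist(r, best)
        d_worst = math.dist(r, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores
```

For example, with one row per feature selection method and columns for accuracy (benefit) and runtime (cost), the method with the highest score ranks first.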

Improving streamflow prediction with assimilating the SMAP soil moisture data in WRF-Hydro

  • Kim, Yeri;Kim, Yeonjoo
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.205-205 / 2021
  • Surface soil moisture, which governs the partitioning of precipitation into infiltration and runoff, plays an important role in the hydrological cycle. The assimilation of satellite soil moisture retrievals into a land surface model or hydrological model has been shown to improve the predictive skill of hydrological variables. This study aims to improve streamflow prediction with the Weather Research and Forecasting model-Hydrological modeling system (WRF-Hydro) by assimilating Soil Moisture Active Passive (SMAP) data at 3 km, and to analyze the impact on hydrological components. We applied a Cumulative Distribution Function (CDF) technique to remove the bias of the SMAP data and assimilated the data (April to July, 2015-2019) into WRF-Hydro using an Ensemble Kalman Filter (EnKF) with a total of 12 ensemble members. Daily inflow and soil moisture estimates for major dams in South Korea (Soyanggang, Chungju, and Sumjin) were evaluated. We investigated whether hydrologic variables such as runoff, evaporation, and soil moisture were simulated better with data assimilation than without it. The results show that the correlation coefficient of topsoil moisture can be improved, but the change in dam inflow was not notable. This may be attributed to the fact that soil moisture memory and runoff memory operate on different time scales. These findings demonstrate that the assimilation of satellite soil moisture retrievals can improve the predictive skill of hydrological variables for a better understanding of the water cycle.
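
The bias-removal step can be illustrated with empirical CDF matching (quantile mapping). The abstract only names a "CDF technique", so treating it as CDF matching is an assumption, and the nearest-rank quantile rule below is one simple choice among several.

```python
from bisect import bisect_right

def cdf_match(value, satellite_sample, model_sample):
    """Map a satellite value onto the model's distribution by rank."""
    sat = sorted(satellite_sample)
    mod = sorted(model_sample)
    # Empirical non-exceedance probability of the value in the satellite record.
    p = bisect_right(sat, value) / len(sat)
    # Model value at the same probability (nearest-rank quantile).
    idx = min(len(mod) - 1, max(0, int(round(p * len(mod))) - 1))
    return mod[idx]
```

A satellite retrieval at the 75th percentile of its own record is replaced by the model value at the 75th percentile of the model record, which removes systematic differences between the two distributions before assimilation.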

Optimization of the Vertical Localization Scale for GPS-RO Data Assimilation within KIAPS-LETKF System (KIAPS 앙상블 자료동화 시스템을 이용한 GPS 차폐자료 연직 국지화 규모 최적화)

  • Jo, Youngsoon;Kang, Ji-Sun;Kwon, Hataek
    • Atmosphere / v.25 no.3 / pp.529-541 / 2015
  • The Korea Institute of Atmospheric Prediction Systems (KIAPS) has been developing a global numerical prediction model and data assimilation system. We have implemented the LETKF (Local Ensemble Transform Kalman Filter; Hunt et al., 2007) data assimilation system on NCAR CAM-SE (National Center for Atmospheric Research Community Atmosphere Model with Spectral Element dynamical core; Dennis et al., 2012), which uses a cubed-sphere grid, the same grid system as the KIAPS Integrated Model (KIM) now under development. In this study, we assimilated Global Positioning System Radio Occultation (GPS-RO) bending angle measurements in addition to conventional data within the ensemble-based data assimilation system. Before assimilating the bending angle data, we performed a vertical unit conversion: the vertical localization information for GPS-RO data is given in meters, whereas the vertical localization method in the LETKF system is based on pressure. After converting the vertical information, we conducted experiments to search for the best vertical localization scale for GPS-RO data under Observing System Simulation Experiments (OSSEs). As a result, we found the optimal vertical localization setting for GPS-RO bending angle data assimilation. We plan to apply the selected localization strategy to the LETKF system implemented on KIM, which is expected to give a better analysis from GPS-RO data assimilation because of its much higher model top.
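
The unit mismatch described above (heights in meters versus a pressure-based localization) can be illustrated with a rough sketch. The abstract does not give the conversion actually used, so the isothermal-atmosphere formula, the constants, and the Gaussian weight in ln(pressure) below are all assumptions for illustration only.

```python
import math

SCALE_HEIGHT_M = 7000.0         # assumed isothermal scale height
SURFACE_PRESSURE_HPA = 1000.0   # assumed reference surface pressure

def height_to_pressure(z_m):
    """Approximate pressure (hPa) at height z_m for an isothermal atmosphere."""
    return SURFACE_PRESSURE_HPA * math.exp(-z_m / SCALE_HEIGHT_M)

def vertical_localization_weight(p_obs_hpa, p_grid_hpa, scale_ln_p):
    """Gaussian weight based on the distance in ln(pressure)."""
    d = math.log(p_obs_hpa) - math.log(p_grid_hpa)
    return math.exp(-0.5 * (d / scale_ln_p) ** 2)
```

An observation height is first converted to pressure, and its influence on a grid level then decays with the ln(pressure) distance. The localization scale `scale_ln_p` is the kind of parameter such experiments would tune.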

Inter-comparison of Prediction Skills of Multiple Linear Regression Methods Using Monthly Temperature Simulated by Multi-Regional Climate Models (다중 지역기후모델로부터 모의된 월 기온자료를 이용한 다중선형회귀모형들의 예측성능 비교)

  • Seong, Min-Gyu;Kim, Chansoo;Suh, Myoung-Seok
    • Atmosphere / v.25 no.4 / pp.669-683 / 2015
  • In this study, we investigated the prediction skills of four multiple linear regression methods for monthly air temperature over South Korea. We used simulation results from four regional climate models (RegCM4, SNURCM, WRF, and YSURSM) driven by two boundary conditions (NCEP/DOE Reanalysis 2 and ERA-Interim). We selected 15 years (1989-2003) as the training period and the last 5 years (2004-2008) as the validation period. The four regression methods used in this study are: 1) homogeneous multiple linear regression (HMR); 2) homogeneous multiple linear regression with the regression coefficients constrained to be nonnegative (HMR+); 3) non-homogeneous multiple linear regression (EMOS; Ensemble Model Output Statistics); and 4) EMOS with positive coefficients (EMOS+), which is the same as the third method except that the coefficients are constrained to be nonnegative. The four regression methods showed similar prediction skills for monthly air temperature over South Korea. However, the prediction skills of the methods that do not constrain the regression coefficients to be nonnegative are clearly affected by outliers. Among the four methods, HMR+ and EMOS+ showed the best skill during the validation period, with very similar performance in terms of MAE and RMSE. Therefore, we recommend HMR+ as the best method because of its ease of development and application.
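
The nonnegativity constraint in HMR+ can be sketched as a constrained least-squares fit. The abstract does not state which solver was used; projected gradient descent is swapped in here as a simple stand-in, and the step size assumes small, roughly unit-scale predictors (for example, temperature anomalies).

```python
def fit_nonnegative(X, y, steps=5000, lr=0.01):
    """Least-squares weights constrained to be >= 0 (projected gradient descent).

    X : list of rows, one column per regional climate model
    y : list of observed values
    """
    n, k = len(X), len(X[0])
    w = [0.0] * k
    for _ in range(steps):
        residual = [sum(w[j] * X[i][j] for j in range(k)) - y[i]
                    for i in range(n)]
        grad = [2.0 / n * sum(residual[i] * X[i][j] for i in range(n))
                for j in range(k)]
        # Gradient step, then project onto the nonnegative orthant.
        w = [max(0.0, w[j] - lr * grad[j]) for j in range(k)]
    return w
```

When the unconstrained optimum already has nonnegative weights, the result matches ordinary least squares. When it does not, the offending weight is clamped to zero and the remaining weights are refitted around it.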

Reliability Assessment of Temperature and Precipitation Seasonal Probability in Current Climate Prediction Systems (현 기후예측시스템에서의 기온과 강수 계절 확률 예측 신뢰도 평가)

  • Hyun, Yu-Kyung;Park, Jinkyung;Lee, Johan;Lim, Somin;Heo, Sol-Ip;Ham, Hyunjun;Lee, Sang-Min;Ji, Hee-Sook;Kim, Yoonjae
    • Atmosphere / v.30 no.2 / pp.141-154 / 2020
  • Demand for seasonal forecasts is growing, as they provide valuable information for decision making and have the potential to reduce the impact of weather events. This study examines how reliable operational climate prediction systems are in producing probability forecasts on the seasonal scale. A reliability diagram, a tool that assesses reliability by comparing forecast probabilities with the corresponding observed frequencies, was used. A method that grades reliability on a scale of 1 to 5 based on the reliability diagram is proposed to quantify it. Probabilities are derived from ensemble members using hindcast data. The analysis focuses on skill for 2 m temperature and precipitation from the climate prediction systems of KMA, UKMO, ECMWF, NCEP, and JMA. Five categories are found, depending on variables, seasons, and regions. The probability forecast for 2 m temperature can be relied on, while that for precipitation is reliable in only a few regions. The probabilistic skill of KMA and UKMO is comparable with that of ECMWF, and reliability tends to increase as the ensemble size and hindcast period increase.
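
The two ingredients named in this abstract, probabilities derived from ensemble members and a reliability diagram, can be sketched as below. The number of bins and the function names are assumptions, and the 1-to-5 grading scheme of the paper is not reproduced.

```python
def ensemble_probability(members, threshold):
    """Fraction of ensemble members exceeding the threshold."""
    return sum(m > threshold for m in members) / len(members)

def reliability_points(probabilities, outcomes, n_bins=5):
    """Return (mean forecast probability, observed frequency, count) per bin.

    probabilities : forecast probabilities in [0, 1]
    outcomes      : 1 if the event occurred, 0 otherwise
    """
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probabilities, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, o))
    points = []
    for cases in bins:
        if cases:  # empty bins are skipped
            mean_p = sum(p for p, _ in cases) / len(cases)
            freq = sum(o for _, o in cases) / len(cases)
            points.append((mean_p, freq, len(cases)))
    return points
```

A perfectly reliable system produces points on the diagonal, where the observed frequency equals the mean forecast probability in every bin.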

Improvement of Soil Moisture Initialization for a Global Seasonal Forecast System (전지구 계절 예측 시스템의 토양수분 초기화 방법 개선)

  • Seo, Eunkyo;Lee, Myong-In;Jeong, Jee-Hoon;Kang, Hyun-Suk;Won, Duk-Jin
    • Atmosphere / v.26 no.1 / pp.35-45 / 2016
  • Initialization of a global seasonal forecast system is as important as the quality of the embedded climate model for climate prediction on the sub-seasonal time scale. Recent studies have emphasized the important role of soil moisture initialization, suggesting a significant increase in prediction skill, particularly over mid-latitude land areas where the influence of tropical sea surface temperature is less crucial and the potential predictability is supplemented by land-atmosphere interaction. This study developed a new soil moisture initialization method applicable to the KMA operational seasonal forecasting system. The method first performs a long-term integration of the offline land surface model driven by observed atmospheric forcing and precipitation. This soil moisture reanalysis is then used as the initial state of the ensemble seasonal forecasts through a simple anomaly initialization technique, to avoid the simulation drift caused by systematic model bias. To evaluate the impact of the soil moisture initialization, two sets of long-term, 10-member ensemble experiments were conducted for 1996-2009. The soil moisture initialization significantly improves the prediction skill of surface air temperature at the zero to one month forecast lead (up to ~60 days forecast lead), although the skill increase in precipitation is less significant. This study suggests that improving prediction on the sub-seasonal time scale requires better initial data as well as adequate treatment of systematic model bias.
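
The idea behind anomaly initialization can be written in one line: the reanalysis anomaly is added to the forecast model's own climatology, so the initial state carries the observed signal without the reanalysis mean bias. The abstract does not give the exact form (for example, whether anomalies are rescaled), so the additive version below is an assumption.

```python
def anomaly_initialize(reanalysis_value, reanalysis_climatology, model_climatology):
    """Add the reanalysis anomaly to the model's own climatological mean."""
    anomaly = reanalysis_value - reanalysis_climatology
    return model_climatology + anomaly
```

For instance, a reanalysis soil moisture of 0.30 against a reanalysis climatology of 0.25 is a +0.05 anomaly. With a model climatology of 0.35, the initial value becomes about 0.40 rather than 0.30, which keeps the model from drifting back toward its own mean state.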

Validations of Typhoon Intensity Guidance Models in the Western North Pacific (북서태평양 태풍 강도 가이던스 모델 성능평가)

  • Oh, You-Jung;Moon, Il-Ju;Kim, Sung-Hun;Lee, Woojeong;Kang, KiRyong
    • Atmosphere / v.26 no.1 / pp.1-18 / 2016
  • Eleven tropical cyclone (TC) intensity guidance models for the western North Pacific were validated over 2008-2014 using various analyses by forecast lead time, year, month, intensity, rapid intensity change, track, and geographical area, with an additional focus on TCs that influenced the Korean Peninsula. From the evaluation using mean absolute error and correlation coefficients for maximum wind speed forecasts up to 72 h, we found that the Hurricane Weather Research and Forecasting model (HWRF) outperforms all others overall, although the Global Forecast System (GFS), the Typhoon Ensemble Prediction System of the Japan Meteorological Agency (TEPS), and the Korean version of the Weather Research and Forecasting model (KWRF) also show good performance at some forecast lead times. In particular, HWRF shows the highest performance in predicting the intensity of strong TCs above Category 3, which may be attributed to its having the highest spatial resolution (~3 km). The Navy Operational Global Atmospheric Prediction System (NOGAPS) and GFS were the most improved models during 2008-2014. For initial intensity error, two Japanese models, the Japan Meteorological Agency Global Spectral Model (JGSM) and TEPS, had the smallest error. In track forecasts, the European Centre for Medium-Range Weather Forecasts (ECMWF) model and the recent GFS model outperformed the others. The present results have significant implications for providing basic information to operational forecasters as well as for developing ensemble or consensus prediction systems.
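
The two verification scores named in this abstract, mean absolute error and the correlation coefficient, can be computed as in this minimal sketch. The Pearson form of the correlation is assumed, and the wind speeds in the usage note are made up.

```python
import math

def mean_absolute_error(forecast, observed):
    """Average magnitude of the forecast errors."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)

def correlation(forecast, observed):
    """Pearson correlation coefficient between forecasts and observations."""
    n = len(forecast)
    mf, mo = sum(forecast) / n, sum(observed) / n
    cov = sum((f - mf) * (o - mo) for f, o in zip(forecast, observed))
    var_f = sum((f - mf) ** 2 for f in forecast)
    var_o = sum((o - mo) ** 2 for o in observed)
    return cov / math.sqrt(var_f * var_o)
```

Forecast maximum winds of 30, 40, and 50 m/s against observations of 35, 45, and 55 m/s give an MAE of 5 m/s but a correlation of 1. This is why the two scores are reported together: one measures error size, the other measures how well the variation is captured.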

Implementation of Spatial Downscaling Method Based on Gradient and Inverse Distance Squared (GIDS) for High-Resolution Numerical Weather Prediction Data (고해상도 수치예측자료 생산을 위한 경도-역거리 제곱법(GIDS) 기반의 공간 규모 상세화 기법 활용)

  • Yang, Ah-Ryeon;Oh, Su-Bin;Kim, Joowan;Lee, Seung-Woo;Kim, Chun-Ji;Park, Soohyun
    • Atmosphere / v.31 no.2 / pp.185-198 / 2021
  • In this study, we examined a spatial downscaling method based on Gradient and Inverse Distance Squared (GIDS) weighting to produce high-resolution grid data from a numerical weather prediction model over the Korean Peninsula, which has complex terrain. GIDS is a simple and effective geostatistical downscaling method that uses horizontal distance gradients and elevation. The predicted meteorological variables (e.g., temperature and 3-hr accumulated rainfall) from the Limited-area ENsemble prediction System (LENS; horizontal grid spacing of 3 km) are used by GIDS to produce a data set with a higher horizontal resolution (1.5 km). The results were compared with those from bilinear interpolation. GIDS effectively produced high-resolution gridded temperature data with a continuous spatial distribution and a high dependence on topography. Agreement with observations improved when the search radius was increased from 10 to 30 km. However, GIDS performed relatively poorly for precipitation. Although GIDS is efficient at producing higher-resolution gridded temperature data, further study is required before it can be applied to rainfall events.
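
The GIDS estimate can be sketched as follows, using the commonly cited form in which each neighbor's value is adjusted by x, y, and elevation gradients and then averaged with inverse-distance-squared weights. The gradients are assumed to have been fitted beforehand by regression over the neighbors within the search radius, and the handling of a coincident point is a choice made for this sketch.

```python
def gids_estimate(target, neighbors, cx, cy, ce):
    """GIDS estimate at target = (x, y, elevation).

    neighbors  : list of (x, y, elevation, value) within the search radius
    cx, cy, ce : gradients of the value with respect to x, y, and elevation
    """
    tx, ty, te = target
    num = den = 0.0
    for x, y, e, value in neighbors:
        d2 = (tx - x) ** 2 + (ty - y) ** 2
        if d2 == 0.0:
            # Target coincides with a source point: apply the elevation term only.
            return value + (te - e) * ce
        adjusted = value + (tx - x) * cx + (ty - y) * cy + (te - e) * ce
        num += adjusted / d2
        den += 1.0 / d2
    return num / den
```

With an elevation gradient of -0.0065 per meter and a target 100 m above two equidistant 20.0 degree neighbors, the estimate is 19.35 degrees, whereas a purely horizontal interpolation would return 20.0. This elevation term is what gives the downscaled temperature its dependence on topography.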

A Modeling of Realtime Fuel Consumption Prediction Using OBDII Data (OBDII 데이터 기반의 실시간 연료 소비량 예측 모델 연구)

  • Yang, Hee-Eun;Kim, Do-Hyun;Choe, Hoseop
    • KIPS Transactions on Software and Data Engineering / v.10 no.2 / pp.57-64 / 2021
  • This study presents a method for real-time fuel consumption prediction using real data collected from OBDII. With the advent of the era of self-driving cars, electronic control units (ECUs) are becoming more complex, and various studies are attempting to extract and analyze more accurate data from vehicles. However, as ECUs become more complex, it is becoming harder to obtain data from them. To solve this problem, firmware for acquiring accurate vehicle data was developed in this study, and it extracted 53,580 actual driving data sets from vehicles from January to February 2019. Using these data, an ensemble stacking technique was applied to increase the accuracy of the real-time fuel consumption prediction model. Ridge, Lasso, XGBoost, and LightGBM were used as base models and Ridge as the meta-model, and the prediction performance was an MAE of 0.011 and an RMSE of 0.017.
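
The structure of stacking can be sketched with the standard library alone. The two base models below (a fitted line and a constant mean) are deliberately simple stand-ins for the Ridge, Lasso, XGBoost, and LightGBM models of the paper. The fold scheme, the ridge strength, and all function names are assumptions; only the ridge meta-model fitted on out-of-fold predictions mirrors the described design.

```python
def fit_line(xs, ys):
    """Base model 1: least-squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def fit_mean(xs, ys):
    """Base model 2: always predicts the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_ridge_meta(features, ys, alpha=1e-3):
    """Meta-model: two-feature ridge regression without intercept (closed form)."""
    a11 = sum(f[0] * f[0] for f in features) + alpha
    a12 = sum(f[0] * f[1] for f in features)
    a22 = sum(f[1] * f[1] for f in features) + alpha
    b1 = sum(f[0] * y for f, y in zip(features, ys))
    b2 = sum(f[1] * y for f, y in zip(features, ys))
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return lambda f: w1 * f[0] + w2 * f[1]

def fit_stack(xs, ys, n_folds=3):
    """Fit base models out-of-fold, then a ridge meta-model on their predictions."""
    base_fitters = [fit_line, fit_mean]
    oof = [None] * len(xs)
    for k in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != k]
        models = [fit([xs[i] for i in train], [ys[i] for i in train])
                  for fit in base_fitters]
        # Predict only the held-out samples of this fold.
        for i in range(k, len(xs), n_folds):
            oof[i] = [m(xs[i]) for m in models]
    meta = fit_ridge_meta(oof, ys)
    # Refit the base models on all data for use at prediction time.
    final = [fit(xs, ys) for fit in base_fitters]
    return lambda x: meta([m(x) for m in final])
```

Training the meta-model on out-of-fold predictions, rather than on predictions for data the base models have already seen, keeps it from simply rewarding whichever base model overfits most.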