• Title/Summary/Keyword: RMSE(Root Mean Squared Error)

Hourly Steel Industry Energy Consumption Prediction Using Machine Learning Algorithms

  • Sathishkumar, VE;Lee, Myeong-Bae;Lim, Jong-Hyun;Shin, Chang-Sun;Park, Chang-Woo;Cho, Yong Yun
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.585-588 / 2019
  • Predicting industrial energy consumption plays an important role in energy management and control systems, because energy demand and supply change dynamically and seasonally. This paper presents and discusses predictive models for the energy consumption of the steel industry. The data used include lagging and leading current reactive power, lagging and leading current power factor, carbon dioxide (tCO2) emission and load type. Four statistical models are trained and then evaluated on the test set: (a) linear regression (LR), (b) support vector machine with radial kernel (SVM RBF), (c) gradient boosting machine (GBM), and (d) random forest (RF). Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) are used to measure the predictive performance of the regression models. When using all the predictors, the best model, RF, achieves an RMSE of 7.33 on the test set.
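
As an illustration of the three error measures named in this abstract, here is a minimal Python sketch using their standard textbook definitions. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical values for illustration only (not data from the paper).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large residuals more heavily than MAE, which is one reason abstracts like this one tend to report both.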

Development of a Transfer Function Model to Forecast Ground-level Ozone Concentration in Seoul (서울지역의 지표오존농도 예보를 위한 전이함수모델 개발)

  • 김유근;손건태;문윤섭;오인보
    • Journal of Korean Society for Atmospheric Environment / v.15 no.6 / pp.779-789 / 1999
  • To support daily ground-level $O_3$ forecasting in Seoul, a transfer function model (TFM) was developed using surface meteorological data and pollutant data (previous-day [$O_3$] and [$NO_2$]) from 1 May to 31 August 1997. The forecast performance of the TFM was evaluated by statistical comparison with $O_3$ concentrations observed during September: the correlation coefficient (R), root mean squared error (RMSE), normalized mean squared error (NMSE) and mean relative error (MRE) were 0.73, 15.64, 0.006 and 0.101, respectively. The TFM appeared to have some difficulty forecasting very high $O_3$ concentrations. For comparison, a multiple regression model (MRM) was developed for the same period. A statistical comparison showed that the two models had similar predictive capability, but for $O_3$ concentrations higher than 60 ppb the TFM provided more accurate forecasts than the MRM. It was concluded that a statistical model based on the TFM can be useful for improving the accuracy of local $O_3$ forecasts.

Implementation on the evolutionary machine learning approaches for streamflow forecasting: case study in the Seybous River, Algeria (유출예측을 위한 진화적 기계학습 접근법의 구현: 알제리 세이보스 하천의 사례연구)

  • Zakhrouf, Mousaab;Bouchelkia, Hamid;Stamboul, Madani;Kim, Sungwon;Singh, Vijay P.
    • Journal of Korea Water Resources Association / v.53 no.6 / pp.395-408 / 2020
  • This paper aims to develop and apply three different machine learning approaches (i.e., artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), and wavelet-based neural networks (WNN)) combined with an evolutionary optimization algorithm and k-fold cross validation for multi-step (days) streamflow forecasting at a catchment located in Algeria, North Africa. The ANN and ANFIS models yielded similar performances, based on four different statistical indices (i.e., root mean squared error (RMSE), Nash-Sutcliffe efficiency (NSE), correlation coefficient (R), and peak flow criteria (PFC)) for the training and testing phases. The values of RMSE and PFC for the WNN model (e.g., RMSE = 8.590 ㎥/sec, PFC = 0.252 for (t+1) day, testing phase) were lower than those of the ANN (e.g., RMSE = 19.120 ㎥/sec, PFC = 0.446 for (t+1) day, testing phase) and ANFIS (e.g., RMSE = 18.520 ㎥/sec, PFC = 0.444 for (t+1) day, testing phase) models, while the values of NSE and R for the WNN model were higher than those of the ANN and ANFIS models. Therefore, the new approach can be a robust tool for multi-step (days) streamflow forecasting in the Seybous River, Algeria.
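
For readers unfamiliar with the Nash-Sutcliffe efficiency reported alongside RMSE here, the following Python sketch uses the standard definition (one minus the ratio of the squared simulation error to the variance of the observations about their mean). The function name and the flow values are hypothetical, not data from the paper.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency.

    1.0 is a perfect fit; 0.0 means the simulation is no better than
    always predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    error_sum = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_sum / spread_sum

# Hypothetical streamflow values for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.5, 2.0, 2.5, 4.0, 5.5]

print(nse(observed, simulated))
```

Unlike RMSE, NSE is dimensionless, so a lower RMSE and a higher NSE both point to the same model being better, as in the WNN comparison above.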

Combined effect of glass and carbon fiber in asphalt concrete mix using computing techniques

  • Upadhya, Ankita;Thakur, M.S.;Sharma, Nitisha;Almohammed, Fadi H.;Sihag, Parveen
    • Advances in Computational Design / v.7 no.3 / pp.253-279 / 2022
  • This study investigated and predicted the Marshall stability of glass-fiber asphalt mix, carbon-fiber asphalt mix and glass-carbon-fiber asphalt (hybrid) mix using machine learning techniques such as Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Forest (RF). The data were obtained from experiments and research articles. The results indicated that the ANN-based model outperformed the other applied models on the training and testing datasets, with the following index values for the training and testing stages, respectively: coefficient of correlation (CC) 0.8492 and 0.8234, mean absolute error (MAE) 2.0999 and 2.5408, root mean squared error (RMSE) 2.8541 and 3.3165, relative absolute error (RAE) 48.16% and 54.05%, root relative squared error (RRSE) 53.14% and 57.39%, Willmott's index (WI) 0.7490 and 0.7011, scattering index (SI) 0.4134 and 0.3702, and BIAS 0.3020 and 0.4300. The Taylor diagram also confirms that the ANN-based model outperforms the other models. Results of the sensitivity analysis show that carbon fiber has a major influence in predicting the Marshall stability. However, the carbon fiber (CF) followed by glass-carbon fiber (50GF:50CF) and the optimal combination CF + (50GF:50CF) are found to be most sensitive in predicting the Marshall stability of fibrous asphalt concrete.

Applicability Evaluation of Automated Machine Learning and Deep Neural Networks for Arctic Sea Ice Surface Temperature Estimation (북극 해빙표면온도 산출을 위한 Automated Machine Learning과 Deep Neural Network의 적용성 평가)

  • Sungwoo Park;Noh-Hun Seong;Suyoung Sim;Daeseong Jung;Jongho Woo;Nayeon Kim;Honghee Kim;Kyung-Soo Han
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1491-1495 / 2023
  • This study utilized automated machine learning (AutoML) to calculate Arctic ice surface temperature (IST). AutoML-derived IST exhibited a strong correlation coefficient (R) of 0.97 and a root mean squared error (RMSE) of 2.51K. Comparative analysis with deep neural network (DNN) models revealed that AutoML IST demonstrated good accuracy, particularly when compared to Moderate Resolution Imaging Spectroradiometer (MODIS) IST and ice mass balance (IMB) buoy IST. These findings underscore the effectiveness of AutoML in enhancing IST estimation accuracy under challenging polar conditions.

A Predictive Model for the Number of Potholes Using Basic Harmony Search Algorithm (하모니 검색 알고리즘을 이용한 포트홀 발생 개수 예측 모형)

  • Kim, Dowan;Lee, Sangyum;Kim, Dongho
    • Korean Journal of Construction Engineering and Management / v.15 no.4 / pp.150-158 / 2014
  • Asphalt roads have frequently been damaged as a result of rapid climate change. To solve and prevent such problems, many countries have carried out a range of studies. The objective of this study is to develop a prediction model for the number of potholes occurring in Seoul. We also used empirical and statistical approaches to identify the factors that affect actual pothole occurrence. The predictive model was determined using the BHS (Basic Harmony Search) algorithm. Prediction was based on weather and traffic data as well as pothole occurrence data. To assess the influence of PAR (Pitch Adjusting Rate) and HMCR (Harmony Memory Considering Rate), we evaluated suitability while varying their values. The predictive model was built on training data (2011, 2012 and 2013), and its suitability was tested on a testing set (2009 and 2010). The suitability of the basic prediction model was assessed using RMSE (Root Mean Squared Error), MAE (Mean Absolute Error) and the coefficient of determination.

Water level forecasting for extended lead times using preprocessed data with variational mode decomposition: A case study in Bangladesh

  • Shabbir Ahmed Osmani;Roya Narimani;Hoyoung Cha;Changhyun Jun;Md Asaduzzaman Sayef
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.179-179 / 2023
  • This study suggests a new approach to water level forecasting for extended lead times using original data preprocessed with variational mode decomposition (VMD). Two machine learning algorithms, light gradient boosting machine (LGBM) and random forest (RF), were considered for forecasting water levels at extended lead times (i.e., 5, 10, 15, 20, 25, 30, 40, and 50 days). First, the original data at two water level stations (i.e., SW173 and SW269 in Bangladesh) and their decomposed data from VMD were prepared with antecedent lag times for analysis in the datasets of different lead times. Mean absolute error (MAE), root mean squared error (RMSE), and mean squared error (MSE) were used to evaluate the performance of the machine learning models in water level forecasting. The results show that errors were minimized when the decomposed datasets were used to predict water levels, rather than the original data alone. LGBM also produced lower MAE, RMSE, and MSE values than RF, indicating better performance. For instance, at the SW173 station, LGBM outperformed RF on both decomposed and original data with MAE values of 0.511 and 1.566, compared to RF's MAE values of 0.719 and 1.644, respectively, at a 30-day lead time. According to the study findings, the models' performance decreased with increasing lead time. In summary, preprocessing original data and applying machine learning models to the decomposed data have shown promising results for water level forecasting at longer lead times. It is expected that the approach of this study can assist water management authorities in taking precautionary measures based on forecasted water levels, which is crucial for sustainable water resource utilization.

Optimization of Soil Contamination Distribution Prediction Error using Geostatistical Technique and Interpretation of Contributory Factor Based on Machine Learning Algorithm (지구통계 기법을 이용한 토양오염 분포 예측 오차 최적화 및 머신러닝 알고리즘 기반의 영향인자 해석)

  • Hosang Han;Jangwon Suh;Yosoon Choi
    • Economic and Environmental Geology / v.56 no.3 / pp.331-341 / 2023
  • When creating a soil contamination map using geostatistical techniques, various sources can affect prediction errors. In this study, a grid-based soil contamination map was created from sampling data of heavy metal concentrations in soil in abandoned mine areas using Ordinary Kriging. Five factors judged to affect the prediction error of the soil contamination map were selected, and the variation of the root mean squared error (RMSE) between the predicted and actual values was analyzed using the leave-one-out technique. Then, using a machine learning algorithm, the top three factors affecting the RMSE were derived. The analysis showed that the Variogram Model, Minimum Neighbors, and Anisotropy factors have the largest impact on RMSE in the Standard interpolation. Among the variogram models, the Spherical model showed the lowest RMSE; for Minimum Neighbors, the RMSE was lowest at a value of 3 and increased as the value increased. In the case of Anisotropy, it was found to be more appropriate not to consider anisotropy. In this study, through the combined use of geostatistics and machine learning, it was possible to create a highly reliable soil contamination map at the local scale, and to identify which factors have a significant impact when interpolating a small amount of soil heavy metal data.
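
The leave-one-out RMSE described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper uses Ordinary Kriging, whereas the sketch swaps in simple inverse distance weighting as a stand-in interpolator to stay short, and the function names and sample points are hypothetical.

```python
import math

def idw_predict(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    Stand-in interpolator for illustration; the paper uses Ordinary Kriging.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exact hit on a sample location
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def leave_one_out_rmse(samples, predict=idw_predict):
    """Hold out each sample in turn, predict it from the rest, and
    return the RMSE of the held-out prediction errors."""
    squared_error = 0.0
    for i, (x, y, value) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        squared_error += (predict(x, y, rest) - value) ** 2
    return math.sqrt(squared_error / len(samples))

# Hypothetical (x, y, concentration) samples, not data from the paper.
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (0.0, 1.0, 14.0), (1.0, 1.0, 16.0)]
loo = leave_one_out_rmse(samples)
print(loo)
```

Repeating this calculation while varying one interpolation setting at a time (here, for example, `power`) gives the kind of RMSE-versus-factor comparison the abstract describes.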

An Energy Consumption Prediction Model for Smart Factory Using Data Mining Algorithms (데이터 마이닝 기반 스마트 공장 에너지 소모 예측 모델)

  • Sathishkumar, VE;Lee, Myeongbae;Lim, Jonghyun;Kim, Yubin;Shin, Changsun;Park, Jangwoo;Cho, Yongyun
    • KIPS Transactions on Software and Data Engineering / v.9 no.5 / pp.153-160 / 2020
  • Energy consumption prediction for industry has a prominent role to play in energy management and control systems, as energy demand and supply change dynamically and seasonally. This paper introduces and explores predictive models of the steel industry's energy consumption. The data used include lagging and leading reactive power, lagging and leading current variable, emission of carbon dioxide (tCO2) and load type. Four statistical models are trained and then tested on the test set: (a) Linear Regression (LR), (b) Radial Kernel Support Vector Machine (SVM RBF), (c) Gradient Boosting Machine (GBM), and (d) Random Forest (RF). Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) are used to measure the predictive performance of the regression models. When using all the predictors, the best model, RF, achieves an RMSE of 7.33 on the test set.

Estimation of exponent value for Pythagorean method in Korean pro-baseball (한국프로야구에서 피타고라스 지수의 추정)

  • Lee, Jang Taek
    • Journal of the Korean Data and Information Science Society / v.25 no.3 / pp.493-499 / 2014
  • The Pythagorean won-loss formula postulated by James (1980) expresses the percentage of games won as a function of runs scored and runs allowed. Several hundred articles have explored variations that improve on the original formula's RMSE, and their fit to empirical data. This paper considers a variation on the formula that allows the Pythagorean exponent to vary. We provide the optimal exponent for the Pythagorean method and compare it with other methods, such as the Pythagenport by Davenport and Woolner, and the Pythagenpat by Smyth and Patriot. Finally, our results suggest that the proposed method is superior to other tractable alternatives under the RMSE criterion.
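
To make the idea of estimating the exponent concrete, here is a minimal Python sketch. It assumes the usual form of the formula, win percentage = RS^x / (RS^x + RA^x) with x = 2 in James's original version, and picks x by a plain grid search that minimizes RMSE. The grid search, the function names and the team records are illustrative assumptions, not the paper's estimation procedure or data.

```python
import math

def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected win percentage from runs scored and runs allowed."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)

def rmse_for_exponent(teams, exponent):
    """RMSE between predicted and actual win percentages for one exponent."""
    squared_error = sum(
        (pythagorean_pct(rs, ra, exponent) - pct) ** 2 for rs, ra, pct in teams
    )
    return math.sqrt(squared_error / len(teams))

def best_exponent(teams, low=1.0, high=3.0, step=0.01):
    """Grid search for the exponent with the smallest RMSE (illustrative)."""
    count = int(round((high - low) / step))
    candidates = [low + i * step for i in range(count + 1)]
    return min(candidates, key=lambda x: rmse_for_exponent(teams, x))

# Hypothetical (runs scored, runs allowed, actual win pct) records.
teams = [(700, 600, 0.570), (650, 650, 0.500), (600, 700, 0.430), (720, 580, 0.595)]

fitted = best_exponent(teams)
print(fitted, rmse_for_exponent(teams, fitted), rmse_for_exponent(teams, 2.0))
```

Comparing the RMSE at the fitted exponent with the RMSE at x = 2 mirrors the criterion the abstract uses to rank the alternative formulas.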