Search | Korea Science

Enhancing prediction accuracy of concrete compressive strength using stacking ensemble machine learning

Yunpeng Zhao;Dimitrios Goulias;Setare Saremi
- Computers and Concrete, v.32 no.3, pp.233-246, 2023
Accurate prediction of concrete compressive strength can minimize the need for extensive, time-consuming, and costly mixture optimization testing and analysis. This study attempts to enhance the prediction accuracy of compressive strength using stacking ensemble machine learning (ML) with feature engineering techniques. Seven alternative ML models of increasing complexity were implemented and compared: linear regression, SVM, decision tree, multilayer perceptron, random forest, XGBoost, and AdaBoost. To further improve the prediction accuracy, an ML pipeline was proposed in which feature engineering was implemented and a two-layer stacked model was developed. The k-fold cross-validation approach was employed to optimize model parameters and train the stacked model. The stacked model showed superior performance in predicting concrete compressive strength, with a coefficient of determination (R²) of 0.985. Feature (i.e., variable) importance was determined to demonstrate how useful the synthetic features are in prediction and to provide better interpretability of the data and the model. The methodology in this study promotes a more thorough assessment of alternative ML algorithms rather than a focus on any single ML model type for concrete compressive strength prediction.
https://doi.org/10.12989/cac.2023.32.3.233
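
The two-layer stacking idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy data, the two base learners (a least-squares line and a k-nearest-neighbour regressor), and the one-weight blend used as the second layer are not the paper's models or features. The point is only the mechanism: layer 1 produces out-of-fold predictions via k-fold cross-validation, and layer 2 is fitted on those predictions.

```python
import math

def fit_linear(xs, ys):
    """Least-squares line through (xs, ys); returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regressor on one feature; returns a predict function."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def out_of_fold(fit, xs, ys, k=5):
    """Layer 1: predict every sample with a model that never saw it (k-fold)."""
    n = len(xs)
    oof = [0.0] * n
    for fold in range(k):
        held_out = set(range(fold, n, k))  # interleaved folds
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)
        for i in held_out:
            oof[i] = model(xs[i])
    return oof

def blend_weight(p1, p2, ys):
    """Layer 2: least-squares weight w for the blend w*p1 + (1 - w)*p2."""
    num = sum((y - b) * (a - b) for a, b, y in zip(p1, p2, ys))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return num / den if den else 0.5

def r_squared(pred, ys):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for p, y in zip(pred, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: one feature, smooth nonlinear target.
xs = [i / 4 for i in range(40)]
ys = [0.5 * x + math.sin(x) for x in xs]

oof_lin = out_of_fold(fit_linear, xs, ys)
oof_knn = out_of_fold(fit_knn, xs, ys)
w = blend_weight(oof_lin, oof_knn, ys)
stacked = [w * a + (1 - w) * b for a, b in zip(oof_lin, oof_knn)]
print(r_squared(oof_lin, ys), r_squared(oof_knn, ys), r_squared(stacked, ys))
```

Because the blend weight is the least-squares optimum over the out-of-fold predictions, the stacked R² on those predictions cannot fall below that of either base learner. A fair comparison in practice would score all three on data held out from both layers.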

Design optimization of a nuclear main steam safety valve based on an E-AHF ensemble surrogate model

Chaoyong Zong;Maolin Shi;Qingye Li;Fuwen Liu;Weihao Zhou;Xueguan Song
- Nuclear Engineering and Technology, v.54 no.11, pp.4181-4194, 2022
Main steam safety valves are commonly used in nuclear power plants to provide final protection from overpressure events. Blowdown and dynamic stability are two critical characteristics of safety valves. However, because safety valves are sensitive to their parameters and have many of them, designing and optimizing them with traditional methods is generally difficult and inefficient. To overcome these problems, a surrogate model-based valve design optimization is carried out in this study. Of particular interest are methods for valve surrogate modeling, global sensitivity analysis of valve parameters, and valve performance optimization. To construct the surrogate model, Design of Experiments (DoE) and Computational Fluid Dynamics (CFD) simulations of the safety valve were performed successively, from which an ensemble surrogate model (E-AHF) was built for valve blowdown and stability predictions. With the developed E-AHF model, global sensitivity analysis (GSA) was performed on the valve parameters, identifying five primary parameters that affect valve performance. Finally, the k-sigma method was used to conduct robust optimization of the valve. After optimization, the valve remains stable, the minimum blowdown of the safety valve is reduced greatly from 13.30% to 2.70%, and the corresponding variance is reduced from 1.04 to 0.65, confirming the feasibility and effectiveness of the optimization method proposed in this paper.
https://doi.org/10.1016/j.net.2022.06.019
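
As a rough illustration of robust optimization with a k-sigma criterion, the sketch below assumes the common formulation in which the objective is the mean response plus k standard deviations under perturbed inputs. The abstract does not give the paper's exact formulation, so treat this as an assumption. The `response` function, the perturbation size `spread`, and the candidate grid are all hypothetical stand-ins for the valve surrogate model.

```python
from statistics import mean, pstdev

def response(x):
    """Hypothetical performance measure (lower is better).

    It has a narrow, deep minimum near x = 1 and a wide, shallow one near x = 3.
    """
    return min(50.0 * (x - 1.0) ** 2, (x - 3.0) ** 2 + 0.2)

def k_sigma_objective(x, k=3.0, spread=0.3):
    """Mean plus k standard deviations of the response when x is perturbed."""
    samples = [response(x + d) for d in (-spread, 0.0, spread)]
    return mean(samples) + k * pstdev(samples)

candidates = [i / 10 for i in range(41)]  # design grid 0.0 .. 4.0

nominal_best = min(candidates, key=response)          # ignores variability
robust_best = min(candidates, key=k_sigma_objective)  # penalizes variability
print(nominal_best, robust_best)
```

On this toy response the nominal optimum sits in the narrow minimum at x = 1.0, while the k-sigma objective prefers the flatter region at x = 3.0, where small perturbations change the response far less.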

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

Min, Sung-Hwan
- Journal of Intelligence and Information Systems, v.20 no.4, pp.121-139, 2014
Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is very important for financial institutions. Many researchers have studied bankruptcy prediction over the past three decades. The current research attempts to use ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers in order to obtain more accurate predictions than individual models. Ensemble techniques have been shown to be very useful for improving the generalization ability of a classifier. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training data subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the resulting bootstrap samples. Instance selection keeps critical instances while removing irrelevant and harmful instances from the original set. Instance selection and bagging are both well known in data mining. However, few studies have dealt with the integration of instance selection and bagging. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problem. GA searches by maintaining a population of solutions from which better solutions are created, rather than by making incremental changes to a single solution. The initial solution population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation. The solutions, coded as strings, are evaluated by the fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA is used to select the optimal instance subset, which is used as input data for the bagging model. The chromosome is encoded as a binary string representing the instance subset. In this phase, the population size was set to 100 and the maximum number of generations to 150. The crossover rate and mutation rate were set to 0.7 and 0.1, respectively. The prediction accuracy of the model was used as the fitness function of the GA. The SVM model is trained on the training data set using the selected instance subset. The prediction accuracy of the SVM model over the test data set is used as the fitness value in order to avoid overfitting. In the second phase, the optimal instance subset selected in the first phase is used as input data for the bagging model. The SVM model is used as the base classifier for the bagging ensemble, and majority voting is used as the combining method. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contains 1,832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through literature review and basic statistical methods, and 8 financial ratios were selected as the final input variables. The whole data set was separated into three subsets: training, test, and validation. The proposed model was compared with several comparative models, including the simple individual SVM model, the simple bagging model, and the instance selection-based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
https://doi.org/10.13088/jiis.2014.20.4.121
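
The bagging step described in this abstract (bootstrap samples drawn with replacement, one base classifier per sample, majority voting) can be sketched as follows. This is a minimal illustration under stated assumptions: a one-feature threshold classifier stands in for the SVM base classifier, the data is a hypothetical toy set, and the GA-based instance selection phase is not shown.

```python
import random

def fit_stump(xs, labels):
    """Threshold classifier on one feature (a stand-in for the SVM base model).

    The threshold is the midpoint between the two class means.
    """
    zeros = [x for x, c in zip(xs, labels) if c == 0]
    ones = [x for x, c in zip(xs, labels) if c == 1]
    if not zeros or not ones:  # bootstrap sample contained a single class
        only = 1 if ones else 0
        return lambda x: only
    mean0, mean1 = sum(zeros) / len(zeros), sum(ones) / len(ones)
    threshold = (mean0 + mean1) / 2
    high = 1 if mean1 >= mean0 else 0  # class found above the threshold
    return lambda x: high if x >= threshold else 1 - high

def bagging_fit(xs, labels, n_estimators=25, seed=0):
    """Train one base classifier per bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        models.append(fit_stump([xs[i] for i in idx], [labels[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Combine the base classifiers by majority vote."""
    votes = sum(m(x) for m in models)
    return 1 if 2 * votes > len(models) else 0

# Hypothetical toy data: class 0 for x in 0..9, class 1 for x in 10..19.
xs = [float(i) for i in range(20)]
labels = [0] * 10 + [1] * 10

models = bagging_fit(xs, labels)
print(bagging_predict(models, 0.0), bagging_predict(models, 19.0))
```

An odd number of estimators avoids ties in the vote. In the paper's setting, the training set passed to `bagging_fit` would be the instance subset selected by the GA rather than the full data.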

Prediction of electricity consumption in A hotel using ensemble learning with temperature (앙상블 학습과 온도 변수를 이용한 A 호텔의 전력소모량 예측)

Kim, Jaehwi;Kim, Jaehee
- The Korean Journal of Applied Statistics, v.32 no.2, pp.319-330, 2019
Forecasting electricity consumption by analyzing past electricity consumption is advantageous for energy planning and policy. Machine learning is widely used to predict electricity consumption. Among machine learning methods, ensemble learning avoids overfitting and reduces variance to improve prediction accuracy. However, when applied to daily data, ensemble learning tends to predict a central value without capturing peaks, owing to the characteristics of ensemble learning. In this study, we overcome this shortcoming by considering the temperature trend. We compare nine models and propose a model using random forest with the linear trend of temperature.
https://doi.org/10.5351/KJAS.2019.32.2.319
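
The abstract does not say how the linear trend of temperature enters the model. One plausible reading, shown here purely as an assumption, is a rolling least-squares slope of daily temperature that is supplied as an extra feature alongside the temperature itself. The window length and the temperature values are hypothetical, and no random forest is fitted in this sketch.

```python
def rolling_slope(values, window=7):
    """Least-squares slope over the last `window` values at each position.

    Positions without a full window get None. Requires window >= 2.
    """
    t = list(range(window))
    t_mean = sum(t) / window
    t_var = sum((ti - t_mean) ** 2 for ti in t)
    slopes = []
    for end in range(len(values)):
        if end + 1 < window:
            slopes.append(None)
            continue
        chunk = values[end + 1 - window:end + 1]
        v_mean = sum(chunk) / window
        cov = sum((ti - t_mean) * (v - v_mean) for ti, v in zip(t, chunk))
        slopes.append(cov / t_var)
    return slopes

daily_temp = [10.0, 12.0, 14.0, 16.0, 15.0, 14.0, 13.0]  # hypothetical, in °C
trend = rolling_slope(daily_temp, window=3)

# Feature rows (temperature, temperature trend) for days with a full window.
features = [(temp, s) for temp, s in zip(daily_temp, trend) if s is not None]
print(features)
```

A positive slope marks warming days and a negative slope cooling days, which gives a tree-based model information about direction of change that a single day's temperature does not carry.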

Transfer Learning-Based Feature Fusion Model for Classification of Maneuver Weapon Systems

Jinyong Hwang;You-Rak Choi;Tae-Jin Park;Ji-Hoon Bae
- Journal of Information Processing Systems, v.19 no.5, pp.673-687, 2023
Convolutional neural network-based deep learning technology is the most commonly used in image identification, but it requires large-scale data for training. Therefore, application in specific fields in which data acquisition is limited, such as in the military, may be challenging. In particular, the identification of ground weapon systems is a very important mission, and high identification accuracy is required. Accordingly, various studies have been conducted to achieve high performance using small-scale data. Among them, the ensemble method, which achieves excellent performance through the prediction average of the pre-trained models, is the most representative method; however, it requires considerable time and effort to find the optimal combination of ensemble models. In addition, there is a performance limitation in the prediction results obtained by using an ensemble method. Furthermore, it is difficult to obtain the ensemble effect using models with imbalanced classification accuracies. In this paper, we propose a transfer learning-based feature fusion technique for heterogeneous models that extracts and fuses features of pre-trained heterogeneous models and finally, fine-tunes hyperparameters of the fully connected layer to improve the classification accuracy. The experimental results of this study indicate that it is possible to overcome the limitations of the existing ensemble methods by improving the classification accuracy through feature fusion between heterogeneous models based on transfer learning.
https://doi.org/10.3745/JIPS.04.0291

Long-term Forecast of Seasonal Precipitation in Korea using the Large-scale Predictors (광역규모 예측인자를 이용한 한반도 계절 강수량의 장기 예측)

Kim, Hwa-Su;Kwak, Chong-Heum;So, Seon-Sup;Suh, Myoung-Seok;Park, Chung-Kyu;Kim, Maeng-Ki
- Journal of the Korean earth science society, v.23 no.7, pp.587-596, 2002
A super ensemble model was developed for the seasonal prediction of regional precipitation in Korea using lag-correlated large-scale predictors, based on empirical orthogonal function (EOF) analysis and a multiple linear regression model. The predictability of this model was evaluated by cross-validation. The correlation between the predicted and observed values from the super ensemble model was 0.73 in spring, 0.61 in summer, 0.69 in autumn, and 0.75 in winter. The predictability of categorical forecasting was also evaluated for three classes (above normal, near normal, and below normal) defined by threshold values specified a priori. Categorical forecasting by the super ensemble model has a hit rate ranging from 0.42 to 0.74 for seasonal precipitation.
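
The hit rate for three-class categorical forecasts mentioned in this abstract can be computed as in the sketch below. The thresholds and precipitation values are hypothetical; the abstract states only that the classes are defined by thresholds fixed in advance.

```python
def categorize(value, lower, upper):
    """Map a value to one of three classes using fixed thresholds."""
    if value < lower:
        return "below normal"
    if value > upper:
        return "above normal"
    return "near normal"

def hit_rate(predicted, observed, lower, upper):
    """Fraction of forecasts whose class matches the observed class."""
    hits = sum(
        categorize(p, lower, upper) == categorize(o, lower, upper)
        for p, o in zip(predicted, observed)
    )
    return hits / len(observed)

# Hypothetical seasonal precipitation totals (mm) and thresholds.
predicted = [80.0, 120.0, 95.0, 140.0, 60.0]
observed = [70.0, 100.0, 105.0, 150.0, 95.0]
rate = hit_rate(predicted, observed, lower=90.0, upper=110.0)
print(rate)
```

Here three of the five forecasts fall in the same class as the observation, so the hit rate is 0.6. With three equally likely classes, a random forecast would score about 0.33, which gives a baseline for reading the reported range of 0.42 to 0.74.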

Development of an Ensemble Prediction Model for Lateral Deformation of Retaining Wall Under Construction (시공 중 흙막이 벽체 수평변위 예측을 위한 앙상블 모델 개발)

Seo, Seunghwan;Chung, Moonkyung
- Journal of the Korean Geotechnical Society, v.39 no.4, pp.5-17, 2023
The advancement in large-scale underground excavation in urban areas necessitates monitoring and predicting technologies that can pre-emptively mitigate risk factors at construction sites. Traditionally, two kinds of methods have been used to predict the deformation of retaining walls induced by excavation: empirical analysis and numerical analysis. Recent progress in artificial intelligence technology has led to the development of predictive models using machine learning techniques. This study developed a model for predicting the deformation of a retaining wall under construction using a boosting-based algorithm and an ensemble model with outstanding predictive power and efficiency. A database was established in a multifaceted manner using data from the design-construction-maintenance process of an underground retaining wall project. Based on these data, a learning model was created and its performance was evaluated. The boosting and ensemble models demonstrated that wall deformation could be accurately predicted. In addition, it was confirmed that prediction results reflecting the characteristics of the actual construction process can be presented using data collected from ground measurements. The predictive model developed in this study is expected to be used to evaluate and monitor the stability of retaining walls under construction.
https://doi.org/10.7843/kgs.2023.39.4.5

Development of Highway Traffic Information Prediction Models Using the Stacking Ensemble Technique Based on Cross-validation (스태킹 앙상블 기법을 활용한 고속도로 교통정보 예측모델 개발 및 교차검증에 따른 성능 비교)

Yoseph Lee;Seok Jin Oh;Yejin Kim;Sung-ho Park;Ilsoo Yun
- The Journal of The Korea Institute of Intelligent Transport Systems, v.22 no.6, pp.1-16, 2023
Accurate traffic information prediction is considered one of the most important aspects of intelligent transport systems (ITS), as it can be used to guide users of transportation facilities away from congested routes. Various deep learning models have been developed for accurate traffic prediction. Recently, ensemble techniques have been used to combine models with different strengths and weaknesses in various ways to improve prediction accuracy and stability. In this study, we therefore developed and evaluated traffic information prediction models using various deep learning models, and evaluated the performance of the developed deep learning models as a stacking ensemble. The individual models showed error rates within 10% for traffic volume prediction and 3% for speed prediction. The ensemble model showed higher accuracy than the other models when no cross-validation was performed, and when cross-validation was performed, it showed a uniform error rate in long-term forecasting.
https://doi.org/10.12815/kits.2023.22.6.1

A Structural Design Method Using Ensemble Model of RSM and Kriging (반응표면법과 크리깅의 혼합모델을 이용한 구조설계방법)

Kim, Nam-Hee;Lee, Kwon-Hee
- Journal of the Korea Academia-Industrial cooperation Society, v.16 no.3, pp.1630-1638, 2015
Finite element analysis has become an essential process for investigating structural performance in many industrial fields. Although computer performance is improving rapidly, applying optimal design techniques to large design problems remains limited. To address this, a metamodel-based optimization technique is generally introduced. Methods for generating an approximate model can be classified into curve fitting and interpolation, represented by the response surface model (RSM) and the kriging interpolation method, respectively. This study proposes an ensemble model combining RSM and kriging to solve structural design problems. The suggested method is applied to the design of a two-bar structure and an automobile outer tie rod.
https://doi.org/10.5762/KAIS.2015.16.3.1630

Comparative analysis of model performance for predicting the customer of cafeteria using unstructured data

Seungsik Kim;Nami Gu;Jeongin Moon;Keunwook Kim;Yeongeun Hwang;Kyeongjun Lee
- Communications for Statistical Applications and Methods, v.30 no.5, pp.485-499, 2023
This study aimed to predict the number of meals served in a group cafeteria using machine learning methodology. Features of the menu were created through the Word2Vec methodology and clustering, and a stacking ensemble model was constructed using Random Forest, Gradient Boosting, and CatBoost as sub-models. Results showed that CatBoost had the best performance with the ensemble model showing an 8% improvement in performance. The study also found that the date variable had the greatest influence on the number of diners in a cafeteria, followed by menu characteristics and other variables. The implications of the study include the potential for machine learning methodology to improve predictive performance and reduce food waste, as well as the removal of subjective elements in menu classification. Limitations of the research include limited data cases and a weak model structure when new menus or foreign words are not included in the learning data. Future studies should aim to address these limitations.
https://doi.org/10.29220/CSAM.2023.30.5.485
