• Title/Summary/Keyword: ensemble technique


Prediction of compressive strength of sustainable concrete using machine learning tools

  • Lokesh Choudhary;Vaishali Sahu;Archanaa Dongre;Aman Garg
    • Computers and Concrete
    • /
    • v.33 no.2
    • /
    • pp.137-145
    • /
    • 2024
  • The technique of experimentally determining concrete's compressive strength for a given mix design is time-consuming and difficult. The goal of the current work is to propose the best-performing predictive model, based on machine learning algorithms such as Gradient Boosting Machine (GBM), Stacked Ensemble (SE), Distributed Random Forest (DRF), Extremely Randomized Trees (XRT), Generalized Linear Model (GLM), and Deep Learning (DL), that can forecast the compressive strength of a ternary geopolymer concrete mix without carrying out any experimental procedure. A geopolymer mix uses supplementary cementitious materials obtained as industrial by-products instead of cement. The input variables used for assessing the best machine learning algorithm include not only individual ingredient quantities but also the molarity of the alkali activator and the age of testing. Using a range of statistical parameters to measure the effectiveness of the models in forecasting the compressive strength of the ternary geopolymer concrete mix, it was found that GBM performs better than all other algorithms. A sensitivity analysis carried out towards the end of the study suggests that the GBM model predicts results close to the experimental conditions, with an accuracy between 95.6% and 98.2% for the testing and training datasets.
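
The boosting approach behind the best-performing GBM model can be sketched with scikit-learn on synthetic mix-design data. This is a minimal illustration only: the feature names, value ranges, and response formula below are invented stand-ins, not the paper's dataset or model configuration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
# Illustrative inputs: two binder quantities, activator molarity (M), testing age (days)
X = rng.uniform([300, 50, 4, 1], [500, 200, 16, 90], size=(n, 4))
# Synthetic "compressive strength" response with noise, standing in for lab results
y = (0.05 * X[:, 0] + 0.02 * X[:, 1] + 1.5 * X[:, 2]
     + 0.3 * np.log1p(X[:, 3]) + rng.normal(0, 1.0, n))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
gbm = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, random_state=0)
gbm.fit(X_tr, y_tr)
r2 = gbm.score(X_te, y_te)        # R^2 on the held-out split
print(f"test R^2 = {r2:.3f}")
```

The same train/test-then-score pattern applies regardless of which of the six candidate algorithms is swapped in.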

Machine learning application to seismic site classification prediction model using Horizontal-to-Vertical Spectral Ratio (HVSR) of strong-ground motions

  • Francis G. Phi;Bumsu Cho;Jungeun Kim;Hyungik Cho;Yun Wook Choo;Dookie Kim;Inhi Kim
    • Geomechanics and Engineering
    • /
    • v.37 no.6
    • /
    • pp.539-554
    • /
    • 2024
  • This study explores the development of a prediction model for seismic site classification through the integration of machine learning techniques with horizontal-to-vertical spectral ratio (HVSR) methodologies. To improve model accuracy, the research employs outlier detection methods and the synthetic minority over-sampling technique (SMOTE) for data balancing, and evaluates seven machine learning models using seismic data from KiK-net. Notably, the light gradient boosting method (LGBM), gradient boosting, and decision tree models exhibit improved performance when coupled with SMOTE, while multiple linear regression (MLR) and support vector machine (SVM) models show reduced efficacy. Outlier detection techniques significantly enhance accuracy, particularly for LGBM, gradient boosting, and voting boosting. The ensemble of LGBM with the isolation forest and SMOTE achieves the highest accuracy of 0.91, with LGBM and the local outlier factor yielding the highest F1-score of 0.79. Consistently outperforming the other models, LGBM proves most efficient for seismic site classification when supported by appropriate preprocessing procedures. These findings show the significance of outlier detection and data balancing for precise seismic soil classification, offering insights and highlighting the potential of machine learning in optimizing site classification accuracy.
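
The SMOTE step the study relies on can be sketched from scratch in a few lines: each synthetic minority sample is a random linear interpolation between a minority point and one of its k nearest minority neighbours. This is a minimal NumPy sketch, not the study's actual preprocessing pipeline; the two-feature clusters below are invented.

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Generate n_new synthetic minority samples by interpolating toward neighbours."""
    rng = rng or np.random.default_rng(0)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]       # k nearest neighbours, excluding self
        j = rng.choice(nbrs)
        lam = rng.random()                  # interpolation factor in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

rng = np.random.default_rng(1)
X_maj = rng.normal(0.0, 1.0, size=(200, 2))   # majority site class (illustrative)
X_min = rng.normal(3.0, 1.0, size=(20, 2))    # minority site class (illustrative)
X_syn = smote(X_min, n_new=180, rng=rng)      # top up the minority class to 200
print(len(X_maj), len(X_min) + len(X_syn))
```

Because synthetic points are convex combinations of real minority points, they stay inside the minority region rather than duplicating existing samples.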

PHOTOMETRIC STUDY OF NPA ROTATOR (5247) KRYLOV

  • Lee, Hee-Jae;Moon, Hong-Kyu;Kim, Myung-Jin;Kim, Chun-Hwey;Durech, Josef;Choi, Young-Jun;Oh, Young-Seok;Park, Jintae;Roh, Dong-Goo;Yim, Hong-Suh;Cha, Sang-Mok;Lee, Yongseok
    • Journal of The Korean Astronomical Society
    • /
    • v.50 no.3
    • /
    • pp.41-49
    • /
    • 2017
  • We conducted BVRI and R band photometric observations of asteroid (5247) Krylov from January 2016 to April 2016 over 51 nights using the Korea Microlensing Telescope Network (KMTNet). The color indices of (5247) Krylov at the light curve maxima are determined as $B-V=0.841{\pm}0.035$, $V-R=0.418{\pm}0.031$, and $V-I=0.871{\pm}0.031$ at a phase angle of $14.1^{\circ}$. They were acquired after standardization of the BVRI instrumental measurements using the ensemble normalization technique. Based on the color indices, (5247) Krylov is classified as an S-type asteroid. Two periods, a primary period $P_1=82.188{\pm}0.013h$ and a secondary period $P_2=67.13{\pm}0.20h$, are identified from period searches of its R band light curve. The light curve phased with $P_1$, together with the second period, indicates that it is a typical Non-Principal Axis (NPA) asteroid. We discuss the possible causes of its NPA rotation.
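
The core idea of the ensemble normalization step is that each frame's photometric zero point is estimated from many comparison stars at once and then subtracted. A minimal sketch, assuming synthetic catalogue magnitudes and per-frame offsets (not the KMTNet reduction itself):

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, n_comp = 30, 8
true_mag = rng.uniform(12.0, 15.0, n_comp)     # catalogue magnitudes of comparison stars
zero_point = rng.normal(0.0, 0.3, n_frames)    # per-frame extinction/instrument offset
noise = rng.normal(0.0, 0.01, (n_frames, n_comp))
instr = true_mag[None, :] + zero_point[:, None] + noise   # instrumental magnitudes

# Ensemble zero point: robust (median) offset over all comparison stars in each frame
zp_est = np.median(instr - true_mag[None, :], axis=1)
corrected = instr - zp_est[:, None]

raw_scatter = instr.std(axis=0).mean()
cor_scatter = corrected.std(axis=0).mean()
print(f"mean scatter before {raw_scatter:.3f} mag, after {cor_scatter:.3f} mag")
```

Using an ensemble of stars rather than a single comparison star averages down the noise in the zero-point estimate, which is what makes the standardized light curve stable night to night.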

Development of the Selected Multi-model Consensus Technique for the Tropical Cyclone Track Forecast in the Western North Pacific (태풍 진로예측을 위한 다중모델 선택 컨센서스 기법 개발)

  • Jun, Sanghee;Lee, Woojeong;Kang, KiRyong;Yun, Won-Tae
    • Atmosphere
    • /
    • v.25 no.2
    • /
    • pp.375-387
    • /
    • 2015
  • A Selected Multi-model CONsensus (SMCON) technique was developed and verified for tropical cyclone track forecasting in the western North Pacific. The SMCON forecasts were produced by averaging the forecasts of the numerical models whose latest 6-h prediction errors fell within the lowest 70% among 21 models. In a homogeneous comparison for 54 tropical cyclones in 2013 and 2014, the SMCON improvement rate was higher than that of the other forecasts, such as the Non-Selected Multi-model CONsensus (NSMCON) and other numerical models (i.e., GDAPS, GEPS, GFS, HWRF, ECMWF, ECMWF_H, ECMWF_EPS, JGSM, TEPS). However, the SMCON showed a lower or similar improvement rate compared to a few forecasts, including the ECMWF_EPS forecasts at 96 h in 2013 and at 72 h in 2014 and the TEPS forecast at 120 h in 2013. The mean track errors of the SMCON over the two years were smaller than those of the NSMCON, with differences of 0.4, 1.2, 5.9, 12.9, and 8.2 km at 24, 48, 72, 96, and 120 h, respectively. The SMCON error distributions showed a smaller central tendency than the NSMCON's, except for the 72- and 96-h forecasts in 2013. Similarly, in the kernel density estimation analysis, the density for smaller track errors of the SMCON was higher than the NSMCON's, except at the 72- and 96-h forecasts in 2013. In addition, the NSMCON had a larger range of errors above the third quartile and a larger standard deviation than the SMCON at the 72- and 96-h forecasts in 2013. The SMCON also showed a smaller cross-track bias than ECMWF_H. Thus, we conclude that the SMCON can provide more reliable information on tropical cyclone track forecasts by reflecting the real-time performance of the numerical models.
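
The selection rule described above is simple to express: rank the member models by their latest 6-h errors, keep the best 70%, and average their positions. A minimal sketch with invented lat/lon forecasts and error values (not the operational member set):

```python
import numpy as np

def smcon(forecasts, recent_errors, keep=0.7):
    """forecasts: (n_models, 2) lat/lon; recent_errors: (n_models,) latest 6-h errors (km)."""
    n_keep = max(1, int(np.ceil(keep * len(recent_errors))))
    idx = np.argsort(recent_errors)[:n_keep]   # members with the smallest recent errors
    return forecasts[idx].mean(axis=0)         # consensus = mean of selected members

rng = np.random.default_rng(3)
forecasts = rng.normal([20.0, 130.0], [0.5, 0.5], size=(21, 2))  # 21 member models
recent_errors = rng.uniform(10, 200, 21)                          # latest 6-h errors (km)
track = smcon(forecasts, recent_errors)
print(track)
```

Dropping the worst 30% of members is what lets the consensus track the real-time skill of the ensemble instead of weighting all 21 models equally.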

Hydrologic Utilization of Radar-Derived Rainfall (II) Uncertainty Analysis (레이더 추정강우의 수문학적 활용 (II): 불확실성 해석)

  • Kim Jin-Hoon;Lee Kyoung-Do;Bae Deg-Hyo
    • Journal of Korea Water Resources Association
    • /
    • v.38 no.12 s.161
    • /
    • pp.1051-1060
    • /
    • 2005
  • The present study analyzes the hydrologic utility of optimal radar-derived rainfall by using a semi-distributed TOPMODEL and evaluates the impacts of radar-rainfall and model parametric uncertainty on the hydrologic model. The Monte Carlo technique is used to produce the flow ensembles. The flows simulated from radar rainfalls corrected with a real-time bias adjustment scheme agree well with the observed flows during 22-26 July 2003, showing that radar-derived rainfall is useful for simulating streamflow on a basin scale. These results are then used to diagnose how radar-rainfall input and parametric uncertainty influence the character of the flow simulation uncertainty. The main conclusions of this uncertainty analysis are that the radar input uncertainty is less influential than the parametric one, and that the combined radar and parametric uncertainty produces the highest uncertainty in the streamflow simulation.
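
The Monte Carlo flow-ensemble idea can be sketched with a toy linear-reservoir model standing in for the semi-distributed TOPMODEL: sample the uncertain parameter many times, run the model once per sample, and read uncertainty bands off the ensemble. The rainfall series and parameter range below are invented.

```python
import numpy as np

def linear_reservoir(rain, k):
    """Toy runoff model: storage drains at fraction k per time step."""
    s, q = 0.0, []
    for r in rain:
        s += r
        out = k * s
        s -= out
        q.append(out)
    return np.array(q)

rng = np.random.default_rng(4)
rain = rng.gamma(2.0, 2.0, 48)                 # 48-step synthetic hyetograph
ks = rng.uniform(0.1, 0.5, 500)                # 500 Monte Carlo parameter samples
ensemble = np.array([linear_reservoir(rain, k) for k in ks])

lo, hi = np.percentile(ensemble, [5, 95], axis=0)   # 90% flow uncertainty band
print(f"mean band width = {(hi - lo).mean():.2f}")
```

The width of the band at each time step is the parametric contribution to the simulation uncertainty; perturbing the rainfall input as well would give the combined case the study examines.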

Calculation of Low Aspect Ratio Wing Aerodynamics by Using Nonlinear Vortex Lattice Method (비선형 와류격자법을 이용한 낮은 종횡비 날개의 공력특성 계산)

  • Lee, Tae-Seung;Park, Seung-O
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.36 no.11
    • /
    • pp.1039-1048
    • /
    • 2008
  • A new computational procedure for the Non-Linear Vortex Lattice Method (NLVLM) is suggested in this work. Conventional procedures suggested so far usually involve an inner iteration loop to update the free vortex shape and an under-relaxation-based outer loop to determine it. In the present work, we suggest a new formula based on a quasi-steady concept to fix the free vortex shape, which eliminates the need for the inner iteration loop. Further, ensemble averaging of the induced velocities for a given free vortex segment, evaluated at each iteration, significantly improves the convergence of the algorithm without resorting to the under-relaxation technique. Numerical experiments over several low aspect ratio wings are carried out to obtain optimal empirical parameters such as the length of the free vortex segment, the vortex core radius, and the rolled-up wake length.

A Development of a Tailored Follow up Management Model Using the Data Mining Technique on Hypertension (데이터마이닝 기법을 활용한 맞춤형 고혈압 사후관리 모형 개발)

  • Park, Il-Su;Yong, Wang-Sik;Kim, Yu-Mi;Kang, Sung-Hong;Han, Jun-Tae
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.4
    • /
    • pp.639-647
    • /
    • 2008
  • This study used knowledge discovery and data mining algorithms to develop a tailored hypertension follow-up management model - a hypertension care predictive model and a hypertension care compliance segmentation model - using the Korea National Health Insurance Corporation database (the insureds' screening and health care benefit data). The study validated the predictive power of the data mining algorithms by comparing the performance of logistic regression, decision tree, and ensemble techniques. On the basis of internal and external validation, the logistic regression method performed best among the three techniques for the hypertension care predictive model, while the hypertension care compliance segmentation model was developed with decision tree analysis. The study also identified several factors affecting the onset of hypertension from the screening data. These representative results on the rise and care of hypertension are expected to contribute to the nation's building of a hypertension follow-up management system in the near future.
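
The three-way model comparison described above can be sketched with cross-validated AUC in scikit-learn. The synthetic data stands in for the screening records, which are not public; a random forest stands in for the unspecified ensemble technique.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for screening data (8 illustrative risk-factor features)
X, y = make_classification(n_samples=600, n_features=8, n_informative=4, random_state=0)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "ensemble": RandomForestClassifier(n_estimators=100, random_state=0),
}
# 5-fold cross-validated AUC per candidate, the basis for picking the final model
scores = {name: cross_val_score(m, X, y, cv=5, scoring="roc_auc").mean()
          for name, m in models.items()}
print(scores)
```

On the real data the study found logistic regression best for prediction, so which model wins here is irrelevant; the point is the uniform comparison protocol.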

Analyzing Machine Learning Techniques for Fault Prediction Using Web Applications

  • Malhotra, Ruchika;Sharma, Anjali
    • Journal of Information Processing Systems
    • /
    • v.14 no.3
    • /
    • pp.751-770
    • /
    • 2018
  • Web applications are indispensable in the software industry and continuously evolve, either to meet new criteria and/or to include new functionalities. However, despite quality assurance via testing, the presence of defects hinders straightforward development. Several factors contribute to defects, and minimizing them is often expensive in terms of man-hours. Thus, detection of fault proneness in the early phases of software development is important, and a fault prediction model for identifying fault-prone classes in a web application is highly desired. In this work, we compare 14 machine learning techniques to analyse the relationship between object-oriented metrics and fault prediction in web applications. The study is carried out using various releases of the Apache Click and Apache Rave datasets. En route to the predictive analysis, the input basis set for each release is first optimized using the filter-based correlation feature selection (CFS) method. It is found that the LCOM3, WMC, NPM and DAM metrics are the most significant predictors. The statistical analysis of these metrics also shows good conformity with the CFS evaluation and affirms the role of these metrics in the defect prediction of web applications. The overall predictive ability of the different fault prediction models is first ranked using the Friedman technique and then statistically compared using Nemenyi post-hoc analysis. The results not only uphold the predictive capability of machine learning models for faulty classes in web applications, but also show that ensemble algorithms are most appropriate for defect prediction in the Apache datasets. Further, we derive a consensus between the metrics selected by the CFS technique and the statistical analysis of the datasets.
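
The Friedman ranking step can be sketched with SciPy: treat each dataset/release as a block, rank the models within each block, and test whether the average ranks differ. The AUC table below is an invented stand-in for the paper's results, and the Nemenyi post-hoc comparison of rank gaps would follow as a separate step.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# rows: datasets/releases, columns: three hypothetical fault-prediction models
auc = np.array([
    [0.81, 0.78, 0.85],
    [0.79, 0.74, 0.83],
    [0.83, 0.80, 0.86],
    [0.77, 0.72, 0.81],
    [0.80, 0.76, 0.84],
])
stat, p = friedmanchisquare(*auc.T)              # one sample group per model
avg_rank = rankdata(-auc, axis=1).mean(axis=0)   # rank 1 = best AUC within each row
print(f"Friedman p = {p:.4f}, average ranks = {avg_rank}")
```

A small p-value says the models' rankings differ somewhere; the post-hoc test then identifies which pairs of models are separated by more than the critical rank difference.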

Risk Factor Analysis of Cryopreserved Autologous Bone Flap Resorption in Adult Patients Undergoing Cranioplasty with Volumetry Measurement Using Conventional Statistics and Machine-Learning Technique

  • Yohan Son;Jaewoo Chung
    • Journal of Korean Neurosurgical Society
    • /
    • v.67 no.1
    • /
    • pp.103-114
    • /
    • 2024
  • Objective : Decompressive craniectomy (DC) with duroplasty is one of the common surgical treatments for life-threatening increased intracranial pressure (ICP). Once ICP is controlled, cranioplasty (CP) with reinsertion of the cryopreserved autologous bone flap or a synthetic implant is considered for protection and esthetics. Despite the risk of autologous bone flap resorption (BFR), the cryopreserved autologous bone flap remains an important CP material due to its cost-effectiveness. In this article, we performed conventional statistical analysis and applied a machine learning technique to understand the risk factors for BFR. Methods : Patients aged >18 years who underwent autologous bone CP between January 2015 and December 2021 were reviewed. Demographic data, medical records, and volumetric measurements of the autologous bone flap from 94 patients were collected. BFR was defined with an absolute quantitative method (BFR-A) and a relative quantitative method (BFR%). Conventional statistical analysis and random forest with a hyper-ensemble approach (RF with HEA) were performed, and overlapped partial dependence plots (PDP) were generated. Results : Conventional statistical analysis showed that only the initial autologous bone flap volume was statistically significant for BFR-A. RF with HEA showed that the initial autologous bone flap volume, the interval between DC and CP, and bone quality were the factors contributing most to BFR-A, while trauma, bone quality, and the initial autologous bone flap volume were the factors contributing most to BFR%. Overlapped PDPs of the initial autologous bone flap volume on BFR-A crossed at approximately 60 mL, and a relatively clear separation was found between the non-BFR and BFR groups. Therefore, an initial autologous bone flap volume of over 60 mL could be a risk factor for BFR. Conclusion : The present study suggests that BFR in patients who underwent CP with an autologous bone flap might be inevitable, although its degree differs from one patient to another. Considering artificial bone flaps as implants for patients with a large DC could therefore be reasonable. Still, the risk factors for BFR are not clearly understood, and chronological analysis and pathophysiologic studies are needed.

Model Interpretation through LIME and SHAP Model Sharing (LIME과 SHAP 모델 공유에 의한 모델 해석)

  • Yong-Gil Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.2
    • /
    • pp.177-184
    • /
    • 2024
  • As data grows at ever-increasing speed, we use all kinds of complex ensemble and deep learning algorithms to obtain the highest accuracy, and it is sometimes unclear how these models predict, classify, recognize, and track unknown data. Explaining such behavior has been, and will remain, a goal of intensive research and development in the data science community. A variety of factors, such as a lack of data, imbalanced data, or biased data, can impact the decisions rendered by learning models, and many methods for interpreting such decisions are gaining traction. LIME and SHAP, two state-of-the-art open-source explainability techniques, are now commonly used; however, their outputs can differ. In this context, this study introduces a technique coupling LIME and SHAP, and demonstrates the analysis it makes possible on the decisions made by LightGBM and Keras models in classifying transactions as fraudulent on the IEEE CIS dataset.
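
The motivating problem, that different explainers can attribute a model's decisions differently, can be illustrated without the LIME and SHAP libraries themselves. The stand-in below (not the study's coupling technique) compares two cheaper attribution views of the same gradient-boosting classifier on an imbalanced, fraud-style synthetic dataset: impurity-based importances versus permutation importances.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Imbalanced synthetic "fraud" data: ~10% positive class, features are stand-ins
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           weights=[0.9, 0.1], random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

imp_a = clf.feature_importances_                                   # impurity-based view
imp_b = permutation_importance(clf, X, y, n_repeats=10,
                               random_state=0).importances_mean    # permutation view
top_a = sorted(np.argsort(imp_a)[-3:].tolist())
top_b = sorted(np.argsort(imp_b)[-3:].tolist())
print("top-3 features by each view:", top_a, top_b)
```

Where the two views disagree on which features matter, a practitioner needs a reconciliation strategy, which is the gap the LIME/SHAP coupling in the paper addresses.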