• Title/Summary/Keyword: Machine learning (ML)

Evaluation of the Coverage Assessment of Rainfall-Runoff Model for Data Length (데이터 길이에 대한 강우-유출 모델 적용범위 평가)

  • Jeon Seong Jae;Shin Mun Ju;Jung Yong
    • Proceedings of the Korea Water Resources Association Conference / 2023.05a / pp.383-383 / 2023
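  • In hydrology today, a variety of studies use machine learning (ML) for rainfall-runoff simulation of watersheds. This study applied GR4H (Génie Rural à 4 paramètres Horaires), an hourly rainfall-runoff prediction model, to the Chungju Dam watershed. We compared how model performance varies with watershed attributes in order to identify a model suited to those characteristics. In the process, we used the calibration length of the meteorological and runoff data to determine how long a data period gives good model performance. We also compared runs with and without data for the warm-up period the model requires, to find out what difference this makes and how long a warm-up period is needed. Through this study we assess the applicability and performance of the model for the Chungju Dam watershed and judge whether it can also be used for watersheds where building a hydrological model is constrained. Observations from the study watershed were input to the model, and the simulations were run after finding the optimal parameter values for each model. GR4H, the rainfall-runoff model used in this study, is part of airGR, developed at INRAE-Antony (Institut National de la recherche agronomique-Antony) in France. It is a process-based, lumped, conceptual hydrological model developed for hourly rainfall-runoff prediction. It has four parameters, which represent specific attributes of the watershed. When simulating with GR4H, an appropriate calibration length must be identified in order to optimize the parameters. This is done by optimizing the parameters while increasing the amount of data one year at a time (4 years, 5 years, 6 years, and so on), which reveals the appropriate calibration length for the meteorological and runoff data. The simulated data are compared with observations to evaluate model performance, and simulations are run with other observations for validation.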
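
The expanding calibration-length procedure described in this abstract can be sketched as follows. This is only an illustration, not the authors' code: the helper name `calibration_windows`, the example year range, and the 4-year minimum are assumptions based on the "4 years, 5 years, 6 years" wording, and the GR4H calibration itself (performed with airGR) is not reproduced.

```python
def calibration_windows(first_year, last_year, min_years=4):
    """Return (start, end) year pairs that grow by one year at a time.

    Hypothetical helper: every window starts at first_year and covers
    min_years, min_years + 1, ... years, up to last_year inclusive.
    """
    total_years = last_year - first_year + 1
    if total_years < min_years:
        raise ValueError("record is shorter than the minimum calibration length")
    return [
        (first_year, first_year + length - 1)
        for length in range(min_years, total_years + 1)
    ]


# Example with an assumed 2010-2016 record: 4-, 5-, 6- and 7-year windows.
for start, end in calibration_windows(2010, 2016):
    print(f"calibrate on {start}-{end} ({end - start + 1} years)")
```

Each window would then be passed to the calibration routine, and the resulting parameter sets compared on a separate validation period.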

Meta-heuristic optimization algorithms for prediction of fly-rock in the blasting operation of open-pit mines

  • Mahmoodzadeh, Arsalan;Nejati, Hamid Reza;Mohammadi, Mokhtar;Ibrahim, Hawkar Hashim;Rashidi, Shima;Mohammed, Adil Hussein
    • Geomechanics and Engineering / v.30 no.6 / pp.489-502 / 2022
  • In this study, a Gaussian process regression (GPR) model and six GPR-based metaheuristic optimization models (GPR-PSO, GPR-GWO, GPR-MVO, GPR-MFO, GPR-SCA, and GPR-SSO) were developed to predict fly-rock distance in the blasting operations of open-pit mines. The models were built on a total of 300 datasets obtained from the Soungun copper mine in Iran. These datasets included six input parameters and one output parameter (fly-rock). A number of statistical evaluation indices were used to assess the prediction outcomes. Ranked from best to worst at predicting fly-rock, the ML models were GPR-PSO, GPR-GWO, GPR-MVO, GPR-MFO, GPR-SCA, GPR-SSO, and GPR, with ranking scores of 66, 60, 54, 46, 43, 38, and 30 (for the 5-fold method), respectively. In conclusion, the GPR-PSO model generated the most accurate results, and it is therefore suggested for forecasting fly-rock. In addition, the mutual information test (MIT) was used to investigate the influence of each input parameter on fly-rock. The stemming (T) parameter was found to have the greatest effect on fly-rock.

Practical applicable model for estimating the carbonation depth in fly-ash based concrete structures by utilizing adaptive neuro-fuzzy inference system

  • Aman Kumar;Harish Chandra Arora;Nishant Raj Kapoor;Denise-Penelope N. Kontoni;Krishna Kumar;Hashem Jahangir;Bharat Bhushan
    • Computers and Concrete / v.32 no.2 / pp.119-138 / 2023
  • Concrete carbonation is a prevalent phenomenon that leads to corrosion of the steel reinforcement in reinforced concrete (RC) structures, thereby decreasing their service life and durability. The process of carbonation lowers the pH of concrete, resulting in an acidic environment with a pH value below 12. This acidic environment initiates and accelerates the corrosion of steel reinforcement in concrete, rendering it more susceptible to damage and ultimately weakening the overall structural integrity of the RC system. Lower pH values can damage the protective coating of the steel, known as the passive film, thus speeding up corrosion. Estimating the carbonation factor is essential to reducing the deterioration of concrete structures. Much work has gone into developing a carbonation model that is precise and efficient and that takes both internal and external factors into account. This study presents an ML-based adaptive neuro-fuzzy inference system (ANFIS) approach to predict the carbonation depth of fly ash (FA)-based concrete structures. Cement content, FA, water-cement ratio, relative humidity, duration, and CO2 level were used as input parameters to develop the ANFIS model. Six performance indices were used to assess the accuracy of the developed model and two analytical models. The outcome of the ANFIS model was also compared with the other models used in this study. The prediction results show that the ANFIS model outperforms the analytical models, with R-value, MAE, RMSE, and Nash-Sutcliffe efficiency index values of 0.9951, 0.7255 mm, 1.2346 mm, and 0.9957, respectively. Surface plots and sensitivity analysis were also performed to identify the effect of individual features on the carbonation depth of FA-based concrete structures. The developed ANFIS-based model is simple, easy to use, and cost-effective, with good accuracy compared to existing models.
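
Four of the performance indices named in this abstract (R, MAE, RMSE, and the Nash-Sutcliffe efficiency) have standard definitions, sketched below in plain Python. The function name `performance_indices` and the sample depths are illustrative assumptions; the abstract does not say how the authors computed their values.

```python
import math


def performance_indices(observed, predicted):
    """Return R, MAE, RMSE and Nash-Sutcliffe efficiency for paired values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation coefficient R.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r = cov / math.sqrt(ss_obs * ss_pred)

    # Nash-Sutcliffe efficiency: 1 - (sum of squared errors) /
    # (sum of squared deviations of the observations from their mean).
    nse = 1.0 - sum(e * e for e in errors) / ss_obs
    return {"R": r, "MAE": mae, "RMSE": rmse, "NSE": nse}


# Made-up carbonation depths in mm, for illustration only.
depths_measured = [5.0, 10.0, 15.0, 20.0]
depths_predicted = [6.0, 9.0, 16.0, 19.0]
print(performance_indices(depths_measured, depths_predicted))
```

MAE and RMSE carry the unit of the target (mm here), while R and NSE are dimensionless, which matches how the values are reported in the abstract.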

A novel analytical evaluation of the laboratory-measured mechanical properties of lightweight concrete

  • S. Sivakumar;R. Prakash;S. Srividhya;A.S. Vijay Vikram
    • Structural Engineering and Mechanics / v.87 no.3 / pp.221-229 / 2023
  • Urbanization and industrialization have significantly increased the amount of solid waste produced in recent decades, posing considerable disposal problems and environmental burdens. Using waste in concrete has gained popularity among construction practitioners and researchers as a way to use resources efficiently and move toward a circular economy in construction. This study employed Lytag aggregate, an environmentally friendly lightweight aggregate based on pulverized fuel ash, as a substitute for natural coarse aggregate. At the same time, fly ash, an industrial by-product, was used as a partial substitute for cement. An M20 concrete mix was tested with fly ash and Lytag lightweight aggregate. Fly ash replaced 5%, 10%, 15%, 20%, and 25% of the cement. The Compressive Strength (CS), Split Tensile Strength (STS), and deflection were measured at these percentages after 56 days. Concrete cube, cylinder, and beam specimens were examined in these tests. The results indicate that a 10% substitution of cement with fly ash, together with replacement of the coarse aggregate with Lytag lightweight aggregate, produced concrete that performed well in terms of mechanical properties and deflection. Cementitious composites have varying characteristics as the environment changes. Therefore, understanding their mechanical properties is crucial for safety reasons. CS, STS, and deflection are the essential properties of concrete. Machine learning (ML) approaches have become necessary for predicting the CS of concrete. The Artificial Fish Swarm Optimization (AFSO), Particle Swarm Optimization (PSO), and Harmony Search (HS) algorithms were investigated for the prediction of outcomes. This work explains the AFSO technique, which finds the optimal values of the weights in the model to complete the mathematical modeling.
This is supported by a boxplot, a standardized way of showing the dataset that is based on the minimum, the maximum, the sample median, and the first and third quartiles, and that graphically displays the distribution of a quantitative field. The correlation matrix and confidence interval were represented graphically using the corrplot method.
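
The five values a boxplot is built from (minimum, first quartile, median, third quartile, maximum) can be computed with the Python standard library, as in the sketch below. The function name `five_number_summary` and the strength values are illustrative assumptions. Quartile conventions also differ between tools, so the numbers may not match a given plotting package exactly.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3 and max, the values a boxplot is drawn from."""
    ordered = sorted(values)
    # method="inclusive" treats the data as the whole population; other
    # quartile conventions give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median, "q3": q3, "max": ordered[-1]}


# Made-up compressive strengths in MPa, for illustration only.
strengths = [24.0, 21.0, 27.0, 30.0, 18.0]
print(five_number_summary(strengths))
```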

Comparison of RANS, URANS, SAS and IDDES for the prediction of train crosswind characteristics

  • Xiao-Shuai Huo;Tang-Hong Liu;Zheng-Wei Chen;Wen-Hui Li;Hong-Rui Gao;Bin Xu
    • Wind and Structures / v.37 no.4 / pp.303-314 / 2023
  • In this study, two steady RANS turbulence models (SST k-ω and Realizable k-ε) and four unsteady turbulence models (URANS SST k-ω and Realizable k-ε, SST-SAS, and SST-IDDES) are evaluated with respect to their capacity to predict crosswind characteristics on high-speed trains (HSTs). All of the numerical simulations are compared with wind tunnel values and LES results to assess the accuracy of each turbulence model. Specifically, the surface pressure distributions, time-averaged aerodynamic coefficients, flow fields, and computational cost are studied to determine the suitability of the different models. The results suggest that the pressure distributions and aerodynamic forces predicted by the steady and transient RANS models are almost the same. In particular, both SAS and IDDES give predictions similar to the wind tunnel test and LES; the SAS model is therefore considered an attractive alternative to IDDES or LES in crosswind studies of trains. In addition, if the computational cost needs to be significantly reduced, the RANS SST k-ω model is shown to provide relatively reasonable results for the surface pressures and aerodynamic forces. As a result, the RANS SST k-ω model might be the most appropriate option for expensive aerodynamic optimizations of trains using machine learning (ML) techniques, because it balances solution accuracy and resource consumption.

Seismic Data Processing Using BERT-Based Pretraining: Comparison of Shotgather Arrays (BERT 기반 사전학습을 이용한 탄성파 자료처리: 송신원 모음 배열 비교)

  • Youngjae Shin
    • Geophysics and Geophysical Exploration / v.27 no.3 / pp.171-180 / 2024
  • The processing of seismic data involves analyzing earthquake wave data to understand the internal structure and characteristics of the Earth, which requires high computational power. Recently, machine learning (ML) techniques have been introduced to address these challenges and have been utilized in various tasks such as noise reduction and velocity model construction. However, most studies have focused on specific seismic data processing tasks, limiting the full utilization of similar features and structures inherent in the datasets. In this study, we compared the efficacy of using receiver-wise time-series data ("receiver array") and synchronized receiver signals ("time array") from shotgathers for pretraining a Bidirectional Encoder Representations from Transformers (BERT) model. To this end, shotgather data generated from a synthetic model containing faults was used to perform noise reduction, velocity prediction, and fault detection tasks. In the task of random noise reduction, both the receiver and time arrays showed good performance. However, for tasks requiring the identification of spatial distributions, such as velocity estimation and fault detection, the results from the time array were superior.
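
The two input layouts compared in this abstract can be illustrated with a toy shot gather. The sketch below assumes the gather is stored as receivers (rows) by time samples (columns) and that each row or column becomes one input token. Both are assumptions made for illustration, and the function names are hypothetical rather than taken from the paper.

```python
def receiver_array(shot_gather):
    """One token per receiver: each token is that receiver's full time series."""
    return [list(trace) for trace in shot_gather]


def time_array(shot_gather):
    """One token per time sample: each token holds all receivers at that instant."""
    return [list(samples) for samples in zip(*shot_gather)]


# Toy shot gather: 3 receivers (rows) x 4 time samples (columns), made-up amplitudes.
gather = [
    [0.0, 0.1, 0.2, 0.3],
    [1.0, 1.1, 1.2, 1.3],
    [2.0, 2.1, 2.2, 2.3],
]
print(len(receiver_array(gather)), "tokens of length", len(receiver_array(gather)[0]))
print(len(time_array(gather)), "tokens of length", len(time_array(gather)[0]))
```

Under this reading, the time array is the transpose of the receiver array, so each token carries a spatial snapshot across receivers rather than a single trace.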

Quantitative Estimation Method for ML Model Performance Change, Due to Concept Drift (Concept Drift에 의한 ML 모델 성능 변화의 정량적 추정 방법)

  • Soon-Hong An;Hoon-Suk Lee;Seung-Hoon Kim
    • KIPS Transactions on Software and Data Engineering / v.12 no.6 / pp.259-266 / 2023
  • It is very difficult to measure the performance of a machine learning model once it is in business service, so the operational department cannot manage model performance effectively. Academically, various studies have been conducted on concept drift detection methods to determine whether the model status is appropriate. The operational department wants to know the performance of the operating model quantitatively, but concept drift detection only indicates the state of the model in relation to the data; it cannot estimate the model's performance quantitatively. In this study, we propose a performance prediction model (PPM) that quantitatively estimates precision from the statistics of concept drift. The proposed model induces artificial drift in sampling data extracted from the training data, measures the precision on the sampling data, creates a dataset of drift and precision, and learns from it. The difference between the actual precision and the predicted precision is then compared on the test data to correct the error of the performance prediction model. The proposed PPM was applied to two models that can be used in real business: a loan underwriting model and a credit card fraud detection model. It was confirmed that the precision was effectively predicted.
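
The idea of learning a mapping from drift statistics to precision can be sketched with a toy example. Everything below is an assumption made for illustration: the threshold classifier standing in for the deployed model, the Gaussian toy data, the mean shift used as the drift statistic, and the straight-line fit. The abstract does not specify the authors' drift statistics or learner, and the error-correction step on test data is omitted.

```python
import random


def precision(labels, predictions):
    """Precision = true positives / predicted positives (0.0 if none predicted)."""
    predicted_pos = sum(predictions)
    true_pos = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    return true_pos / predicted_pos if predicted_pos else 0.0


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def model(x):
    """Stand-in for the deployed classifier: a fixed threshold at 0.5."""
    return 1 if x > 0.5 else 0


random.seed(0)
# Toy sample drawn from "training data": positives near 1.0, negatives near 0.0.
features = [random.gauss(1.0, 0.5) for _ in range(500)]
features += [random.gauss(0.0, 0.5) for _ in range(500)]
labels = [1] * 500 + [0] * 500
train_mean = sum(features) / len(features)

# Induce artificial drift by shifting every feature, then record pairs of
# (label-free drift statistic, measured precision).
drift_stats, precisions = [], []
for shift in [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]:
    drifted = [x + shift for x in features]
    drift_stats.append(sum(drifted) / len(drifted) - train_mean)
    precisions.append(precision(labels, [model(x) for x in drifted]))

# Learn the drift -> precision relationship and use it as the estimator.
slope, intercept = fit_line(drift_stats, precisions)
estimated = slope * 0.25 + intercept
print(f"estimated precision at a drift statistic of 0.25: {estimated:.3f}")
```

The point of the design is that the drift statistic needs no labels, so once the relationship is learned it can be evaluated on live data where true outcomes are not yet known.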

Domain Knowledge Incorporated Local Rule-based Explanation for ML-based Bankruptcy Prediction Model (머신러닝 기반 부도예측모형에서 로컬영역의 도메인 지식 통합 규칙 기반 설명 방법)

  • Soo Hyun Cho;Kyung-shik Shin
    • Information Systems Review / v.24 no.1 / pp.105-123 / 2022
  • Thanks to the remarkable success of Artificial Intelligence (A.I.) techniques, new possibilities for applying them to real-world problems have emerged. One prominent application is the bankruptcy prediction model, which is often used as a basic knowledge base for credit scoring models in the financial industry. As a result, there has been extensive research on how to improve the prediction accuracy of the model. However, despite their impressive performance, machine learning (ML)-based models are difficult to implement because of their intrinsic obscurity, especially when the field requires or values an explanation of the result obtained by the model. The financial domain is one of the areas where explanation matters to stakeholders such as domain experts and customers. In this paper, we propose a novel approach that incorporates financial domain knowledge into local rule generation to provide instance-level explanations for the bankruptcy prediction model. The results show that the proposed method successfully selects and classifies the extracted rules based on their feasibility and the information they convey to users.

Adverse Effects on EEGs and Bio-Signals Coupling on Improving Machine Learning-Based Classification Performances

  • SuJin Bak
    • Journal of the Korea Society of Computer and Information / v.28 no.10 / pp.133-153 / 2023
  • In this paper, we propose a novel approach to investigating brain-signal measurement technology using Electroencephalography (EEG). Traditionally, researchers have combined EEG signals with bio-signals (BSs) to enhance the classification performance of emotional states. Our objective was to explore the synergistic effects of coupling EEG and BSs, and to determine whether the combination EEG+BS improves the classification accuracy of emotional states compared to using EEG alone or combining EEG with pseudo-random signals (PS) generated arbitrarily by random generators. Employing four feature extraction methods, we examined four combinations (EEG alone, EEG+BS, EEG+BS+PS, and EEG+PS), using data from two widely used open datasets. Emotional states (task versus rest states) were classified using Support Vector Machine (SVM) and Long Short-Term Memory (LSTM) classifiers. Our results revealed that, with the highest-accuracy configuration (SVM-FFT), the average error rates of EEG+BS were 4.7% and 6.5% higher than those of EEG+PS and EEG alone, respectively. We also conducted a thorough analysis of EEG+BS by combining numerous PSs. The error rate of EEG+BS+PS displayed a V-shaped curve, initially decreasing due to the deep double descent phenomenon and then increasing due to the curse of dimensionality. Consequently, our findings suggest that the combination EEG+BS may not always yield promising classification performance.

Data-centric XAI-driven Data Imputation of Molecular Structure and QSAR Model for Toxicity Prediction of 3D Printing Chemicals (3D 프린팅 소재 화학물질의 독성 예측을 위한 Data-centric XAI 기반 분자 구조 Data Imputation과 QSAR 모델 개발)

  • ChanHyeok Jeong;SangYoun Kim;SungKu Heo;Shahzeb Tariq;MinHyeok Shin;ChangKyoo Yoo
    • Korean Chemical Engineering Research / v.61 no.4 / pp.523-541 / 2023
  • As 3D printers become more accessible, exposure to chemicals associated with 3D printing is becoming more frequent. However, research on the toxicity and harmfulness of chemicals generated by 3D printing is insufficient, and the performance of toxicity prediction using in silico techniques is limited by missing molecular structure data. In this study, a quantitative structure-activity relationship (QSAR) model based on a data-centric AI approach was developed to predict the toxicity of new 3D printing materials by imputing missing values in molecular descriptors. First, the MissForest algorithm was used to impute missing values in the molecular descriptors of hazardous 3D printing materials. Then, based on four different machine learning models (decision tree, random forest, XGBoost, SVM), a machine learning (ML)-based QSAR model was developed to predict the bioconcentration factor (Log BCF), octanol-air partition coefficient (Log Koa), and partition coefficient (Log P). Furthermore, the reliability of the data-centric QSAR model was validated with the Tree-SHAP (SHapley Additive exPlanations) method, one of the explainable artificial intelligence (XAI) techniques. The proposed MissForest-based imputation method yielded approximately 2.5 times more molecular structure data than the existing data. Based on the imputed dataset of molecular descriptors, the developed data-centric QSAR model achieved prediction performance of approximately 73%, 76%, and 92% for Log BCF, Log Koa, and Log P, respectively. Lastly, Tree-SHAP analysis demonstrated that the data-centric QSAR model achieved high prediction performance for toxicity information by identifying key molecular descriptors highly correlated with the toxicity indices. Therefore, the proposed QSAR model based on the data-centric XAI approach can be extended to predict the toxicity of potential pollutants in emerging printing chemicals and in chemical, semiconductor, or display processes.