• Title/Summary/Keyword: GA-MLR

A DFT and QSAR Study of Several Sulfonamide Derivatives in Gas and Solvent

  • Abadi, Robabeh Sayyadi kord;Alizadehdakhel, Asghar;Paskiabei, Soghra Tajadodi
    • Journal of the Korean Chemical Society / v.60 no.4 / pp.225-234 / 2016
  • The activity of 34 sulfonamide derivatives has been estimated by means of multiple linear regression (MLR), artificial neural network (ANN), simulated annealing (SA) and genetic algorithm (GA) techniques. These models were also used to select the most efficient subsets of descriptors in a cross-validation procedure for non-linear -log(IC50) prediction. The results obtained using GA-ANN were compared with MLR-MLR, MLR-ANN, SA-ANN and GA-ANN approaches. A high predictive ability was observed for the MLR-MLR, MLR-ANN, SA-ANN and MLR-GA models, with root mean square errors (RMSE) of 0.3958, 0.1006, 0.0359, 0.0326 and 0.0282 in gas phase and 0.2871, 0.0475, 0.0268, 0.0376 and 0.0097 in solvent, respectively (N=34). The results obtained using the GA-ANN method indicated that the activity of the sulfonamide derivatives depends on different parameters, including the DP03, BID, AAC, RDF035v, JGI9, TIE, R7e+ and BELM6 descriptors in gas phase and the Mor32u, ESpm03d, RDF070v, ATS8m, MATS2e, R4p, L1u and R3m descriptors in solvent. In conclusion, a comparison of the quality of the ANN with the different MLR models showed that the ANN has a better predictive ability.
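
For reference, the RMSE values quoted in this abstract are presumably computed with the conventional definition below. The abstract does not state the formula, so this is an assumption; $y_i$ denotes an observed -log(IC50) value, $\hat{y}_i$ the model prediction, and $N=34$ the number of compounds.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}
```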

Prediction of Melting Point for Drug-like Compounds Using Principal Component-Genetic Algorithm-Artificial Neural Network

  • Habibi-Yangjeh, Aziz;Pourbasheer, Eslam;Danandeh-Jenagharad, Mohammad
    • Bulletin of the Korean Chemical Society / v.29 no.4 / pp.833-841 / 2008
  • Principal component-genetic algorithm-multiparameter linear regression (PC-GA-MLR) and principal component-genetic algorithm-artificial neural network (PC-GA-ANN) models were applied to the prediction of melting points for 323 drug-like compounds. A large number of theoretical descriptors were calculated for each compound. The first 234 principal components (PCs) were found to explain more than 99.9% of the variance in the original data matrix. From the pool of these PCs, the genetic algorithm was employed to select the best set of extracted PCs for the PC-MLR and PC-ANN models. The models were generated using fifteen PCs as variables. To evaluate the predictive power of the models, the melting points of 64 compounds in the prediction set were calculated. The root-mean-square errors (RMSE) for the PC-GA-MLR and PC-GA-ANN models are $48.18$ and $12.77\,^{\circ}\mathrm{C}$, respectively. Comparison of the results obtained by the models reveals the superiority of PC-GA-ANN over PC-GA-MLR and the recently proposed models (RMSE = $40.7\,^{\circ}\mathrm{C}$). The improvements are due to the fact that the melting points of the compounds show non-linear correlations with the principal components.

Detection of Multiple Outliers Using a Genetic Algorithm

  • Go Yeong-Hyeon;Lee Hye-Seon;Jeon Chi-Hyeok
    • Proceedings of the Korean Statistical Society Conference / 2000.11a / pp.173-179 / 2000
  • A genetic algorithm (GA) is applied to the detection of multiple outliers. GA is a heuristic optimization tool that searches for near-optimal solutions. We compare the performance of GA with that of other diagnostic measures commonly used for detecting outliers in regression models. The results show that GA appears to perform better than the other measures in detecting multiple outliers.

Quantitative Structure-Activity Relationships for Radical Scavenging Activities of Flavonoid Compounds by GA-MLR Technique

  • Om, Ae-Son;Ryu, Jae-Chun;Kim, Jae-Hyoun
    • Molecular & Cellular Toxicology / v.4 no.2 / pp.170-176 / 2008
  • The quantitative structure-activity relationship (QSAR) of a set of 35 flavonoid compounds presenting antioxidant activity was established by means of the Genetic Algorithm-Multiple Linear Regression (GA-MLR) technique. Four-parameter models for two sets of data, the 1,1-diphenyl-2-picryl hydrazyl (DPPH) radical scavenging activity ($R^2=0.788$, $Q^2_{cv}=0.699$ and $Q^2_{ext}=0.577$) and the scavenging activity of reactive oxygen species (ROS) induced by $H_2O_2$ ($R^2=0.829$, $Q^2_{cv}=0.754$ and $Q^2_{ext}=0.573$), were obtained on a mass basis, both with low external predictive ability. Each model revealed somewhat different mechanistic aspects of the radical scavenging activity of the flavonoid compounds tested. Topological charge, H-bonding complex and deprotonation processes were likely to be involved in the radical scavenging activity.

Quantitative Structure-Activity Relationship (QSAR) Study of New Fluorovinyloxyacetamides

  • Jo, Du Ho;Lee, Seong Gwang;Kim, Beom Tae;No, Gyeong Tae
    • Bulletin of the Korean Chemical Society / v.22 no.4 / pp.388-394 / 2001
  • Quantitative structure-activity relationships (QSAR) have been established for 57 fluorovinyloxyacetamide compounds to correlate and predict EC50 values. A genetic algorithm (GA) and multiple linear regression (MLR) analysis were used to select the descriptors and to generate the equations that relate the structural features to the biological activities. The resulting equation consists of three descriptors calculated from the molecular structures with molecular mechanics and quantum-chemical methods. The results of MLR and GA show that the z-axis component of the dipole moment, the radius of gyration and logP play an important role in the growth inhibition of barnyard grass.
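
The GA-plus-MLR descriptor selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`solve`, `fit_rss`, `ga_select`, `demo_data`), the GA settings (tournament selection, union crossover, 20% mutation, elitism), the residual-sum-of-squares fitness and the synthetic data are all assumptions chosen for illustration. Only the idea of a GA searching for a three-descriptor subset that is then fitted by MLR comes from the abstract.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_rss(X, y, cols):
    """Fit y ~ intercept + selected columns by least squares; return the residual sum of squares."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for i in range(p):
        xtx[i][i] += 1e-9  # tiny ridge term (an assumption) to guard against singular matrices
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))

def ga_select(X, y, k=3, pop_size=30, generations=40, seed=0):
    """Search for a k-descriptor subset with low MLR residual error using a simple GA."""
    rng = random.Random(seed)
    n_desc = len(X[0])
    cache = {}

    def cost(ch):
        if ch not in cache:
            cache[ch] = fit_rss(X, y, ch)
        return cache[ch]

    # Each chromosome is a sorted tuple of k distinct descriptor indices.
    pop = [tuple(sorted(rng.sample(range(n_desc), k))) for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        nxt = [best]  # elitism: the best subset so far always survives
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 3), key=cost)  # tournament selection
            p2 = min(rng.sample(pop, 3), key=cost)
            genes = sorted(set(p1) | set(p2))
            child = rng.sample(genes, k)            # crossover: draw from the parents' union
            if rng.random() < 0.2:                  # mutation: swap in an unused descriptor
                unused = [d for d in range(n_desc) if d not in child]
                child[rng.randrange(k)] = rng.choice(unused)
            nxt.append(tuple(sorted(child)))
        pop = nxt
        best = min(pop, key=cost)
    return best

def demo_data(n=40, n_desc=8, seed=1):
    """Synthetic data in which only descriptors 1, 4 and 6 influence the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_desc)] for _ in range(n)]
    y = [2.0 * r[1] - 1.5 * r[4] + 0.8 * r[6] + rng.gauss(0, 0.05) for r in X]
    return X, y
```

Calling `ga_select(*demo_data(), k=3)` returns the tuple of selected column indices. In practice the fitness would more likely be a cross-validated statistic than the raw residual sum, since the latter always favours larger models when the subset size is allowed to vary.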

Assessment through Statistical Methods of Water Quality Parameters (WQPs) in the Han River in Korea

  • Kim, Jae Hyoun
    • Journal of Environmental Health Sciences / v.41 no.2 / pp.90-101 / 2015
  • Objective: This study was conducted to develop a chemical oxygen demand (COD) regression model using water quality monitoring data (January 2014) obtained from the Han River auto-monitoring stations. Methods: Surface water quality data from 198 sampling stations along the six major areas were assembled and analyzed to determine the spatial distribution and clustering of monitoring stations based on 18 WQPs, and to build regression models using selected parameters. Statistical techniques, including combined genetic algorithm-multiple linear regression (GA-MLR), cluster analysis (CA) and principal component analysis (PCA), were used to build a COD model from the water quality data. Results: The best GA-MLR model was a 5-descriptor COD model with satisfactory statistical results ($r^2=92.64$, $Q^2_{LOO}=91.45$, $Q^2_{Ext}=88.17$). This approach includes variable selection among the WQPs in order to find the most important factors affecting water quality. Additionally, ordination techniques such as PCA and CA were used to classify the monitoring stations. The biplot based on the first two principal components (PCs) of the PCA model identified three distinct groups of stations that also differ in their correlation with the WQPs, which enables better interpretation of the water quality characteristics at particular stations as of January 2014. Conclusion: This data analysis procedure appears to provide an efficient means of modelling water quality by interpreting and defining its most essential variables, such as TOC and BOD. The water parameters selected in the COD model as contributing most to environmental health and water pollution can be used in water quality management strategies. At present, the river is under threat from anthropogenic disturbances during festival periods, especially in upstream areas.
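
The leave-one-out statistic $Q^2_{LOO}$ reported in this abstract is conventionally computed as $1-\mathrm{PRESS}/SS_{tot}$, where PRESS sums the squared errors of predictions made for each sample by a model fitted without it. The sketch below shows that calculation for a single hypothetical predictor. The abstract's model has five descriptors and does not state its exact formula, so the function names and the one-variable simplification are assumptions.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r2(x, y):
    """Coefficient of determination of the model fitted on all samples."""
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - rss / ss_tot

def q2_loo(x, y):
    """Leave-one-out Q^2: refit without sample i, then predict sample i."""
    press = 0.0
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    my = sum(y) / len(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot
```

For ordinary least squares, each leave-one-out residual is at least as large in magnitude as the corresponding fitted residual, so `q2_loo` never exceeds `r2` on the same data. This matches the ordering of the values reported in the abstract.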

Prediction of unconfined compressive strength ahead of tunnel face using measurement-while-drilling data based on hybrid genetic algorithm

  • Liu, Jiankang;Luan, Hengjie;Zhang, Yuanchao;Sakaguchi, Osamu;Jiang, Yujing
    • Geomechanics and Engineering / v.22 no.1 / pp.81-95 / 2020
  • Measurement of the unconfined compressive strength (UCS) of rock is critical for assessing the quality of the rock mass ahead of a tunnel face. In this study, extensive field studies were conducted along 3,885 m of the new Nagasaki tunnel in Japan. To predict UCS, a hybrid artificial neural network (ANN) model based on genetic algorithm (GA) optimization was developed. A total of 1350 datasets, comprising six parameters of the Measurement-While-Drilling data and the UCS, were considered as input and output parameters, respectively. Multiple linear regression (MLR) and the ANN were employed to develop contrast models. The results reveal that the developed GA-ANN hybrid model can predict UCS with higher performance than the ANN and MLR models. This study is of great significance for accurately and effectively evaluating the quality of rock masses in tunnel engineering.

Chemical Oxygen Demand (COD) Model for the Assessment of Water Quality in the Han River, Korea

  • Kim, Jae Hyoun;Jo, Jinnam
    • Journal of Environmental Health Sciences / v.42 no.4 / pp.280-292 / 2016
  • Objectives: The objective of this study was to build COD regression models for the Han River and evaluate water quality. Methods: Water quality data sets for the dry season (as of January) during a four-year period (2012-2015) were collected from the database of the Han River automatic water quality monitoring stations. Statistical techniques, including combined genetic algorithm-multiple linear regression (GA-MLR), were used to build five-descriptor COD models. Multivariate statistical techniques such as principal component analysis (PCA) and cluster analysis (CA) are useful tools for extracting meaningful information. Results: The $r^2$ values of the best COD models were high (> 0.8) in every year between 2012 and 2015. Total organic carbon (TOC) was a surrogate indicator for COD (as COD/TOC) with high reliability ($r^2=0.63$ in 2012, $r^2=0.75$ in 2013, $r^2=0.79$ in 2014 and $r^2=0.85$ in 2015). The COD/TOC ratios were calculated as 2.08 in 2012, 1.79 in 2013, 1.52 in 2014 and 1.45 in 2015, indicating that biodegradability in the water body of the Han River was being sustained, thereby further improving water quality. The BOD/COD ratio supported these findings. The cluster analysis revealed higher annual levels of microorganisms and phosphorus at stations along the Hangang-Seoul and Hantangang areas. Nevertheless, the overall water quality over the last four years showed an observable trend toward continuous improvement. These findings also suggest that non-point pollution control strategies should consider the influence of upstream and downstream areas to protect water quality in the Han River. Conclusion: This data analysis procedure provided an efficient and comprehensive tool for interpreting complex water quality data matrices. Results from a trend analysis provided important information about sources and parameters for Han River water quality management.

A Study on the Hydroclimatic Effects on the Estimation of Annual Actual Evapotranspiration Using Watershed Water Balance

  • Rim, Chang-Soo;Lim, Ga-Hui;Yoon, Sei-Eui
    • Journal of Korea Water Resources Association / v.44 no.12 / pp.915-928 / 2011
  • The main purpose of this study is to understand the effects of hydroclimatic factors on annual actual evapotranspiration and to propose multiple linear regression (MLR) equations for estimating annual actual evapotranspiration from watersheds. To this end, 5 dam watersheds (Goesan dam, Seomjingang dam, Soyanggang dam, Andong dam, Hapcheon dam) were selected as study watersheds, and annual actual evapotranspiration was estimated from an annual water balance analysis of each watershed. The annual actual evapotranspiration estimated from the water balance analysis was used to evaluate the MLR equations. Furthermore, the possibility of estimating actual evapotranspiration using potential evapotranspiration equations (Penman equation, FAO P-M equation, Makkink equation, Priestley-Taylor equation, Hargreaves equation) was evaluated. It was found that potential evapotranspiration is not appropriate for estimating actual evapotranspiration, because the correlation between actual and potential evapotranspiration is very low. The comparison of the MLR equations with current actual evapotranspiration equations indicates that the MLR equations can be used to estimate annual actual evapotranspiration. Furthermore, the effects of hydroclimatic factors on annual actual evapotranspiration differ between dam watersheds; however, precipitation was the most important climatic factor affecting the estimation of annual actual evapotranspiration in all watersheds.
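
An annual watershed water balance of the kind mentioned in this abstract is commonly written as ET = P − Q − ΔS (precipitation minus runoff minus storage change, all as depths over the watershed). The abstract does not give the formulation the authors used, so the sketch below is an assumption. The function names and the numbers in the usage note are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 86400

def runoff_depth_mm(mean_flow_m3s, area_km2):
    """Convert a mean annual discharge (m^3/s) into a runoff depth (mm) over the watershed."""
    volume_m3 = mean_flow_m3s * SECONDS_PER_YEAR
    depth_m = volume_m3 / (area_km2 * 1e6)  # 1 km^2 = 1e6 m^2
    return depth_m * 1000.0

def annual_actual_et_mm(precip_mm, runoff_mm, storage_change_mm=0.0):
    """Annual water balance: ET = P - Q - dS, with dS often assumed to be near zero."""
    return precip_mm - runoff_mm - storage_change_mm
```

As a usage illustration with made-up inputs, a mean discharge of 10 m³/s from a 315.36 km² watershed corresponds to a runoff depth of about 1000 mm, so 1500 mm of annual precipitation would imply roughly 500 mm of actual evapotranspiration if the storage change is neglected.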

Concrete compressive strength prediction using the imperialist competitive algorithm

  • Sadowski, Lukasz;Nikoo, Mehdi;Nikoo, Mohammad
    • Computers and Concrete / v.22 no.4 / pp.355-363 / 2018
  • In the following paper, a socio-political heuristic search approach, named the imperialist competitive algorithm (ICA) has been used to improve the efficiency of the multi-layer perceptron artificial neural network (ANN) for predicting the compressive strength of concrete. 173 concrete samples have been investigated. For this purpose the values of slump flow, the weight of aggregate and cement, the maximum size of aggregate and the water-cement ratio have been used as the inputs. The compressive strength of concrete has been used as the output in the hybrid ICA-ANN model. Results have been compared with the multiple-linear regression model (MLR), the genetic algorithm (GA) and particle swarm optimization (PSO). The results indicate the superiority and high accuracy of the hybrid ICA-ANN model in predicting the compressive strength of concrete when compared to the other methods.