• Title/Abstract/Keywords: Multiple Linear Regression (MLR)

Search results: 125 items (processing time: 0.031 s)

QSPR analysis for predicting heat of sublimation of organic compounds

  • 박유선;이종혁;박한웅;이성광
    • Analytical Science and Technology
    • /
    • Vol. 28, No. 3
    • /
    • pp.187-195
    • /
    • 2015
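  • The heat of sublimation is an important parameter for addressing environmental problems related to the dispersion of atmospheric organic pollutants and for assessing the risk of hazardous chemicals. However, measuring the heat of sublimation experimentally consumes considerable time and cost, and the experiment itself is complex and hazardous. In this study, therefore, a quantitative structure-property relationship (QSPR) approach was used to develop a model that predicts the heat of sublimation of organic compounds in a simple way. A population-based forward selection method was applied to select molecular descriptors suited to learning methods such as multiple linear regression and support vector machines. The individual models and the consensus models were internally validated by bootstrapping and y-randomization. The prediction performance on the external test data was improved by taking the applicability domain into account. According to the multiple linear regression model, the heat of sublimation is related to intermolecular dispersion forces, hydrogen bonding, electrostatic interactions, and dipole-dipole interactions.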

Vitamin C Tablet Assay by Near-Infrared Reflectance Spectrometry

  • Kargosha, Kazem;Ahmadi, Hamid;Nemati, Nader
    • The Korean Society of Near Infrared Spectroscopy: Conference Proceedings
    • /
    • The Korean Society of Near Infrared Spectroscopy, 2001, NIR-2001
    • /
    • pp.4111-4111
    • /
    • 2001
  • When a drug is prepared as a tablet, the active component represents only a small portion of the dosage form. The other components of the formulation include materials that assist dissolution, antioxidants, coloring agents and bulk fillers. Tablets are tested using approved methods that usually involve separation and subsequent quantification of the active component. Tablets may also be tested by near-infrared reflectance spectrometry (NIRS). In the present study, a novel and precise method based on NIRS and multivariate calibration was developed for the direct determination of ascorbic acid in vitamin C tablets. Two different tablet formulations were powdered in three different sizes, 63-125 $\mu$m, and examined. The spectral region of 4750-4950 cm$^{-1}$ was used and optimized for quantitative operations. Partial least squares (PLS) and multiple linear regression (MLR) methods were applied to this spectral region. The results of the optimized PLS and MLR methods showed that reproducibility increases with decreasing grain size, and that a standard error of calibration (SEP) of less than 1% w/w of ascorbic acid and a correlation coefficient of 0.998 can be achieved. The PLS method showed better results than MLR. Seven overdose and underdose samples (prepared in the laboratory to match marketed products) were tested by the proposed method and by the standard iodometric method. The correlation between the NIRS-predicted ascorbic acid values and the iodometric values was calculated ($R^2 = 0.9950$). Finally, the direct analysis of individual intact tablets in their unit-dose packages (blistering in aluminum and PVC foils) obtained from the market was also carried out, and a correlation coefficient of 0.9989 and an SEP of 0.931% w/w of ascorbic acid were achieved.

Application of Near Infrared Spectroscopy for Nondestructive Evaluation of Nitrogen Content in Ginseng

  • Lin, Gou-lin;Sohn, Mi-Ryeong;Kim, Eun-Ok;Kwon, Young-Kil;Cho, Rae-Kwang
    • The Korean Society of Near Infrared Spectroscopy: Conference Proceedings
    • /
    • The Korean Society of Near Infrared Spectroscopy, 2001, NIR-2001
    • /
    • pp.1528-1528
    • /
    • 2001
  • Ginseng cultivated in different countries or under different growing conditions generally differs in components such as saponin and protein, which relate to its efficacy and action. The protein content is estimated from the nitrogen content of ginseng radix. Nitrogen content can be determined by chemical analysis such as the Kjeldahl or extraction methods. However, these methods require long analysis times and result in environmental pollution and sample damage. In this work we investigated the possibility of non-destructive determination of nitrogen content in ginseng radix using near-infrared spectroscopy. Ginseng radix, the root of Panax ginseng C. A. Meyer, was studied. A total of 120 samples were used, consisting of 6 sample sets: 4-, 5- and 6-year-old Korean ginseng and 7-, 8- and 9-year-old Chinese ginseng. Each sample set contained 20 samples. Nitrogen content was measured by electronic analysis. NIR reflectance spectra were collected over the 1100 to 2500 nm spectral region with an InfraAlyzer 500C (Bran+Luebbe, Germany) equipped with a halogen lamp and a PbS detector, and data were collected at 2 nm intervals. The calibration models were built by multiple linear regression (MLR) and partial least squares (PLS) analysis using the IDAS and SESAME software. According to the electronic analysis, the mean nitrogen content of Korean ginseng differed from that of Chinese ginseng. The nitrogen content of ginseng generally tended to decrease once the cultivation period exceeded 6 years. The MLR calibration model with 8 wavelengths built with the IDAS software accurately predicted nitrogen content, with a correlation coefficient (R) of 0.985 and a standard error of prediction (SEP) of 0.855%. With the SESAME software, the MLR calibration with 9 wavelengths was selected as the best calibration, with R and SEP of 0.972 and 0.596%, respectively. The PLSR calibration model resulted in an R of 0.969 and an RMSEP of 0.630. This study shows that NIR spectroscopy could be applied to determine the nitrogen content of ginseng radix with high accuracy.

Prediction of the Toxicity of Dimethylformamide, Methyl Ethyl Ketone, and Toluene Mixtures by QSAR Modeling

  • Kim, Ki-Woong;Won, Yong Lim;Hong, Mun Ki;Jo, Jihoon;Lee, Sung Kwang
    • Bulletin of the Korean Chemical Society
    • /
    • Vol. 35, No. 12
    • /
    • pp.3637-3641
    • /
    • 2014
  • In this study, we analyzed the toxicity of mixtures of dimethylformamide (DMF) and methyl ethyl ketone (MEK) or DMF and toluene (TOL) and predicted their toxicity using quantitative structure-activity relationships (QSAR). A QSAR model for single substances and mixtures was analyzed using multiple linear regression (MLR) by taking into account the statistical parameters between the observed and predicted $EC_{50}$. After preprocessing, the best subsets of descriptors in the learning methods were determined using a 5-fold cross-validation method. Significant differences in physico-chemical properties such as boiling point (BP), specific gravity (SG), Reid vapor pressure (rVP), flash point (FP), low explosion limit (LEL), and octanol/water partition coefficient (Pow) were observed between the single substances and the mixtures. The $EC_{50}$ of the mixture of DMF and TOL was significantly lower than that of DMF. The mixture toxicity was directly related to the mixing ratio of TOL and MEK (MLR equation: $EC_{50} = 1.76997 - 1.12249 \times \mathrm{TOL} + 1.21045 \times \mathrm{MEK}$), as well as to SG, VP, and LEL (MLR equation: $EC_{50} = 15.44388 - 19.84549 \times \mathrm{SG} + 0.05091 \times \mathrm{VP} + 1.85846 \times \mathrm{LEL}$). These results show that QSAR-based models can be used to quantitatively predict the toxicity of mixtures used in manufacturing industries.
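
The two MLR equations quoted in this abstract can be evaluated directly. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the units and valid ranges of the inputs (mixing ratios, SG, VP, LEL) are not stated in the abstract, so the example inputs are placeholders only.

```python
def ec50_from_mixing_ratio(tol, mek):
    """EC50 from the mixing-ratio MLR equation quoted in the abstract.

    tol, mek: mixing ratios of toluene and methyl ethyl ketone
    (scale and units are an assumption; the abstract does not give them).
    """
    return 1.76997 - 1.12249 * tol + 1.21045 * mek


def ec50_from_properties(sg, vp, lel):
    """EC50 from the physico-chemical MLR equation quoted in the abstract.

    sg: specific gravity, vp: vapor pressure, lel: low explosion limit
    (units are an assumption; the abstract does not give them).
    """
    return 15.44388 - 19.84549 * sg + 0.05091 * vp + 1.85846 * lel


if __name__ == "__main__":
    # Placeholder inputs for illustration only, not data from the paper.
    print(ec50_from_mixing_ratio(tol=0.5, mek=0.5))
    print(ec50_from_properties(sg=0.9, vp=10.0, lel=1.5))
```

The signs of the coefficients carry the qualitative message of the first equation: a larger TOL term lowers the predicted $EC_{50}$, while a larger MEK term raises it.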

Applications of Machine Learning Models for the Estimation of Reservoir CO2 Emissions

  • 유지수;정세웅;박형석
    • Journal of Korean Society on Water Environment
    • /
    • Vol. 33, No. 3
    • /
    • pp.326-333
    • /
    • 2017
  • Lakes and reservoirs have been reported as important sources of carbon emissions to the atmosphere in many countries. Although field experiments and theoretical investigations based on fundamental gas exchange theory have quantified the Net Atmospheric Flux (NAF) in various climate regions, large uncertainties remain in global-scale estimates. Mechanistic models can be used to understand and estimate the temporal and spatial variations of the NAF by considering the complicated hydrodynamic and biogeochemical processes in a reservoir, but these models require extensive and expensive datasets and model parameters. On the other hand, data-driven machine learning (ML) algorithms are likely to be alternative tools for estimating the NAF in response to independent environmental variables. The objective of this study was to develop random forest (RF) and multi-layer artificial neural network (ANN) models for the estimation of the daily $CO_2$ NAF in Daecheong Reservoir, located on the Geum River in Korea, and to compare the models' performance against the multiple linear regression (MLR) model proposed in a previous study (Chung et al., 2016). As a result, the RF and ANN models showed much better performance in estimating the high NAF values, while the MLR model significantly underestimated them. A cross-validation with 10-fold random sampling was applied to evaluate the performance of the three models, and indicated that the ANN model performed best, followed by the RF and MLR models.

Comparison of daily solar flare peak flux forecast models based on regressive and neural network methods

  • Shin, Seulki;Lee, Jin-Yi;Moon, Yong-Jae
    • The Bulletin of The Korean Astronomical Society
    • /
    • Vol. 39, No. 1
    • /
    • pp.75.2-75.2
    • /
    • 2014
  • We have developed a set of daily solar flare peak flux forecast models using the multiple linear regression (MLR), auto regression (AR), and artificial neural network (ANN) methods. As input parameters, we consider solar activity data from January 1996 to December 2013, such as sunspot area, X-ray flare peak flux, the weighted total flux $T_F = 1 \times F_C + 10 \times F_M + 100 \times F_X$ of the previous day, the mean flare rates of a given McIntosh sunspot group (Zpc), and the Mount Wilson magnetic classification. We compute the hitting rate, defined as the fraction of events for which the absolute difference between the observed and predicted flare fluxes on a logarithmic scale is $\leq 0.5$. The three parameters most strongly related to the observed flare peak flux are as follows: weighted total flare flux of the previous day (r=0.5), Mount Wilson magnetic classification (r=0.33), and McIntosh sunspot group (r=0.3). The hitting rates for flares stronger than the M5 class, which are regarded as significant for space weather forecasting, are as follows: 30% for the auto regression method and 69% for the neural network method.
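
The weighted total flux and the hitting rate defined in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, $F_C$, $F_M$ and $F_X$ are assumed to be the previous day's totals for C-, M- and X-class flares, and the "logarithm scale" is assumed to be base 10.

```python
import math


def weighted_total_flux(f_c, f_m, f_x):
    """T_F = 1*F_C + 10*F_M + 100*F_X, as given in the abstract.

    f_c, f_m, f_x: previous-day totals for C-, M- and X-class flares
    (this reading of the symbols is an assumption).
    """
    return 1 * f_c + 10 * f_m + 100 * f_x


def hitting_rate(observed, predicted, tolerance=0.5):
    """Fraction of events with |log10(observed) - log10(predicted)| <= tolerance.

    Base-10 logarithm is an assumption; the abstract only says
    'logarithm scale'. Fluxes must be positive.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal length")
    hits = sum(
        1
        for obs, pred in zip(observed, predicted)
        if abs(math.log10(obs) - math.log10(pred)) <= tolerance
    )
    return hits / len(observed)


if __name__ == "__main__":
    # Placeholder fluxes for illustration only, not data from the paper.
    observed = [1e-5, 1e-4, 1e-6]
    predicted = [2e-5, 1e-6, 1e-6]
    print(weighted_total_flux(f_c=2, f_m=1, f_x=1))
    print(hitting_rate(observed, predicted))
```

With a tolerance of 0.5 in base-10 logarithm, a forecast counts as a hit when it is within a factor of about 3.16 of the observed flux.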

Development of robust Calibration for Determination Sweetness of Fuji Apple fruit using Near Infrared Reflectance Spectroscopy

  • Sohn, Mi-Ryeong;Kwon, Young-Kill;Cho, Rae-Kwang
    • Near Infrared Analysis
    • /
    • Vol. 2, No. 1
    • /
    • pp.55-58
    • /
    • 2001
  • The objective of this work was to investigate the influence of growing district and harvest year on the calibration for sweetness (Brix) determination of Fuji apple fruit using near infrared (NIR) reflectance spectroscopy, and to develop a calibration that is robust across these variations. The calibration models were based on the wavelength range of 1100-2500 nm and built by stepwise multiple linear regression. A calibration model built from the sample set of one growing district was not transferable to other growing districts. The combined calibration (data from three growing districts) predicted reasonably well against a population set drawn from all growing districts (SEP=0.69, Bias=0.075). Likewise, a calibration model built from the sample set of one harvest year was not transferable to other harvest years. The combined calibration (data from three harvest years) predicted well against a population set drawn from all harvest years (SEP=0.53, Bias=0.004).
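
The SEP and Bias figures reported here are standard validation statistics in NIR calibration. The sketch below shows one common convention, which the abstract does not spell out and which is therefore an assumption: bias is the mean of (predicted minus reference), and SEP is the bias-corrected standard deviation of those residuals with an n-1 denominator. The function names and sample values are hypothetical.

```python
import math


def bias(reference, predicted):
    """Mean residual, taken as predicted minus reference (sign convention assumed)."""
    residuals = [p - r for r, p in zip(reference, predicted)]
    return sum(residuals) / len(residuals)


def sep(reference, predicted):
    """Bias-corrected standard error of prediction (n - 1 denominator assumed)."""
    residuals = [p - r for r, p in zip(reference, predicted)]
    mean_residual = sum(residuals) / len(residuals)
    squared_dev = sum((e - mean_residual) ** 2 for e in residuals)
    return math.sqrt(squared_dev / (len(residuals) - 1))


if __name__ == "__main__":
    # Placeholder Brix values for illustration only, not data from the paper.
    reference = [12.0, 13.0, 14.0, 15.0]
    predicted = [12.5, 13.0, 14.5, 15.0]
    print(bias(reference, predicted))
    print(sep(reference, predicted))
```

Under this convention a calibration that fails to transfer between districts or years would typically show up as a large bias, a large SEP, or both on the new population.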

Concrete compressive strength prediction using the imperialist competitive algorithm

  • Sadowski, Lukasz;Nikoo, Mehdi;Nikoo, Mohammad
    • Computers and Concrete
    • /
    • Vol. 22, No. 4
    • /
    • pp.355-363
    • /
    • 2018
  • In the following paper, a socio-political heuristic search approach, named the imperialist competitive algorithm (ICA) has been used to improve the efficiency of the multi-layer perceptron artificial neural network (ANN) for predicting the compressive strength of concrete. 173 concrete samples have been investigated. For this purpose the values of slump flow, the weight of aggregate and cement, the maximum size of aggregate and the water-cement ratio have been used as the inputs. The compressive strength of concrete has been used as the output in the hybrid ICA-ANN model. Results have been compared with the multiple-linear regression model (MLR), the genetic algorithm (GA) and particle swarm optimization (PSO). The results indicate the superiority and high accuracy of the hybrid ICA-ANN model in predicting the compressive strength of concrete when compared to the other methods.

Quantitative structure activity relationship (QSAR) between chlorinated alkene ELUMO and their chlorine

  • Tang, Walter Z.;Wang, Fang
    • Advances in environmental research
    • /
    • Vol. 1, No. 4
    • /
    • pp.257-276
    • /
    • 2012
  • QSAR models for chlorinated alkenes were developed between $E_{LUMO}$ and their chlorine and carbon content. The aim is to provide a statistically validated QSAR model for $E_{LUMO}$ prediction. Different molecular descriptors, $N_{Cl}$, $N_C$ and $E_{HOMO}$, have been used to take into account relevant information provided by molecular features and physicochemical properties. The best models were selected using Partial Least Squares (PLS) and Multiple Linear Regression (MLR), which led to models with satisfactory predictive ability for a data set of 15 chlorinated alkene compounds.

Prediction of Thermal Decomposition Temperature of Polymers Using QSPR Methods

  • Ajloo, Davood;Sharifian, Ali;Behniafar, Hossein
    • Bulletin of the Korean Chemical Society
    • /
    • Vol. 29, No. 10
    • /
    • pp.2009-2016
    • /
    • 2008
  • The relationship between thermal decomposition temperature and structure for a new data set of eighty monomers of different polymers was studied by multiple linear regression (MLR). The stepwise method was used for variable selection. The best descriptors were selected from over 1400 descriptors, including topological, geometrical, electronic and hybrid descriptors. The effect of the number of descriptors on the correlation coefficient (R) and the F-ratio was considered. Two models were suggested: one with four descriptors ($R^2$ = 0.894, $Q^2_{cv}$ = 0.900, F = 172.1) and another with 13 descriptors ($R^2$ = 0.956, $Q^2_{cv}$ = 0.956, F = 125.4).