• Title/Summary/Keyword: Multiple Linear Regression(MLR)

Prediction of Retention Time for PAH Molecule in HPLC (고속액체 크로마토그래피에서 PAH분자의 구조에 따른 용리시간 예측)

  • Kim, Young-Gu
    • Journal of the Korean Chemical Society
    • /
    • v.44 no.2
    • /
    • pp.102-108
    • /
    • 2000
  • Relative retention times (RRTs) of PAH molecules in HPLC were fitted on training sets and predicted on testing sets using multiple linear regression (MLR) and an artificial neural network (ANN). The main descriptors in the QSRR are the molecular connectivity indices ($^1X_v,\;^2X_v$), the length-to-breadth ratio (L/B), and the molecular dipole moment (D). L/B, which is related to the slot model, is a good descriptor in the ANN but not in MLR. The variances, which indicate the accuracy of the predicted retention times in the testing sets, are 0.0099 and 0.0114 for ANN and MLR, respectively. These results show that the ANN can exceed MLR in prediction accuracy.
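
As a rough illustration of the train-then-test workflow described in this abstract, the sketch below fits an MLR model by ordinary least squares and scores it on a held-out set. It is not the author's code: the data are synthetic, the two columns merely stand in for descriptors such as L/B or D, and reading the reported "variance" as a mean squared prediction error is an assumption.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Return [intercept, b1, ..., bk] from the normal equations X'X b = X'y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


def mean_squared_error(coefs, rows, targets):
    errors = [(predict(coefs, r) - t) ** 2 for r, t in zip(rows, targets)]
    return sum(errors) / len(errors)


# Synthetic data generated from y = 1 + 2*x1 - 3*x2 (illustration only).
train_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
train_targets = [1.0, 3.0, -2.0, 0.0, 2.0]
test_rows = [(3, 2), (2, 3)]
test_targets = [1.0, -4.0]

coefs = fit_mlr(train_rows, train_targets)
test_mse = mean_squared_error(coefs, test_rows, test_targets)
print(coefs, test_mse)
```

With real descriptors the fit would not be exact, and the test-set error is the quantity one would compare between MLR and an ANN.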


A Study on the Partition Coefficients for Sulfur Compounds Related Composition of LPG (LPG 조성에 따른 황화합물의 분배계수에 관한 연구)

  • Kim, Yeong Gu
    • Journal of the Korean Chemical Society
    • /
    • v.46 no.6
    • /
    • pp.523-527
    • /
    • 2002
  • The partition coefficients of sulfur compounds in relation to the composition of LPG were studied. The sulfur compounds analysed were ethyl mercaptan, n-propyl mercaptan, and n-butyl mercaptan. The compositions of the liquid and gas phases of LPG were determined by gas chromatography. The partition coefficient, related to the boiling point of the sulfur compound, the temperature, and the solvent composition by multiple linear regression (MLR) in SAS, is as follows: Kpc= $0.61222({\pm}0.6578)-0.04670({\pm}0.000959)Bp+0.26984({\pm}0.06504)C4+0.003803({\pm}0.0019993)Tk,$ N=24, F=14.851, $R^2_{adj}$=0.6437. The boiling points of the sulfur compounds at atmospheric pressure and the composition of the LPG affect the partition coefficients most. It is presumed that the gas odor-enhancing effect increases at higher temperatures and with larger amounts of n-butane.
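
The fitted equation can be evaluated directly. The sketch below uses only the point estimates quoted in the abstract. The function and argument names are hypothetical, the abstract does not state the units of Bp, C4, or Tk, and reading C4 as the n-butane content is an assumption, so inputs must use whatever units the original study used.

```python
def partition_coefficient(bp, c4, tk):
    """Evaluate Kpc from the regression coefficients quoted in the abstract.

    bp: boiling point of the sulfur compound (units as in the original study)
    c4: C4 term of the LPG composition (assumed here to be n-butane content)
    tk: temperature (units as in the original study)
    """
    return 0.61222 - 0.04670 * bp + 0.26984 * c4 + 0.003803 * tk


# The signs of the coefficients match the abstract's conclusion: Kpc rises
# with temperature and with the C4 term, and falls with boiling point.
base = partition_coefficient(bp=1.0, c4=0.5, tk=1.0)
warmer = partition_coefficient(bp=1.0, c4=0.5, tk=2.0)
more_c4 = partition_coefficient(bp=1.0, c4=0.6, tk=1.0)
higher_bp = partition_coefficient(bp=2.0, c4=0.5, tk=1.0)
print(warmer > base, more_c4 > base, higher_bp < base)
```

The input values above are arbitrary placeholders chosen to show the direction of each effect, not measured data.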

Evaluation of Surrogate Monitoring Parameters for SS and T-P Using Multiple Linear Regression and Random Forest (다중 선형 회귀 분석과 랜덤 포레스트를 이용한 SS, T-P 대리모니터링 기법 평가)

  • Jeung, Minhyuk;Beom, Jina;Choi, Dongho;Kim, Young-joo;Her, Younggu;Yoon, Kwangsik
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.63 no.2
    • /
    • pp.51-60
    • /
    • 2021
  • Effective nonpoint source (NPS) pollution management requires frequent water quality monitoring, which is, however, often costly to implement in practice. Statistical techniques and machine learning methods allow us to identify and focus on fundamental environmental variables that have close relationships with NPS pollutants of interest. This study developed surrogate models to predict the concentrations of suspended sediment (SS) and total phosphorus (T-P) from turbidity and runoff discharge rates using multiple linear regression (MLR) and random forest (RF) methods. The RF models provided acceptable performance in predicting SS and T-P, especially when runoff discharge rates were high. The RF models outperformed the MLR models in all cases. This finding highlights the potential of RF techniques and models as a tool to identify fundamental environmental variables that are measured in relatively inexpensive ways or are freely available but can still provide the information required to quantify the concentrations of NPS pollutants. The analysis of relative importance showed that the temporal variations of SS and T-P concentrations could be explained more effectively by those of turbidity than by those of runoff discharge rate. This study demonstrated that advanced statistical techniques such as machine learning could help improve the efficiency of NPS pollutant monitoring.

Comparison of Artificial Neural Network and Empirical Models to Determine Daily Reference Evapotranspiration (기준 일증발산량 산정을 위한 인공신경망 모델과 경험모델의 적용 및 비교)

  • Choi, Yonghun;Kim, Minyoung;O'Shaughnessy, Susan;Jeon, Jonggil;Kim, Youngjin;Song, Weon Jung
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.60 no.6
    • /
    • pp.43-54
    • /
    • 2018
  • The accurate estimation of reference crop evapotranspiration ($ET_o$) is essential in irrigation water management to assess the time-dependent status of crop water use and irrigation scheduling. The importance of $ET_o$ has resulted in many direct and indirect methods to approximate its value, including pan evaporation, meteorological-based estimations, lysimetry, soil moisture depletion, and soil water balance equations. Artificial neural networks (ANNs) have been intensively implemented for process-based hydrologic modeling due to their superior performance in nonlinear modeling, pattern recognition, and classification. This study adapted two well-known ANN algorithms, the backpropagation neural network (BPNN) and the generalized regression neural network (GRNN), to evaluate their capability to accurately predict $ET_o$ using daily meteorological data. All data were obtained from two automated weather stations (Chupungryeong and Jangsu) located in Yeongdong-gun (2002-2017) and Jangsu-gun (1988-2017), respectively. Daily $ET_o$ was calculated using the Penman-Monteith equation as the benchmark method. These calculated values of $ET_o$ and the corresponding meteorological data were separated into training, validation, and test datasets. The performance of each ANN algorithm was evaluated against $ET_o$ calculated from the benchmark method and a multiple linear regression (MLR) model. The overall results showed that, in a statistical sense, the BPNN algorithm performed best, followed by the MLR and GRNN; this could help provide valuable information to farmers, water managers, and policy makers for effective agricultural water governance.
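
For readers unfamiliar with the benchmark, the sketch below computes daily $ET_o$ assuming the FAO-56 form of the Penman-Monteith equation. The abstract does not say which form or which input preprocessing the authors used, so the function names, argument names, and sample inputs are illustrative assumptions only.

```python
import math


def saturation_vapour_pressure(t_celsius):
    """Saturation vapour pressure in kPa at air temperature t_celsius."""
    return 0.6108 * math.exp(17.27 * t_celsius / (t_celsius + 237.3))


def reference_et(t_mean, net_radiation, soil_heat_flux, wind_2m,
                 e_s, e_a, pressure_kpa=101.3):
    """Daily reference evapotranspiration (mm/day), FAO-56 form (assumed).

    t_mean: mean daily air temperature at 2 m (deg C)
    net_radiation, soil_heat_flux: MJ m^-2 day^-1
    wind_2m: wind speed at 2 m (m/s)
    e_s, e_a: saturation and actual vapour pressure (kPa)
    pressure_kpa: atmospheric pressure (kPa)
    """
    # Slope of the saturation vapour pressure curve (kPa per deg C).
    delta = 4098.0 * saturation_vapour_pressure(t_mean) / (t_mean + 237.3) ** 2
    # Psychrometric constant (kPa per deg C).
    gamma = 0.000665 * pressure_kpa
    radiation_term = 0.408 * delta * (net_radiation - soil_heat_flux)
    wind_term = gamma * 900.0 / (t_mean + 273.0) * wind_2m * (e_s - e_a)
    return (radiation_term + wind_term) / (delta + gamma * (1.0 + 0.34 * wind_2m))


# Illustrative inputs for a mild summer day (not data from the study).
eto = reference_et(t_mean=16.9, net_radiation=13.28, soil_heat_flux=0.0,
                   wind_2m=2.078, e_s=1.997, e_a=1.409, pressure_kpa=100.1)
print(round(eto, 2))
```

In a study like this one, values computed this way would serve as the targets that the ANN and MLR models are trained to reproduce from the meteorological inputs.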

Assessment through Statistical Methods of Water Quality Parameters(WQPs) in the Han River in Korea

  • Kim, Jae Hyoun
    • Journal of Environmental Health Sciences
    • /
    • v.41 no.2
    • /
    • pp.90-101
    • /
    • 2015
  • Objective: This study was conducted to develop a chemical oxygen demand (COD) regression model using water quality monitoring data (January 2014) obtained from the Han River auto-monitoring stations. Methods: Surface water quality data from 198 sampling stations along the six major areas were assembled and analyzed to determine the spatial distribution and clustering of monitoring stations based on 18 WQPs, and for regression modeling using selected parameters. Statistical techniques, including a combined genetic algorithm-multiple linear regression (GA-MLR), cluster analysis (CA), and principal component analysis (PCA), were used to build a COD model from the water quality data. Results: The best GA-MLR model yielded a 5-descriptor COD model with satisfactory statistical results ($r^2=92.64$, $Q{^2}_{LOO}=91.45$, $Q{^2}_{Ext}=88.17$). This approach includes variable selection of the WQPs in order to find the most important factors affecting water quality. Additionally, ordination techniques such as PCA and CA were used to classify the monitoring stations. The biplot based on the first two principal components (PCs) of the PCA model identified three distinct groups of stations, which also differ in their correlation with the WQPs; this enables better interpretation of the water quality characteristics at particular stations as of January 2014. Conclusion: This data analysis procedure appears to provide an efficient means of modelling water quality by interpreting and defining its most essential variables, such as TOC and BOD. The water parameters selected in the COD model as contributing most to environmental health and water pollution can be used in water quality management strategies. At present, the river is under threat from anthropogenic disturbances during festival periods, especially in upstream areas.
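
The leave-one-out statistic $Q^2_{LOO}$ quoted above can be sketched as follows. For brevity this uses a single descriptor with synthetic data rather than the paper's 5-descriptor GA-MLR model, and it assumes the common definition $Q^2 = 1 - \mathrm{PRESS}/SS_{tot}$ with the overall mean of the response in $SS_{tot}$; the paper may define it differently.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Coefficient of determination of the fit on all data."""
    a, b = fit_line(xs, ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without observation i, then predict the held-out point.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Synthetic, nearly linear data (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
r2 = r_squared(xs, ys)
q2 = q_squared_loo(xs, ys)
print(r2, q2)
```

For least-squares fits, $Q^2_{LOO}$ cannot exceed $r^2$, which is consistent with the ordering of the values reported in the abstract.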

Multivariate Statistical Analysis Approach to Predict the Reactor Properties and the Product Quality of a Direct Esterification Reactor for PET Synthesis (다변량 통계분석법을 이용한 PET 중합공정 중 직접 에스테르화 반응기의 거동 및 생산제품 예측)

  • Kim Sung Young;Chung Chang Bock;Choi Soo Hyoung;Lee Bomsock
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.11 no.6
    • /
    • pp.550-557
    • /
    • 2005
  • Two multivariate statistical analysis methods, multiple linear regression (MLR) and partial least squares (PLS), were applied to predict the reactor properties and the product quality of a direct esterification reactor for polyethylene terephthalate (PET) synthesis. Both methods were applied to a data set that includes the flow rate of water vapor, the flow rate of EG vapor, the concentration of acid end groups in the product, and other operating conditions such as temperature, pressure, reaction time, and feed monomer mole ratio. Their regression and prediction abilities were also compared. The prediction results were critically compared, in terms of reliability, with actual plant data and with results from other mathematical models. This paper shows that the PLS method can be used for reasonably accurate prediction of the product quality of a direct esterification reactor in the PET synthesis process.

Nondestructive Determination of Humic Acids in Soils by Near Infrared Reflectance Spectroscopy

  • Seo, Sang-Hyun;Park, Woo-Churl;Cho, Rae-Kwang;Xiaori Han
    • Near Infrared Analysis
    • /
    • v.1 no.1
    • /
    • pp.31-35
    • /
    • 2000
  • Near-infrared reflectance spectroscopy (NIRS) was used to determine the humic acids in soil samples from fields with different crops and land uses across the Youngnam and Honam regions of Korea. An InfraAlyzer 500 scanning spectrophotometer was used to obtain near-infrared reflectance spectra of soil at 2-nm intervals from 1100 to 2500 nm. Multiple linear regression (MLR) or partial least squares regression (PLSR) was used to evaluate an NIRS method for the rapid and nondestructive determination of humic acid, fulvic acid, and their total content in soils. The raw spectral data (log 1/R) can be used to estimate humic acid, fulvic acid, and their total content in soil by an MLR procedure relating the content of a given constituent to the spectral response of several bands. Of these constituents, the prediction for fulvic acid was the best. Spectral data converted from the raw spectra, such as the first derivative of each spectrum, can also be used with the PLSR method to predict the humic acid and fulvic acid content of the soil samples. A low SEC and SEP and a high correlation coefficient in the calibration and validation stages enable selection of the best data manipulation. However, when the accuracy and precision of prediction are similar, the simpler calibration and prediction method should be selected. The NIRS technique may be an effective method for the rapid and nondestructive determination of humic acid, fulvic acid, and their total content in soils.

Application of Near Infrared Spectroscopy for Nondestructive Evaluation of Color Degree of Apple Fruit (사과 착색도의 비파괴측정을 위한 근적외분광분석법의 응용)

  • Sohn, Mi-Ryeong;Cho, Rae-Kwang
    • Food Science and Preservation
    • /
    • v.7 no.2
    • /
    • pp.155-159
    • /
    • 2000
  • Apple fruit grading is largely dependent on the degree of skin color. This work reports on the possibility of nondestructive assessment of apple fruit color using near-infrared (NIR) reflectance spectroscopy. NIR spectra of apple fruit were collected in the wavelength range of 1100~2500 nm using an InfraAlyzer 500C (Bran+Luebbe). Calibration was performed with the standard analysis procedures, MLR (multiple linear regression) and stepwise regression, by allowing the IDAS software to select the best regression equations from the raw spectra of the samples. The color degree of apple skin was expressed by 2 factors: anthocyanin content determined by purification and a-value measured by colorimeter. A total of 90 fruits were used, divided into a calibration set (54) and a prediction set (36). For determining a-value, the calibration model composed of 6 wavelengths (2076, 2120, 2276, 2488, 2072 and 1492 nm) provided the highest accuracy: the correlation coefficient is 0.913 and the standard error of prediction is 4.94. However, the prediction accuracy for anthocyanin content was rather low (R of 0.761).


A Study of Piping Leadtime Forecast in Offshore Plant’s Outfittings Procurement Management (해양플랜트 의장품 조달관리를 위한 배관 공정 리드타임 예측 모델에 관한 연구)

  • Ham, Dong Kyun;Back, Myung Gi;Park, Jung Goo;Woo, Jong Hun
    • Journal of the Society of Naval Architects of Korea
    • /
    • v.53 no.1
    • /
    • pp.29-36
    • /
    • 2016
  • In shipbuilding and offshore plant construction, pipe spools of various types are installed. They are numerous, and they must be installed successively. Because of these characteristics, the pipe spool installation process tends to cause schedule delays in the overall production process. To reduce delays, the goal of this study is to predict the production lead time before manufacturing. These predictions are expected to reduce the total production lead time by improving the process. We built MLR (multiple linear regression) and PLSR (partial least squares regression) models to predict the pipe spool lead time and then compared the predictive ability of the two models. If further explanatory variables are added, more precise prediction should be possible.

Study of Thiazoline Derivatives for the Design of Optimal Fungicidal Compounds Using Multiple Linear Regression (MLR)

  • Han, Won-Seok;Lee, Jin-Kak;Lee, Jun-Seok;Hahn, Hoh-Gyu;Yoon, Chang-No
    • Bulletin of the Korean Chemical Society
    • /
    • v.33 no.5
    • /
    • pp.1703-1706
    • /
    • 2012
  • Rice blast is the most serious disease of rice due to its harmfulness and its worldwide distribution. $Magnaporthe$ $grisea$ is the cause of rice blast disease and each year destroys enough rice to feed several tens of millions of people. Fungicides are commonly used to control rice blast. However, $M.$ $grisea$ acquires resistance to chemical treatments through genetic mutations. 2-Phenylimino-1,3-thiazolines were proposed as a novel class of fungicides against $M.$ $grisea$ in a previous study. To develop compounds with higher biological activity, a new series of 2-phenylimino-1,3-thiazolines was synthesized and its fungicidal activity against $M.$ $grisea$ was determined. A QSAR analysis was carried out on this series of 2-phenylimino-1,3-thiazolines. The QSAR results showed the dependence of fungicidal activity on the structural and physicochemical features of the 2-phenylimino-1,3-thiazolines. Our results could be used as guidelines for studying the mode of action and for the further design of optimal fungicides.