• Title/Summary/Keyword: Partial Least-Squares

Molecular modeling of COX-2 inhibitors: 3D-QSAR and docking studies

  • Kim, Hye-Jung;Chae, Chong-Hak;Yoo, Sung-Eun;Yi, Kyu-Yang;Park, Kyung-Lae
    • Proceedings of the PSK Conference / 2003.10b / pp.65.2-65.2 / 2003
  • Eighty-eight selective COX-2 inhibitors belonging to three chemical classes (triaryl rings, diaryl cycloalkanopyrazoles, and diphenyl hydrazides) were studied using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). Partial least squares analysis produced statistically significant models with q values of 0.84 and 0.79 for CoMFA and CoMSIA, respectively. The key spatial properties were identified by careful analysis of the isocontour maps. The binding energies calculated from flexible docking were correlated with inhibitory activities using a least-squares fit. (omitted)

Multivariate Procedure for Variable Selection and Classification of High Dimensional Heterogeneous Data

  • Mehmood, Tahir;Rasheed, Zahid
    • Communications for Statistical Applications and Methods / v.22 no.6 / pp.575-587 / 2015
  • Advances in data collection techniques produce high-dimensional data sets, in which discrimination is an important and commonly encountered problem that is crucial to resolve when the data are heterogeneous (the classes do not share a common variance-covariance structure). An example is the classification of microbial habitat preferences based on codon/bi-codon usage. Habitat preference is important for studying evolutionary genetic relationships and may help industry produce specific enzymes. Most classification procedures assume homogeneity (a common variance-covariance structure for all classes), which is not guaranteed in most high-dimensional data sets. We introduce regularized elimination in partial least squares coupled with quadratic discriminant analysis (rePLS-QDA) for parsimonious variable selection and classification of high-dimensional heterogeneous data sets. The method combines the recently introduced regularized elimination for variable selection in partial least squares (rePLS) with QDA, a classification procedure for heterogeneous classes. The proposed and existing methods are compared on a simulated data set; in addition, the proposed procedure is applied to classify microbial habitat preferences by their codon/bi-codon usage. Five bacterial habitats (Aquatic, Host Associated, Multiple, Specialized, and Terrestrial) are modeled. The classification accuracy for each habitat is satisfactory and ranges from 89.1% to 100% on test data. Codon/bi-codon usages of interest, and their mutual interactions that influence the respective habitat preferences, are identified. The proposed method also produced results that concur with known biological characteristics, which will help researchers better understand the divergence of species.

Prediction of Chemical Compositions for On-line Quality Measurement of Red Pepper Powder Using Near Infrared Reflectance Spectroscopy (NIRS)

  • Lee, Sun-Mee;Kim, Su-Na;Park, Jae-Bok;Hwang, In-Kyeong
    • Food Science and Biotechnology / v.14 no.2 / pp.280-285 / 2005
  • The applicability of near infrared reflectance spectroscopy (NIRS) was examined for quality control of red pepper powder in milling factories. Chemical composition was predicted using the modified partial least squares (MPLS) technique. A total of 51 red pepper powder samples for calibration and 21 for validation were analyzed by conventional methods. The standard error of prediction (SEP) and correlation coefficient ($R^2$) were 0.55 and 0.90 for moisture content, 8.58 and 0.96 for ASTA color value, 31.60 and 0.65 for capsaicinoid content, and 1.82 and 0.86 for total sugar content. SEP was low and $R^2$ was high for all properties except capsaicinoid content. The results indicate that, with slight improvement, on-line quality measurement with NIRS could be applied in red pepper milling factories.
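
The two validation statistics quoted in this abstract can be illustrated with a short sketch. The abstract does not define them, so the code assumes the common NIR convention that SEP is the bias-corrected standard deviation of the prediction residuals, and takes $R^2$ as the squared Pearson correlation between reference and predicted values. The function name `sep_and_r2` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def sep_and_r2(reference, predicted):
    """Return (bias-corrected SEP, squared Pearson correlation)."""
    n = len(reference)
    residuals = [p - r for p, r in zip(predicted, reference)]
    bias = sum(residuals) / n
    # SEP (assumed definition): standard deviation of residuals around the bias.
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))

    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((r - mean_ref) * (p - mean_pred)
              for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_ref * var_pred)
    return sep, r2


# Hypothetical moisture values (%); predictions are offset by a constant 0.5.
reference = [10.0, 11.0, 12.0, 13.0, 14.0]
predicted = [10.5, 11.5, 12.5, 13.5, 14.5]
sep, r2 = sep_and_r2(reference, predicted)
# A constant offset is pure bias, so sep is 0.0 and r2 is 1.0 here.
```

Under this definition a constant offset does not raise SEP, which is why bias is usually reported alongside it.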

Preprocessing and Calibration of Optical Diffuse Reflectance Signal for Estimation of Soil Physical and Chemical Properties in the Central USA (미국 중부 토양의 이화학적 특성 추정을 위한 광 확산 반사 신호 전처리 및 캘리브레이션)

  • La, Woo-Jung;Sudduth, Kenneth A.;Chung, Sun-Ok;Kim, Hak-Jin
    • Journal of Biosystems Engineering / v.33 no.6 / pp.430-437 / 2008
  • Optical diffuse reflectance sensing in visible and near-infrared wavelength ranges is one approach to rapidly quantify soil properties for site-specific management. The objectives of this study were to investigate effects of preprocessing of reflectance data and determine the accuracy of the reflectance approach for estimating physical and chemical properties of selected Missouri and Illinois, USA surface soils encompassing a wide range of soil types and textures. Diffuse reflectance spectra of air-dried, sieved samples were obtained in the laboratory. Calibrations relating spectra to soil properties determined by standard methods were developed using partial least squares (PLS) regression. The best data preprocessing, consisting of absorbance transformation and mean centering, reduced estimation errors by up to 20% compared to raw reflectance data. Good estimates ($R^2=0.83$ to 0.92) were obtained using spectral data for soil texture fractions, organic matter, and CEC. Estimates of pH, P, and K were not good ($R^2$ < 0.7), and other approaches to estimating these soil chemical properties should be investigated. Overall, the ability of diffuse reflectance spectroscopy to accurately estimate multiple soil properties across a wide range of soils makes it a good candidate technology for providing at least a portion of the data needed in site-specific management of agriculture.
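
The preprocessing named in this abstract (absorbance transformation followed by mean centering) might look like the sketch below. It assumes the usual apparent-absorbance definition log10(1/R) and column-wise centering across samples; the abstract does not give the exact formulas, and the function names and reflectance values are hypothetical.

```python
import math


def to_absorbance(spectrum):
    """Apparent absorbance log10(1/R) for reflectance values R in (0, 1]."""
    return [math.log10(1.0 / r) for r in spectrum]


def mean_center(spectra):
    """Subtract the per-wavelength mean across samples (column centering)."""
    n = len(spectra)
    means = [sum(col) / n for col in zip(*spectra)]
    centered = [[v - m for v, m in zip(row, means)] for row in spectra]
    return centered, means


# Hypothetical reflectance spectra: 3 samples x 4 wavelengths.
reflectance = [
    [0.10, 0.20, 0.40, 0.50],
    [0.20, 0.25, 0.50, 0.80],
    [0.40, 0.50, 0.80, 1.00],
]
absorbance = [to_absorbance(row) for row in reflectance]
centered, means = mean_center(absorbance)
```

The stored `means` would need to be subtracted from any new spectrum before it is passed to a calibration built on the centered data.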

Discrimination model of cultivation area of Corni Fructus using a GC-MS-Based metabolomics approach (GC-MS 기반 대사체학 기법을 이용한 산수유의 산지판별모델)

  • Leem, Jae-Yoon
    • Analytical Science and Technology / v.29 no.1 / pp.1-9 / 2016
  • It is believed that traditional Korean medicines can be managed more scientifically through the development of logical criteria to verify their region of cultivation, and that this could contribute to the advancement of the traditional herbal medicine industry. This study attempted to determine such criteria for Sansuyu. The volatile compounds were obtained from 20 samples of domestic Corni fructus (Sansuyu) and 45 samples of Chinese Sansuyu by steam distillation. The metabolites were identified by matching the gas chromatography/mass spectrometry (GC/MS) data of 53 training samples against the NIST Mass Spectral Library. Data binning at 0.2 min intervals was performed to normalize the number of variables used in the statistical analysis. Multivariate statistical analyses, such as principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), and orthogonal partial least squares-discriminant analysis (OPLS-DA), were performed using the SIMCA-P software package. Significant variables with a variable importance in the projection (VIP) score higher than 1.0 were obtained from OPLS-DA, and variables with a p-value of less than 0.05 in one-way ANOVA were selected to verify the marker compounds. Finally, among the 11 variables extracted, 1-ethylbutyl-hydroperoxide (9.089 min), nonadecane (20.170 min), butylated hydroxytoluene (25.319 min), 5β,7βH,10α-eudesm-11-en-1α-ol (25.921 min), 7,9-bis(2-methyl-2-propanyl)-1-oxaspiro[4.5]deca-6,9-diene-2,8-dione (34.257 min), and 2-decyldodecyl-benzene (54.717 min) were selected as markers to indicate the origin of Sansuyu. The statistical model developed was suitable for determining the geographical origin of Sansuyu. The cultivation areas of four Korean and eight Chinese Sansuyu samples were predicted with the established OPLS-DA model, and 11 of the 12 samples were classified correctly.

Development of Nondestructive Detection Method for Adulterated Powder Products Using Raman Spectroscopy and Partial Least Squares Regression (라만 분광법과 부분최소자승법을 이용한 불량 분말식품 비파괴검사 기술 개발)

  • Lee, Sangdae;Lohumi, Santosh;Cho, Byoung-Kwan;Kim, Moon S.;Lee, Soo-Hee
    • Journal of the Korean Society for Nondestructive Testing / v.34 no.4 / pp.283-289 / 2014
  • This study was conducted to develop a non-destructive detection method for adulterated powder products using Raman spectroscopy and partial least squares regression (PLSR). Garlic and ginger powders, which are used as natural seasonings and in health supplement foods, were selected for this experiment. Samples were adulterated with corn starch at concentrations of 5-35%. PLSR models for adulterated garlic and ginger powders were developed and their performance was evaluated using cross-validation. The $R^2_c$ and SEC of the optimal PLSR model were 0.99 and 2.16 for the garlic powder samples, and 0.99 and 0.84 for the ginger powder samples, respectively. The variable importance in projection (VIP) score is a simple and useful tool for evaluating the importance of each variable in a PLSR model. After pre-selection by VIP score, the Raman spectral data were reduced by one third. New PLSR models, based on the reduced number of wavelengths selected by the VIP scores, gave good predictions for the adulterated garlic and ginger powder samples.
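
To make the VIP score mentioned in this abstract concrete, the sketch below computes VIP scores for a single-response PLS model fitted with the NIPALS algorithm, using the widely used formula VIP_j = sqrt(p * sum_a(SS_a * w_ja^2) / sum_a(SS_a)), where SS_a is the response variance explained by component a and w_a is its unit-length weight vector. This is an illustration under those assumptions, not the authors' implementation; the function name `pls1_vip` and the data are hypothetical.

```python
import math


def pls1_vip(X, y, n_components):
    """VIP scores from a PLS1 model fitted with NIPALS (pure-Python sketch)."""
    n, p = len(X), len(X[0])
    # Center each column of X and center y.
    col_means = [sum(col) / n for col in zip(*X)]
    E = [[v - m for v, m in zip(row, col_means)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the y residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # no covariance left to model
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)

    if not explained:
        raise ValueError("y has no covariance with X; VIP is undefined")
    total = sum(explained)
    return [
        math.sqrt(p * sum(ss * w[j] ** 2
                          for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Hypothetical data: 6 samples x 3 variables; y depends mainly on variable 0.
X = [
    [1.0, 0.2, 5.0],
    [2.0, 0.1, 3.0],
    [3.0, 0.4, 4.0],
    [4.0, 0.3, 6.0],
    [5.0, 0.2, 2.0],
    [6.0, 0.5, 5.0],
]
y = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
vip = pls1_vip(X, y, n_components=2)
```

Because each weight vector has unit length, the squared VIP scores average to 1, which is the usual motivation for a "VIP greater than 1" selection threshold.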

Determination of water content in alcohol by portable near infrared (NIR) system (휴대용 분광분석기를 이용한 알코올 중에 함유되어 있는 물의 측정)

  • Ahn, Jhii-Weon;Woo, Young-Ah;Kim, Hyo-Jin
    • Analytical Science and Technology / v.16 no.2 / pp.95-101 / 2003
  • In this study, the water content of a mixture of methanol and ethanol was measured nondestructively by near infrared (NIR) spectroscopy. Two types of NIR instruments, a portable NIR system with a photodiode array and a scanning-type NIR spectrometer, were used and their calibration results were compared. Partial least squares regression (PLSR) was applied for calibration and validation in the quantitative analysis. The calibration results from both instruments correlated well with the actual values. The PLS calibration models predicted water concentration with a standard error of prediction (SEP) of 0.10% and 0.12% for the photodiode array and scanning-type instruments, respectively. Over 6 days, routine analyses of 3%, 5%, and 7% water in ethanol solution containing 2% methanol were performed to validate the robustness of the developed calibration model. The routine analyses gave a coefficient of variation (CV) within 3% for both types of NIR spectrometer. This study showed that water in a mixture of methanol and ethanol can be determined rapidly by NIR spectroscopy, and that the performance of the portable NIR system with a photodiode array detector was comparable to that of the scanning-type NIR spectrometer.
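
The robustness check in this abstract rests on the coefficient of variation of repeated predictions. A minimal sketch follows, assuming CV is the sample standard deviation divided by the mean and expressed as a percentage; the six daily values are hypothetical and are not the paper's measurements.

```python
import statistics


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Hypothetical predictions (% water) for a nominal 5% sample over 6 days.
daily = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0]
cv = coefficient_of_variation(daily)
# Mean is 5.0 and the sample standard deviation is sqrt(0.02),
# so cv is about 2.83, inside a 3% acceptance limit.
```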

The Analysis of R&D Investment Factors for Enhancing the Regional Domestic Competitiveness in China (중국의 지역 내 경쟁력 제고를 위한 R&D 투자요인 분석)

  • Yoon, Daisang;Lee, Jinho;Park, Sang-Hyun
    • Journal of Korea Technology Innovation Society / v.20 no.3 / pp.805-836 / 2017
  • Following its economic growth and accession to the WTO in 2001, China has become one of the Group of Two (G2) in almost all fields, including science and technology. The main reason is that the government had a strong intention to industrialize science and technology and linked science and technology to the economy. Typically, analyses of the cause of China's meteoric rise have assessed scientific and technological competitiveness using a score for the nation as a whole. In China, however, the pattern of development differs between the eastern, central, and western provinces. Industrialization and scientific and technological competitiveness also differ because power is decentralized to each province. Therefore, it is more meaningful to analyze the main factors of Chinese economic growth at the provincial level. In this study, we analyzed R&D competitiveness in China using 124 indexes in 31 areas. The data were analyzed by partial least squares regression. The results show that the scale of the area and the R&D capability of companies are very important factors for the total production of an area, and that journals, patents, the transfer of technical know-how, and R&D investment are the main factors for high-tech product exports. Based on these results, the factors that create differences in industrialization and scientific and technological competitiveness in China were analyzed. Finally, the findings should help in establishing policy for the development of industrialization and science and technology in Korea.

Comparison of Partial Least Squares and Support Vector Machine for the Flash Point Prediction of Organic Compounds (유기물의 인화점 예측을 위한 부분최소자승법과 SVM의 비교)

  • Lee, Chang Jun;Ko, Jae Wook;Lee, Gibaek
    • Korean Chemical Engineering Research / v.48 no.6 / pp.717-724 / 2010
  • The flash point is one of the most important physical properties used to determine the potential fire and explosion hazards of flammable liquids. Despite the need for experimental flash point data in the design and construction of chemical plants, there is often a significant gap between the demand for the data and their availability. This study built and compared two models, partial least squares (PLS) and support vector machine (SVM), to predict the experimental flash points of 893 organic compounds from DIPPR 801. As the independent variables of the models, 65 functional groups were chosen based on the group contribution method, which originates from the assumption that each fragment of a molecule contributes a certain amount to the value of its physical property; the logarithm of molecular weight was added as a further variable. The prediction errors calculated from cross-validation were used to determine the optimal parameters of the two models. An optimization technique is needed to obtain the three parameters of the SVM model, and this work adopted particle swarm optimization, a heuristic optimization method. Because the selection of training data can affect prediction performance, 100 data sets of randomly selected data were generated and tested. The average absolute errors over the whole data ranged from 13.86 K to 14.55 K for PLS and from 7.44 K to 10.26 K for SVM, indicating that the predictive ability of SVM is far superior to that of PLS.

Evaluating meteorological and hydrological impacts on forest fire occurrences using partial least squares-structural equation modeling: a case of Gyeonggi-do (부분최소제곱 구조방정식모형을 이용한 경기도 지역 산불 발생 요인에 대한 기상 및 수문학적 요인의 영향 분석)

  • Kim, Dongwook;Yoo, Jiyoung;Son, Ho Jun;Kim, Tae-Woong
    • Journal of Korea Water Resources Association / v.54 no.3 / pp.145-156 / 2021
  • Forest fires occur frequently around the world, and the damage they cause is increasing. In Korea, most forest fires are initiated by human activities, but climate factors such as temperature, humidity, and wind speed have a great impact on the combustion environment of forest fires. In this study, therefore, based on statistics of forest fires in Gyeonggi-do over the past five years, meteorological and hydrological factors (i.e., temperature, humidity, wind speed, precipitation, and drought) were selected in order to quantitatively investigate their causal relationships with forest fires. We applied a partial least squares structural equation model (PLS-SEM), which is suitable for analyzing causality and predicting latent variables. The overall results indicated that the measurement and structural models of the PLS-SEM were statistically significant for all evaluation criteria. The meteorological factors humidity, temperature, and wind speed affected forest fires with standardized path coefficients of -0.42, 0.23, and 0.15, respectively, whereas the hydrological factor drought had an effect of 0.23. Therefore, as a practical method, the suggested model can be used to analyze and evaluate the factors influencing forest fires and to plan the response to and preparation for forest fire disasters.