• Title/Summary/Keyword: Partial least squares (PLS)


Differentiation of Roots of Glycyrrhiza Species by 1H Nuclear Magnetic Resonance Spectroscopy and Multivariate Statistical Analysis

  • Yang, Seung-Ok;Hyun, Sun-Hee;Kim, So-Hyun;Kim, Hee-Su;Lee, Jae-Hwi;Whang, Wan-Kyun;Lee, Min-Won;Choi, Hyung-Kyoon
    • Bulletin of the Korean Chemical Society / v.31 no.4 / pp.825-828 / 2010
  • To classify Glycyrrhiza species, samples of different species were analyzed by a $^1$H NMR-based metabolomics technique. Partial least squares discriminant analysis (PLS-DA) was used for multivariate statistical analysis of the $^1$H NMR data sets. There was a clear separation between the Glycyrrhiza species in the PLS-DA-derived score plots. The PLS-DA model was validated, and the key metabolites contributing to the separation in the score plots were lactic acid, alanine, arginine, proline, malic acid, asparagine, choline, glycine, glucose, sucrose, 4-hydroxyphenylacetic acid, and formic acid. The compounds present at relatively high levels were glucose and 4-hydroxyphenylacetic acid in G. glabra; lactic acid, alanine, and proline in G. inflata; and arginine, malic acid, and sucrose in G. uralensis. This is the first study to perform global metabolomic profiling and differentiation of Glycyrrhiza species using $^1$H NMR and multivariate statistical analysis.

A Statistical Approach to Screening Product Design Variables for Modeling Product Usability (사용편의성에 영향을 미치는 제품 설계 변수의 통계적 선별 방법)

  • Kim, Jong-Seo;Han, Seong-Ho
    • Journal of the Ergonomics Society of Korea / v.19 no.3 / pp.23-37 / 2000
  • Usability is one of the most important factors affecting a customer's decision to purchase a product. Several studies have modeled the relationship between product design variables and product usability. Because hundreds of design variables may need to be considered in such a model, a variable screening method is required. In most Kansei engineering studies, variable screening has traditionally been based on expert opinion (Expert screening). This study proposes statistical methods for screening important design variables using principal component regression (PCR), cluster analysis, and the partial least squares (PLS) method. Product variables with a high effect (PCR screening and PLS screening) or representative variables (Cluster screening) can be used to model usability. The proposed screening methods were used to model the usability of 36 audio/visual products. The three analysis methods (PCR, Cluster, and PLS) showed better model performance than Expert screening in terms of $R^2$, the number of variables in the model, and PRESS. These methods are expected to be useful for screening product design variables efficiently.
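
PRESS, one of the comparison criteria named in the abstract above, is commonly computed by leave-one-out prediction: each observation is predicted from a model fitted without it, and the squared prediction errors are summed. The sketch below is only an illustration under stated assumptions. It uses a single-predictor least-squares line rather than the multivariate PCR or PLS models of the study, and the function names `fit_line` and `press` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the x values are not all identical
    return mean_y - slope * mean_x, slope


def press(xs, ys):
    """Leave-one-out prediction error sum of squares (lists in, float out)."""
    total = 0.0
    for i in range(len(xs)):
        # Refit the model without observation i, then predict observation i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total
```

A lower PRESS indicates that the model predicts held-out observations more closely, which is why it is often preferred to a fit statistic computed on the training data alone.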


Compositional Analysis of Naphtha by FT-Raman Spectroscopy

  • 구민식;정호일
    • Bulletin of the Korean Chemical Society / v.20 no.2 / pp.159-162 / 1999
  • Three compositional parameters of naphtha (total paraffin, total naphthene, and total aromatic content) have been successfully analyzed using FT-Raman spectroscopy. Partial least squares (PLS) regression was used to develop a calibration model for each component from the Raman spectral bands. The PLS calibration results showed good correlation with those of gas chromatography (GC). Using PLS regression, the spectral information related to each component was successfully extracted from the highly overlapped Raman spectra of naphtha.
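
As a rough illustration of how a PLS calibration extracts composition-related information from overlapping spectral variables, the following sketch fits a one-component PLS1 model in plain Python. It is not the authors' procedure: the single latent component, the function name `pls1_one_component`, and the list-of-rows data layout are all assumptions made for brevity, and a practical calibration would normally use several components chosen by validation.

```python
import math


def pls1_one_component(X, y):
    """Fit a one-component PLS1 model and return a predict(row) function.

    X is a list of spectra (rows of equal length); y is the reference value
    for each spectrum. Assumes y is not constant and X has some variance.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: direction in X-space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Scores (projection of each spectrum onto w) and the y-loading q.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(row):
        score = sum((row[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict
```

Because the weight vector combines all spectral variables at once, correlated (overlapping) bands contribute jointly to a single score instead of being selected one at a time.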

Predicting the Soluble Solids of Apples by Near Infrared Spectroscopy (II) - PLS and ANN Models - (근적외선을 이용한 사과의 당도예측 (II) - 부분최소제곱 및 인공신경회로망 모델 -)

  • ;W. R. Hruschka;J. A. Abbott;;B. S. Park
    • Journal of Biosystems Engineering / v.23 no.6 / pp.571-582 / 1998
  • Partial least squares (PLS) and artificial neural network (ANN) models were introduced to predict the soluble solids content of apples, to be followed by the selection of a photosensor. For the optimal PLS model, the number of factors used in the spectrum analysis was increased until the prediction residual error sum of squares converged. The analysis showed that a model using only part of the full wavelength range, with no pretreatment, can perform better. The best PLS model was found in the 800 to 1,100 nm wavelength region without second-derivative pretreatment, with $R^2$=0.9236, bias=-0.0198 °Bx, and SEP=0.2527 °Bx for unknown samples. For the ANN model, on the other hand, the second derivative led to higher performance. In the 800 to 1,100 nm wavelength region, the prediction model with second-derivative pretreatment reached $R^2$=0.9177 and SEP=0.2903 °Bx for unknown samples, compared with $R^2$=0.7507 and SEP=0.4622 °Bx without pretreatment.


Discrimination Model of Cultivation Area of Alismatis Rhizoma using a GC-MS-Based Metabolomics Approach (GC-MS 기반 대사체학 기법을 이용한 택사의 산지판별모델)

  • Leem, Jae-Yoon
    • YAKHAK HOEJI / v.60 no.1 / pp.29-35 / 2016
  • Traditional Korean medicines may be managed more scientifically through the development of logical criteria for verifying their cultivation region, which would help advance the traditional herbal medicine industry. Volatile compounds were obtained by steam distillation from 14 samples of domestic Taeksa and 30 samples of Chinese Taeksa. The metabolites in the gas chromatography/mass spectrometry (GC/MS) data of 35 training samples were identified using the NIST mass spectral library. Multivariate statistical analyses, namely Principal Component Analysis (PCA), Partial Least Squares Discriminant Analysis (PLS-DA), and Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA), were performed on the qualitative and quantitative data. Finally, trans-(2,3-diphenylcyclopropyl)methyl phenyl sulfoxide (47.265 min), 1,2,3,4-tetrahydro-1-phenyl-naphthalene (47.781 min), spiro[4-oxatricyclo[5.3.0.0.(2,6)]decan-3-one-5,2'-cyclohexane] (54.62 min), 6-[7-nitrobenzofurazan-4-yl]amino-morphinan-4,5-epoxy (54.86 min), and p-hydroxynorephedrine (55.14 min) were determined as candidate marker metabolites for verifying the origin of Taeksa. The statistical model for determining the origin of Taeksa was well established. The cultivation areas of the test samples (3 domestic and 6 Chinese Taeksa) were predicted by the established OPLS-DA model, and all 9 samples were confirmed to be correctly classified.

Partial Least Squares Based Gene Expression Analysis in EBV-Positive and EBV-Negative Posttransplant Lymphoproliferative Disorders

  • Wu, Sa;Zhang, Xin;Li, Zhi-Ming;Shi, Yan-Xia;Huang, Jia-Jia;Xia, Yi;Yang, Hang;Jiang, Wen-Qi
    • Asian Pacific Journal of Cancer Prevention / v.14 no.11 / pp.6347-6350 / 2013
  • Post-transplant lymphoproliferative disorder (PTLD) is a common complication of therapeutic immunosuppression after organ transplantation. Gene expression profiling facilitates the identification of biological differences between Epstein-Barr virus (EBV) positive and negative PTLDs. Previous studies mainly implemented variance/regression analysis without considering unaccounted array-specific factors. The aim of this study is to investigate the gene expression differences between EBV positive and negative PTLDs through partial least squares (PLS) based analysis. We performed PLS-based analysis on a microarray data set from the Gene Expression Omnibus database and obtained 1188 differentially expressed genes. Pathway and Gene Ontology enrichment analysis identified significant over-representation of dysregulated genes in immune response and cancer-related biological processes. Network analysis identified three hub genes with degrees higher than 15: CREBBP, ATXN1, and PML. Proteins encoded by CREBBP and PML have previously been reported to interact with EBV. Our findings shed light on the expression differences between EBV positive and negative PTLDs, with the hope of offering theoretical support for future therapeutic studies.

Endpoint Detection Using Hybrid Algorithm of PLS and SVM (PLS와 SVM복합 알고리즘을 이용한 식각 종료점 검출)

  • Lee, Yun-Keun;Han, Yi-Seul;Hong, Sang-Jeen;Han, Seung-Soo
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.24 no.9 / pp.701-709 / 2011
  • In semiconductor wafer fabrication, etching, by which a material layer is selectively removed, is one of the most critical processes. Because mistakes caused by over-etching are difficult to correct, it is critical that etching be performed correctly. This paper proposes a new approach to etch endpoint detection for small open area wafers. The traditional endpoint detection technique uses a few manually selected wavelengths, which are adequate for large open areas. As integrated circuit devices continue to shrink in geometry and increase in device density, detecting the endpoint for small open areas presents a serious challenge to process engineers. In this work, a high-resolution optical emission spectroscopy (OES) sensor is used to provide the sensitivity needed to detect subtle endpoint signals. The partial least squares (PLS) method is used to analyze the OES data, reducing the dimension of the data and increasing the gap between classes. A support vector machine (SVM) is then applied to the PLS-transformed data to detect the endpoint by classifying the normal etching state and the post-endpoint state. Two OES data sets are used to train the PLS and SVM models, and the remaining data sets are used to test model performance. The results show that the trained PLS and SVM hybrid model detects the endpoint accurately.

Applications of Discrete Wavelet Analysis for Predicting Internal Quality of Cherry Tomatoes using VIS/NIR Spectroscopy

  • Kim, Ghiseok;Kim, Dae-Yong;Kim, Geon Hee;Cho, Byoung-Kwan
    • Journal of Biosystems Engineering / v.38 no.1 / pp.48-54 / 2013
  • Purpose: This study evaluated the feasibility of using a discrete wavelet transform (DWT) method as a preprocessing tool for visible/near-infrared spectroscopy (VIS/NIRS) with a spectroscopic transmittance dataset for predicting the internal quality of cherry tomatoes. Methods: VIS/NIRS was used to acquire transmittance spectrum data, to which a DWT was applied to generate new variables in the wavelet domain; these replaced the original spectral signal in the subsequent partial least squares (PLS) regression analysis and prediction modeling. The DWT concept and its importance are described, with emphasis on the properties that make the DWT a suitable transform for analyzing spectroscopic data. Results: Calibration and prediction models for the firmness, sugar content, and titratable acidity of cherry tomatoes built by applying the DWT before PLS regression gave better $R^2$ values and root mean squared errors (RMSEs) than PLS regression models built from raw data or from mean-normalized data. Conclusions: The developed DWT-incorporated PLS models, using the db5 wavelet base and selected approximation coefficients, indicate the feasibility of the DWT as a preprocessing tool, improving the prediction of firmness and titratable acidity for cherry tomatoes with respect to $R^2$ values and RMSEs.
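
The idea of replacing a spectrum with wavelet-domain approximation coefficients can be sketched as follows. This is an assumption-laden illustration, not the study's method: it uses the Haar wavelet because its filter is two taps long and easy to write out, whereas the abstract reports the db5 wavelet base, and the function names `haar_dwt` and `approximation_features` are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT; returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def approximation_features(signal, levels):
    """Apply the Haar DWT repeatedly and keep only the approximation coefficients.

    The signal length must be divisible by 2**levels.
    """
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_dwt(approx)  # detail coefficients are discarded
    return approx
```

Each decomposition level halves the number of variables, so the approximation coefficients give a smoothed, lower-dimensional input for a regression such as PLS.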

Rapid Prediction of Amylose Content of Polished Rice by Fourier Transform Near-Infrared Spectroscopy

  • Lee, Jin-Cheol;Yoon, Yeon-Hee;Kim, Sun-Min;Pyo, Byong-Sik;Hsieh, Fu-Hung;Kim, Hak-Jin;Eun, Jong-Bang
    • Food Science and Biotechnology / v.16 no.3 / pp.477-481 / 2007
  • Fourier transform near-infrared (FT-NIR) spectroscopy and partial least squares (PLS) regression were used to predict the amylose content of polished rice. Spectral reflectance data in a wavelength range of 1,000 to 2,500 nm were obtained with a commercial spectrophotometer for 60 different varieties of Korean rice. To compare this spectroscopic method with a standard chemical analysis, the amylose contents of the tested rice samples were determined by the iodine-blue colorimetric method. The highest correlation for rice amylose ($R^2=0.94$, standard error of prediction=0.20% amylose content) was obtained when using FT-NIR spectrum data pre-treated with normalization, the first derivative, smoothing, and scatter correction.
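
The $R^2$ and standard error of prediction quoted above compare predicted values with reference values from the chemical method. The sketch below shows one common chemometrics convention for these statistics and is only an assumption about how such figures are typically obtained: $R^2$ is taken as the squared Pearson correlation, SEP as the bias-corrected standard deviation of the prediction errors, and the function name `prediction_metrics` is hypothetical. The abstract does not state which definitions the authors used.

```python
import math


def prediction_metrics(reference, predicted):
    """Return (r_squared, bias, sep) for paired reference and predicted values.

    Assumes at least two samples and non-constant reference and predictions.
    """
    n = len(reference)
    errors = [p - r for p, r in zip(predicted, reference)]

    # Bias: mean prediction error.
    bias = sum(errors) / n
    # SEP: standard deviation of the errors around the bias.
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))

    # R^2 as the squared Pearson correlation between reference and predicted.
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    s_rp = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    s_rr = sum((r - mean_r) ** 2 for r in reference)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    r_squared = (s_rp * s_rp) / (s_rr * s_pp)

    return r_squared, bias, sep
```

Under these definitions a constant offset between predicted and reference values shows up in the bias but leaves both $R^2$ and SEP unchanged, which is why the three numbers are usually read together.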

Determinants of E-Government Assimilation in Indonesia: An Empirical Investigation Using a TOE Framework

  • Pudjianto, Boni;Zo, Hangjung;Ciganek, Andrew P.;Rho, Jae-Jeung
    • Asia Pacific Journal of Information Systems / v.21 no.1 / pp.49-80 / 2011
  • E-government needs to be successfully implemented and assimilated into organizations for them to take advantage of its potential value and benefits. This study examines factors for e-government assimilation in Indonesia and employs the TOE (Technology-Organization-Environment) framework to develop a theoretical model that explains e-government assimilation. It also investigates how organizational type (central vs. local) plays a role in the assimilation of e-government. One hundred eighteen respondents from the central and local governments in Indonesia participated in the survey, and an in-depth analysis based on partial least squares (PLS) was carried out. The results show that ICT infrastructure has the strongest significant relationship with e-government assimilation. Top management support, regulatory environment, ICT expertise, and competitive environment are also significant factors in explaining e-government assimilation in Indonesia. Central and local governments differ significantly in terms of e-government assimilation, so organizational type can act as a moderator in the process of e-government assimilation. These findings demonstrate the efficacy of the proposed model for analyzing e-government assimilation and contribute additional insights for academia as well as practitioners and policy makers.