• Title/Summary/Keyword: Partial Least Square Method

A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.5
    • /
    • pp.1151-1160
    • /
    • 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples, and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is a popular method for dimension reduction, and several studies have found it useful for classifying microarray data. In this study, we suggest a modified version of partial least squares regression for analyzing gene expression data with survival information. The method is designed as a new gene selection method that uses PLSR with an iterative procedure for imputing censored survival times. The mean squared error of prediction criterion is used to determine the dimension of the model. To visualize the data, plots of variables superimposed with samples are used. The method is applied to two microarray data sets, both containing survival times. The results show that the proposed method works well for interpreting gene expression microarray data.
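
As an illustration of choosing the model dimension by a mean squared error of prediction criterion, the following is a minimal sketch of single-response PLS (the textbook NIPALS PLS1 recursion) with leave-one-out cross-validation. It is not the authors' modified method: the imputation of censored survival times is not included, and the function names and the toy data are assumptions made for this example.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns centering constants and per-component (w, p, q)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered predictors
    f = [v - y_mean for v in y]                                # centered response
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    """Predict one sample by replaying the deflation steps on the new x."""
    x_mean, y_mean, comps = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat

def loo_msep(X, y, n_components):
    """Leave-one-out mean squared error of prediction."""
    errs = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        errs.append((pls1_predict(model, X[i]) - y[i]) ** 2)
    return sum(errs) / len(errs)

# Toy data (hypothetical): y = 1 + 2*x1 - x2, so two components fit exactly.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0], [6.0, 4.0]]
y = [1.0 + 2.0 * a - b for a, b in X]
best = min(range(1, 3), key=lambda a: loo_msep(X, y, a))
```

In real microarray data the number of candidate components is much larger and the criterion is usually inspected as a curve rather than minimized blindly.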

Chlorophyll-a Forecasting using PLS Based c-Fuzzy Model Tree (PLS기반 c-퍼지 모델트리를 이용한 클로로필-a 농도 예측)

  • Lee, Dae-Jong;Park, Sang-Young;Jung, Nahm-Chung;Lee, Hye-Keun;Park, Jin-Il;Chun, Meung-Geun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.16 no.6
    • /
    • pp.777-784
    • /
    • 2006
  • This paper proposes a c-fuzzy model tree that uses the partial least squares method to predict the chlorophyll-a concentration in each zone. First, cluster centers are calculated by a fuzzy clustering method using all input and output attributes. Each internal node is then produced according to the fuzzy membership values between the centers and the input attributes. Linear models are constructed by the partial least squares method from the input-output pairs remaining in each internal node. Whether an internal node is expanded is determined by comparing the error calculated in the parent node with the errors in its child nodes. Prediction, in turn, is performed with the linear model having the highest fuzzy membership value between the input attributes and the cluster centers in the leaf nodes. To show the effectiveness of the proposed method, we applied it to a water quality data set measured at several stations. Across various experiments, the proposed method performs better than the conventional least squares based model tree method.
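
The prediction step described above (pick the leaf whose cluster center has the highest fuzzy membership, then apply that leaf's linear model) can be sketched as follows. The membership formula is the standard fuzzy c-means one with fuzzifier `m`; the paper may use a different variant. The leaf layout (`center`, `intercept`, `coef`) and all numbers are hypothetical.

```python
import math

def fuzzy_memberships(x, centers, m=2.0):
    """Standard fuzzy c-means membership of point x in each cluster center."""
    dists = [math.dist(x, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # x coincides with one or more centers: split full membership among them.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

def predict_with_best_leaf(x, leaves, m=2.0):
    """Use the linear model of the leaf with the highest membership for x."""
    u = fuzzy_memberships(x, [leaf["center"] for leaf in leaves], m)
    best = max(range(len(leaves)), key=lambda i: u[i])
    leaf = leaves[best]
    return leaf["intercept"] + sum(b * v for b, v in zip(leaf["coef"], x))

# Hypothetical leaves; in the paper the coefficients would come from PLS fits.
leaves = [
    {"center": (1.0, 0.0), "intercept": 1.0, "coef": [2.0, 0.0]},
    {"center": (3.0, 0.0), "intercept": 5.0, "coef": [0.0, 1.0]},
]
u = fuzzy_memberships((0.0, 0.0), [leaf["center"] for leaf in leaves])
y_hat = predict_with_best_leaf((0.0, 0.0), leaves)
```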

A Development of Statistical Model for Pavement Response Model (도로포장 반응모형에 대한 통계모형 개발)

  • Lee, Moon Sup;Park, Hee Mun;Kim, Boo Il;Heo, Tae-Young
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.17 no.5
    • /
    • pp.89-96
    • /
    • 2012
  • The falling weight deflectometer (FWD) has been widely used to evaluate the structural adequacy of pavement structures. The deflections measured by the FWD can be used to estimate the stiffness of pavement layers and the pavement responses within the pavement structure. The objective of this paper is to develop a pavement response model using a partial least squares regression technique based on FWD deflection data. Partial least squares regression makes it possible to address the multicollinearity problem that arises in multiple regression models. It is also found that the pavement response model can be developed from the raw data when partial least squares regression is used.

FCM for the Multi-Scale Problems (고속 최소자승 점별계산법을 이용한 멀티 스케일 문제의 해석)

  • 김도완;김용식
    • Proceedings of the Computational Structural Engineering Institute Conference
    • /
    • 2002.10a
    • /
    • pp.599-603
    • /
    • 2002
  • We propose a new meshfree method called the fast moving least square reproducing kernel collocation method (FCM). This methodology is composed of the fast moving least square reproducing kernel (FMLSRK) approximation and the point collocation scheme. Using point collocation makes the method truly meshfree. In this paper, FCM is shown to be a good method, at least for computing numerical solutions of second-order elliptic partial differential equations with geometric singularities or geometric multi-scales. To treat such problems, we use the concept of a variable dilation parameter.

A Study on Measurement of Blood Pressure by Partial Least Square Method (부분최소자승법을 이용한 혈압 측정에 관한 연구)

  • Kim, Yong-Joo;Nam, Eun-Hye;Choi, Chang-Hyun;Kim, Jong-Deok
    • Journal of Biosystems Engineering
    • /
    • v.33 no.6
    • /
    • pp.438-445
    • /
    • 2008
  • The purpose of this study was to develop a measurement model for blood pressure based on the partial least squares (PLS) method. The measurement system for blood pressure signals consisted of a pressure sensor, va interface, and an embedded module. A mercury sphygmomanometer was connected to the measurement system through a 3-way stopcock and used as the reference for blood pressure. The blood pressure signals of 20 subjects were measured, and the tests were repeated 5 times for each subject. The total of 100 measurements was divided into a calibration set and a prediction set. PLS models were developed to determine the systolic and diastolic blood pressures. The PLS models were evaluated by the standard methods of the British Hypertension Society (BHS) protocol and the Association for the Advancement of Medical Instrumentation (AAMI). The results of the PLS models were compared with those of the maximum amplitude algorithm (MAA). The blood pressures measured with the PLS method were highly correlated with those measured with a mercury sphygmomanometer for both the systolic ($R^2=0.85$) and the diastolic blood pressure ($R^2=0.84$). The results showed that the PLS models were effective tools for measuring blood pressure with high accuracy and satisfied the standards of the BHS protocol and the AAMI.
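
To make the evaluation step concrete, the sketch below summarizes device-minus-reference differences in the way the two standards are commonly described: AAMI as a mean difference within ±5 mmHg with a standard deviation of at most 8 mmHg, and BHS as grades based on the percentage of differences within 5, 10, and 15 mmHg. The thresholds are the commonly cited ones and should be checked against the standards themselves; the function names and the readings are hypothetical and are not the study's data.

```python
import statistics

def agreement_summary(device, reference):
    """Mean and SD of differences, plus % of |difference| within 5/10/15 mmHg."""
    diffs = [d - r for d, r in zip(device, reference)]
    n = len(diffs)
    within = {lim: 100.0 * sum(abs(e) <= lim for e in diffs) / n
              for lim in (5, 10, 15)}
    return statistics.mean(diffs), statistics.stdev(diffs), within

def meets_aami(mean_diff, sd_diff):
    """Commonly cited AAMI criterion (assumed here): |mean| <= 5, SD <= 8 mmHg."""
    return abs(mean_diff) <= 5.0 and sd_diff <= 8.0

def bhs_grade(within):
    """Commonly cited BHS grading table (assumed here); 'D' if worse than C."""
    grades = (("A", (60, 85, 95)), ("B", (50, 75, 90)), ("C", (40, 65, 85)))
    for grade, (p5, p10, p15) in grades:
        if within[5] >= p5 and within[10] >= p10 and within[15] >= p15:
            return grade
    return "D"

# Hypothetical systolic readings in mmHg (model estimate vs. mercury reference).
device = [120.0, 118.0, 131.0, 125.0, 140.0]
reference = [118.0, 121.0, 130.0, 127.0, 128.0]
mean_diff, sd_diff, within = agreement_summary(device, reference)
```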

A Method for Screening Product Design Variables for Building A Usability Model : Genetic Algorithm Approach (사용편의성 모델수립을 위한 제품 설계 변수의 선별방법 : 유전자 알고리즘 접근방법)

  • Yang, Hui-Cheol;Han, Seong-Ho
    • Journal of the Ergonomics Society of Korea
    • /
    • v.20 no.1
    • /
    • pp.45-62
    • /
    • 2001
  • This study suggests a genetic algorithm-based partial least squares (GA-based PLS) method to select the design variables for building a usability model. The GA-based PLS uses a genetic algorithm to minimize the root-mean-squared error of a partial least squares regression model. A multiple linear regression method is applied to build a usability model that contains the variables selected by the GA-based PLS. The performance of the usability model turned out to be generally better than that of previous usability models using other variable selection methods such as expert rating, principal component analysis, cluster analysis, and partial least squares. Furthermore, the model performance was drastically improved by supplementing the usability model with the category-type variables selected by the GA-based PLS. It is recommended that the GA-based PLS be applied to variable selection when developing a usability model.

The Development of a Fault Diagnosis Model based on the Parameter Estimations of Partial Least Square Models (부분최소제곱법 모델의 파라미터 추정을 이용한 화학공정의 이상진단 모델 개발)

  • Lee, Kwang Oh;Lee, Chang Jun
    • Journal of the Korean Society of Safety
    • /
    • v.34 no.4
    • /
    • pp.59-67
    • /
    • 2019
  • Since it is very hard to construct process models from prior process knowledge, various statistical approaches have been employed to build fault diagnosis models. However, the crucial drawback of these approaches is that the solutions may vary with the fault magnitude, even when the same fault occurs. In this study, a parameter monitoring approach is suggested. When a fault occurs in a chemical process, it triggers a change in the process model, so monitoring the parameters of process models can provide an efficient fault diagnosis model. A few important variables are selected, and their predictive models are constructed by the partial least squares (PLS) method. The Euclidean norms of the parameters of the PLS models are estimated, and fault diagnosis is performed by comparing them with the parameters of PLS models built under normal operating conditions. To improve the monitoring performance, a cumulative sum (CUSUM) control chart is employed, and the changes in model parameters are recorded to identify the type of an unknown fault. To verify the efficacy of the proposed model, it is tested on the Tennessee Eastman (TE) process; the model can be easily applied to other complex processes.
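
The monitoring idea can be sketched in two steps: measure how far the current model parameters are from the normal-operation parameters with a Euclidean norm, then feed that distance into a one-sided tabular CUSUM. This is a generic illustration, not the authors' implementation; the `target`, `slack`, and `threshold` values and the distance series are hypothetical.

```python
import math

def parameter_distance(params, normal_params):
    """Euclidean norm of the difference between two parameter vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(params, normal_params)))

def cusum_alarm(values, target, slack, threshold):
    """One-sided upper CUSUM: return the first index where the sum exceeds
    the threshold, or None if no alarm is raised."""
    s = 0.0
    for i, v in enumerate(values):
        # Accumulate only the excess over target + slack; never go below zero.
        s = max(0.0, s + (v - target - slack))
        if s > threshold:
            return i
    return None

# Hypothetical distances: normal operation first, then a sustained shift.
distances = [0.1, 0.2, 0.1, 0.9, 1.1, 1.0]
alarm_at = cusum_alarm(distances, target=0.1, slack=0.1, threshold=1.5)
```

Because the statistic accumulates small persistent deviations, a CUSUM chart can flag a drift that a single-sample threshold would miss.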

Nondestructive Quantification of Intact Ambroxol Tablet using Near-infrared Spectroscopy (근적외분광분석법을 사용한 암브록솔 정제의 비파괴적 정량분석)

  • 임현량;우영아;김도형;김효진;강신정;최현철;최한곤
    • YAKHAK HOEJI
    • /
    • v.48 no.1
    • /
    • pp.60-64
    • /
    • 2004
  • Near-infrared (NIR) spectroscopy was used to determine rapidly and nondestructively the content of ambroxol in intact ambroxol tablets containing 30 mg (12.5% m/m nominal concentration) by collecting NIR spectra in the range 1100-1750 nm. The laboratory-made samples had 10.3 to 15.9% m/m nominal ambroxol concentration. The measurements were made by reflection using a fiber-optic probe, and calibration was carried out by partial least squares regression (PLSR) with autoscaling. Model validation was performed by randomly splitting the data set into a calibration set of 7 samples and a validation set of 5 samples. The developed NIR method gave results comparable to the known values of tablets from a laboratory manufacturing process, with the standard error of calibration (SEC) and the standard error of prediction (SEP) being 0.49% and 0.49% m/m, respectively. The method showed good accuracy and repeatability. NIR spectroscopic determination in intact tablets opens the potential for real-time monitoring of a running production process.
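
Two of the preprocessing and reporting steps mentioned here are easy to show in code. Autoscaling means centering each variable on its mean and dividing by its standard deviation, with the statistics taken from the calibration set and reused for the validation set. The standard error below is computed as a root mean squared error; some chemometrics texts instead use a bias-corrected or degrees-of-freedom-adjusted form, and the abstract does not say which was used. Function names and numbers are hypothetical.

```python
import math
import statistics

def autoscale_fit(X):
    """Column means and sample standard deviations of the calibration set."""
    cols = list(zip(*X))
    return [statistics.mean(c) for c in cols], [statistics.stdev(c) for c in cols]

def autoscale_apply(X, means, sds):
    """Center and scale rows using statistics from the calibration set."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]

def standard_error(y_true, y_pred):
    """Root mean squared difference between known and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical calibration matrix (rows = samples, columns = wavelengths).
X_cal = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means, sds = autoscale_fit(X_cal)
X_scaled = autoscale_apply(X_cal, means, sds)
sep = standard_error([12.0, 13.0], [12.5, 12.5])
```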

Multivariate Procedure for Variable Selection and Classification of High Dimensional Heterogeneous Data

  • Mehmood, Tahir;Rasheed, Zahid
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.6
    • /
    • pp.575-587
    • /
    • 2015
  • Developments in data collection techniques result in high dimensional data sets, where discrimination is an important and commonly encountered problem that is crucial to resolve when the high dimensional data are heterogeneous (no common variance-covariance structure across classes). An example is the classification of microbial habitat preferences based on codon/bi-codon usage. Habitat preference is important to study for evolutionary genetic relationships and may help industry produce specific enzymes. Most classification procedures assume homogeneity (a common variance-covariance structure for all classes), which is not guaranteed in most high dimensional data sets. We introduce regularized elimination in partial least squares coupled with QDA (rePLS-QDA) for parsimonious variable selection and classification of high dimensional heterogeneous data sets, based on the recently introduced regularized elimination for variable selection in partial least squares (rePLS) and the heterogeneous classification procedure quadratic discriminant analysis (QDA). The proposed and existing methods are compared on a simulated data set; in addition, the proposed procedure is used to classify microbial habitat preferences by their codon/bi-codon usage. Five bacterial habitats (Aquatic, Host Associated, Multiple, Specialized and Terrestrial) are modeled. The classification accuracy for each habitat is satisfactory and ranges from 89.1% to 100% on test data. Interesting codon/bi-codon usages and their mutual interactions that influence the respective habitat preferences are identified. The proposed method also produced results that concur with known biological characteristics, which will help researchers better understand the divergence of species.
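
For readers unfamiliar with the distinction between homogeneous and heterogeneous classification, the textbook QDA rule is shown below. The notation ($\mu_k$, $\Sigma_k$, $\pi_k$ for the mean, covariance matrix, and prior probability of class $k$) is standard and not taken from the paper, and in rePLS-QDA the input $x$ would presumably be the selected PLS-based features rather than the raw variables.

$$
\delta_k(x) = -\tfrac{1}{2}\log\lvert\Sigma_k\rvert - \tfrac{1}{2}(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k) + \log\pi_k,
\qquad
\hat{k}(x) = \arg\max_k \delta_k(x)
$$

When every class shares one covariance matrix, $\Sigma_k = \Sigma$, the terms that are quadratic in $x$ are identical across classes and cancel in the comparison, leaving the linear rule of LDA. Allowing class-specific $\Sigma_k$ is what makes QDA suitable for heterogeneous data.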

Discrimination of Alismatis Rhizoma According to Geographical Origins using Near Infrared Spectroscopy (근적외선분광법을 이용한 택사의 산지 판별법 연구)

  • Lee, Dong Young;Kim, Seung Hyun;Kim, Hyo Jin;Sung, Sang Hyun
    • Korean Journal of Pharmacognosy
    • /
    • v.44 no.4
    • /
    • pp.344-349
    • /
    • 2013
  • Near infrared spectroscopy (NIRS) combined with multivariate analysis was used to discriminate the geographical origin of Alisma orientale from Korea (n=94) and China (n=72). Two-thirds of the samples were selected randomly for the training set and one-third for the test set. The second derivative was used to pretreat the NIR spectra. Partial least squares discriminant analysis (PLS-DA) models correctly discriminated 100% of the Korean and Chinese A. orientale samples. These results demonstrate the potential of NIR spectroscopy combined with multivariate analysis as a rapid and accurate method for discriminating A. orientale by geographical origin.
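
The two data-handling steps named in this abstract can be sketched briefly. The second derivative is shown as a plain central finite difference; NIR software more often uses a smoothed derivative (for example Savitzky-Golay), and the abstract does not say which was applied. The split helper, its seed, and the sample labels are hypothetical.

```python
import random

def second_derivative(spectrum, step=1.0):
    """Central finite-difference second derivative (drops the two end points)."""
    return [
        (spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]) / step ** 2
        for i in range(1, len(spectrum) - 1)
    ]

def split_two_thirds(samples, seed=0):
    """Randomly assign two-thirds of the samples to training, the rest to test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]

# A quadratic "spectrum" has a constant second derivative of 2.
d2 = second_derivative([0.0, 1.0, 4.0, 9.0, 16.0])
train, test = split_two_thirds([f"sample_{i}" for i in range(9)])
```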