• Title/Abstract/Keywords: partial least squares regression analysis

Search results: 105 items (processing time: 0.028 s)

A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 5 / pp. 1151-1160 / 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is a popular method for dimension reduction and has been shown in several studies to be useful for classifying microarray data. In this study, we suggest a modified version of partial least squares regression to analyze gene expression data with survival information. The method is designed as a new gene selection method using PLSR with an iterative procedure for imputing censored survival times. The mean square error of prediction criterion is used to determine the dimension of the model. To visualize the data, plots of variables superimposed with samples are used. The method is applied to two microarray data sets, both containing survival times. The results show that the proposed method works well for interpreting gene expression microarray data.
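
The model-dimension step described in this abstract can be sketched as follows. This is a minimal, illustrative example of textbook single-response PLS (NIPALS) with the number of components chosen by leave-one-out mean square error of prediction. It is not the authors' modified method and it omits the imputation of censored survival times. The function names (`pls1_fit`, `pls1_predict`, `loo_msep`) and the simulated data are assumptions made for illustration.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_comp):
    """Fit textbook PLS1 by NIPALS; returns (x_mean, y_mean, W, P, q)."""
    n, p = len(X), len(X[0])
    xm = [sum(row[j] for row in X) / n for j in range(p)]
    ym = sum(y) / n
    E = [[v - m for v, m in zip(row, xm)] for row in X]  # centered X, deflated below
    f = [v - ym for v in y]                              # centered y residual
    W, P, q = [], [], []
    for _ in range(n_comp):
        # Weight vector: direction in X with maximal covariance with the residual.
        w = [dot([row[j] for row in E], f) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:        # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                   # scores
        tt = dot(t, t)
        load = [dot([row[j] for row in E], t) / tt for j in range(p)]  # X loadings
        c = dot(f, t) / tt                               # y loading
        E = [[e - ti * lj for e, lj in zip(row, load)] for row, ti in zip(E, t)]
        f = [fi - c * ti for fi, ti in zip(f, t)]
        W.append(w); P.append(load); q.append(c)
    return xm, ym, W, P, q

def pls1_predict(model, x):
    """Predict one sample by repeating the training deflation on the new x."""
    xm, ym, W, P, q = model
    e = [v - m for v, m in zip(x, xm)]
    yhat = ym
    for w, load, c in zip(W, P, q):
        t = dot(e, w)
        yhat += c * t
        e = [ei - t * lj for ei, lj in zip(e, load)]
    return yhat

def loo_msep(X, y, n_comp):
    """Leave-one-out mean square error of prediction for a given dimension."""
    errs = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        errs.append((y[i] - pls1_predict(model, X[i])) ** 2)
    return sum(errs) / len(errs)

# Simulated data (hypothetical): 12 samples, 30 correlated "genes"
# driven by two latent factors, so variables outnumber samples.
random.seed(0)
n, p = 12, 30
latent = [[random.gauss(0, 1) for _ in range(2)] for _ in range(n)]
X = [[l[j % 2] + 0.1 * random.gauss(0, 1) for j in range(p)] for l in latent]
y = [2.0 * l[0] - 1.0 * l[1] + 0.1 * random.gauss(0, 1) for l in latent]

msep = {a: loo_msep(X, y, a) for a in range(1, 5)}
best = min(msep, key=msep.get)
print(msep, "-> chosen number of components:", best)
```

Prediction reuses the stored weights and loadings rather than forming an explicit coefficient vector, which avoids a matrix inversion and keeps the sketch dependency-free.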

A Method for Screening Product Design Variables for Building a Usability Model: Genetic Algorithm Approach

  • 양희철;한성호
    • Journal of the Ergonomics Society of Korea / Vol. 20, No. 1 / pp. 45-62 / 2001
  • This study suggests a genetic algorithm-based partial least squares (GA-based PLS) method to select the design variables for building a usability model. The GA-based PLS uses a genetic algorithm to minimize the root-mean-squared error of a partial least squares regression model. A multiple linear regression method is applied to build a usability model that contains the variables selected by the GA-based PLS. The performance of the usability model turned out to be generally better than that of previous usability models built with other variable selection methods such as expert rating, principal component analysis, cluster analysis, and partial least squares. Furthermore, the model performance was drastically improved by supplementing the usability model with the category-type variables selected by the GA-based PLS. It is recommended that the GA-based PLS be applied to variable selection when developing a usability model.

A new classification method using penalized partial least squares

  • 김윤대;전치혁;이혜선
    • Journal of the Korean Data and Information Science Society / Vol. 22, No. 5 / pp. 931-940 / 2011
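  • Classification analysis derives a classification rule from training samples and then applies it to new samples to assign them to a particular category. Various classification methods have been developed to match the complexity of the data, but accurate classification is not easy when the data dimension is high and the variables are highly correlated. In this study, we propose a classification method that is robust when the data dimension is relatively high and the variables are highly correlated. Partial least squares is a multivariate analysis technique for continuous data that is known to have high predictive power when the data are high-dimensional and the independent variables are highly correlated. We compare the performance of the classification method based on penalized partial least squares by applying it to real data and simulations.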

Milling tool wear forecast based on the partial least-squares regression analysis

  • Xu, Chuangwen;Chen, Hualing
    • Structural Engineering and Mechanics / Vol. 31, No. 1 / pp. 57-74 / 2009
  • Power signals from the spindle and feed motors carry rich physical information, and appropriate analysis of them can clearly identify the nature of tool wear. The partial least-squares regression (PLSR) method has been established as the tool wear analysis method for this purpose. First, the results of applying widely used techniques are given and the limitations of these prior methods are delineated. Second, the application of PLSR is proposed. Singular value theory is used for noise reduction. Based on grey relational degree analysis, the sample variables are filtered; both the filtered subset and the full set of sample variables are used as independent variables for modelling, with tool wear as the dependent variable, and the PLSR model is built by fitting experimental tool wear data from different milling processes. Finally, the predicted tool wear is compared with the actual value to test whether the tool wear model can adapt to new measurements of the independent variables. In a new, different cutting process, milling tool wear was predicted by PLSR, MLR (Multivariate Linear Regression), and BPNN (BP Neural Network) at the same time. Experimental results show that the methods can meet engineering needs and that PLSR is more suitable for monitoring tool wear.
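
The variable-filtering step mentioned in this abstract can be illustrated with a small grey relational grade calculation. This sketch assumes Deng's standard formulation with min-max normalization and a distinguishing coefficient of 0.5; the abstract does not say which variant the authors used. The feature names and all numbers are hypothetical.

```python
def min_max(seq):
    """Scale a sequence to [0, 1]; constant sequences are rejected."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        raise ValueError("constant sequence cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in seq]

def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate sequence against the reference.

    reference  : list of floats (here: measured tool wear)
    candidates : dict name -> list of floats (here: signal features)
    rho        : distinguishing coefficient, conventionally 0.5 (assumption)
    """
    ref = min_max(reference)
    deltas = {
        name: [abs(r - c) for r, c in zip(ref, min_max(seq))]
        for name, seq in candidates.items()
    }
    all_d = [d for ds in deltas.values() for d in ds]
    dmin, dmax = min(all_d), max(all_d)
    if dmax == 0:               # every candidate matches the reference exactly
        return {name: 1.0 for name in deltas}
    return {
        name: sum((dmin + rho * dmax) / (d + rho * dmax) for d in ds) / len(ds)
        for name, ds in deltas.items()
    }

# Hypothetical measurements over five cutting passes.
wear = [0.05, 0.09, 0.14, 0.22, 0.31]
features = {
    "spindle_power": [1.10, 1.18, 1.29, 1.45, 1.62],
    "feed_power":    [0.40, 0.43, 0.44, 0.52, 0.55],
    "ambient_temp":  [21.0, 20.5, 21.4, 20.8, 21.1],
}

grades = grey_relational_grades(wear, features)
for name, g in sorted(grades.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {g:.3f}")
```

Features with a higher grade track the reference sequence more closely, so a threshold on the grade (a design choice not specified in the abstract) could be used to decide which variables enter the regression.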

Pathway and Network Analysis in Glioma with the Partial Least Squares Method

  • Gu, Wen-Tao;Gu, Shi-Xin;Shou, Jia-Jun
    • Asian Pacific Journal of Cancer Prevention / Vol. 15, No. 7 / pp. 3145-3149 / 2014
  • Gene expression profiling facilitates the understanding of the biological characteristics of gliomas. Previous studies mainly used regression/variance analysis without considering various background biological and environmental factors. The aim of this study was to investigate gene expression differences between grade III and IV gliomas through partial least squares (PLS) based analysis. The expression data set was from the Gene Expression Omnibus database. PLS based analysis was performed with the R statistical software. A total of 1,378 differentially expressed genes were identified. Survival analysis identified four pathways, including Prion diseases, colorectal cancer, CAMs, and PI3K-Akt signaling, which may be related to the prognosis of the patients. Network analysis identified two hub genes, ELAVL1 and FN1, which have previously been reported to be related to glioma. Our results provide new understanding of glioma pathogenesis and prognosis with the hope of offering theoretical support for future therapeutic studies.

AI Technology Analysis using Partial Least Square Regression

  • Choi, JunHyeog;Jun, Sunghae
    • Journal of the Korea Society of Computer and Information / Vol. 25, No. 3 / pp. 109-115 / 2020
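  • In this paper, we propose an analysis of artificial intelligence (AI) technology using a partial least squares (PLS) regression model. AI technology now affects most areas of our society, so an accurate understanding of this technology is needed. To analyze AI technology, we collect AI-related patent documents from patent databases around the world and use text mining techniques to extract AI technology keywords from the collected documents. The extracted AI keyword data are then analyzed with a PLS regression model. The PLS regression model, which is used for advanced data analysis in various fields such as bioinformatics, social science, and engineering, is based on the partial least squares technique. To verify the performance of the proposed method, we carry out analysis experiments using AI patent documents and show how the proposed approach can be applied to real problems. The approach can be applied not only to AI technology but also to other technology fields.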

Partial Least Squares Analysis on Near-Infrared Absorbance Spectra by Air-dried Specific Gravity of Major Domestic Softwood Species

  • Yang, Sang-Yun;Park, Yonggun;Chung, Hyunwoo;Kim, Hyunbin;Park, Se-Yeong;Choi, In-Gyu;Kwon, Ohkyung;Cho, Kyu-Chae;Yeo, Hwanmyeong
    • Journal of the Korean Wood Science and Technology / Vol. 45, No. 4 / pp. 399-408 / 2017
  • Research on the rapid and accurate prediction of physical properties of wood using near-infrared (NIR) spectroscopy has attracted recent attention. In this study, partial least squares analysis was performed between NIR spectra and air-dried specific gravity of five domestic conifer species including larch (Larix kaempferi), Korean pine (Pinus koraiensis), red pine (Pinus densiflora), cedar (Cryptomeria japonica), and cypress (Chamaecyparis obtusa). Fifty different lumbers per species were purchased from the five National Forestry Cooperative Federations of Korea. The air-dried specific gravity of 100 knot- and defect-free specimens of each species was determined by NIR spectroscopy in the range of 680-2500 nm. Spectral data preprocessing, including standard normal variate, detrending, and forward first derivative (gap size = 8, smoothing = 8), was applied to all the NIR spectra of the specimens. Partial least squares analysis including cross-validation (five groups) was performed with the air-dried specific gravity and NIR spectra. When the performance of the regression model was expressed as $R^2$ (coefficient of determination) and root mean square error of calibration (RMSEC), $R^2$ and RMSEC were 0.63 and 0.027 for larch, 0.68 and 0.033 for Korean pine, 0.62 and 0.033 for red pine, 0.76 and 0.022 for cedar, and 0.79 and 0.027 for cypress, respectively. For the calibration model, which contained all species in this study, the $R^2$ was 0.75 and the RMSEC was 0.37.
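
The preprocessing and performance measures named in this abstract can be sketched as follows. The code shows only the standard normal variate (SNV) transform, $R^2$, and RMSEC under common textbook definitions; the abstract does not give the exact formulas the authors used (for example, some software divides the RMSEC sum of squares by a degrees-of-freedom term rather than by the number of samples). All data values and function names are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and sample standard deviation (computed across wavelengths)."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def rmsec(y, y_hat):
    """Root mean square error of calibration, dividing by n (assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

# Hypothetical absorbance values at a handful of wavelengths.
raw_spectrum = [0.42, 0.45, 0.51, 0.60, 0.58, 0.49]
print([round(v, 3) for v in snv(raw_spectrum)])

# Hypothetical measured vs. calibrated specific gravity for five specimens.
measured = [0.48, 0.52, 0.55, 0.50, 0.61]
fitted = [0.49, 0.51, 0.57, 0.52, 0.58]
print("R^2:", r_squared(measured, fitted), "RMSEC:", rmsec(measured, fitted))
```

SNV is applied per spectrum rather than per wavelength, which is why it removes multiplicative scatter differences between specimens before the regression is fitted.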

Use of partial least squares analysis in concrete technology

  • Tutmez, Bulent
    • Computers and Concrete / Vol. 13, No. 2 / pp. 173-185 / 2014
  • Multivariate analysis is a statistical technique that investigates the relationship between multiple predictor variables and a response variable, and it is a very commonly used statistical approach in the cement and concrete industry. During the model building stage, however, many predictor variables are included in the model and possible collinearity problems between these predictors are generally ignored. In this study, the use of partial least squares (PLS) analysis for evaluating the relationships among cement and concrete properties is investigated. This regression method is known to decrease model complexity by reducing the number of predictor variables as well as to yield accurate and reliable predictions. The experimental studies showed that the method can be used effectively in the multivariate problems of the cement and concrete industry.

Analysis of Carbonization Behavior of Hydrochar Produced by Hydrothermal Carbonization of Lignin and Development of a Prediction Model for Carbonization Degree Using Near-Infrared Spectroscopy

  • HWANG, Un Taek;BAE, Junsoo;LEE, Taekyeong;HWANG, Sung-Yun;KIM, Jong-Chan;PARK, Jinseok;CHOI, In-Gyu;KWAK, Hyo Won;HWANG, Sung-Wook;YEO, Hwanmyeong
    • Journal of the Korean Wood Science and Technology / Vol. 49, No. 3 / pp. 213-225 / 2021
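  • In this paper, we investigated the carbonization characteristics of lignin hydrochar produced by hydrothermal carbonization and built a model for predicting carbonization behavior using near-infrared spectroscopy and partial least squares regression. The carbon content of lignin hydrothermally carbonized at 200℃ was about 3 wt% higher than that of the untreated sample, and the carbon content tended to increase gradually as the heating time increased. Hydrothermal carbonization made the lignin more carbon-intensive and removed microparticles, giving it more homogeneous properties. The discrimination and prediction models based on near-infrared spectroscopy and partial least squares regression perfectly distinguished whether hydrothermal carbonization had been applied and predicted the carbon content of hydrothermally carbonized lignin with high accuracy. This study confirmed that a partial least squares regression model combined with near-infrared spectroscopy can quickly and non-destructively predict the carbonization characteristics of lignin hydrochar produced by hydrothermal carbonization.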

Investigation of Partial Least Squares (PLS) Calibration Performance based on Different Resolutions of Near Infrared Spectra

  • Chung, Hoe-Il;Choi, Seung-Yeol;Choo, Jae-Bum;Lee, Young-Il
    • Bulletin of the Korean Chemical Society / Vol. 25, No. 5 / pp. 647-651 / 2004
  • Partial Least Squares (PLS) calibration performance has been systematically investigated by changing the spectral resolution of near-infrared (NIR) spectra. For this purpose, synthetic samples simulating naphtha were prepared to examine the calibration performance in a complex chemical matrix. These samples were composed of $C_6-C_9$ normal paraffin, iso-paraffin, naphthene, and aromatic hydrocarbons. NIR spectra with four different resolutions of 4, 8, 16, and 32 $cm^{-1}$ were collected and then PLS regression was performed. For PLS calibration, five different group compositions (such as total paraffin content) and six different pure components (such as benzene concentration) were selected. The overall results showed that at least 8 $cm^{-1}$ resolution was required to resolve a complex chemical matrix such as naphtha. It was found that the influence of resolution on the PLS calibration varied with the spectral features of a component.