• Title/Abstract/Keywords: modified partial least squares


A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 5 / pp.1151-1160 / 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is a popular method for dimension reduction, and several studies have found it useful for classifying microarray data. In this study, we suggest a modified version of partial least squares regression to analyze gene expression data with survival information. The method is designed as a new gene selection method using PLSR with an iterative procedure for imputing censored survival times. The mean square error of prediction criterion is used to determine the dimension of the model. To visualize the data, plots of variables superimposed with samples are used. The method is applied to two microarray data sets, both containing survival time. The results show that the proposed method works well for interpreting gene expression microarray data.
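
The abstract above builds on ordinary PLSR and chooses the model dimension by mean square error of prediction. As a rough illustration of those two ingredients only, here is a minimal single-response PLS (PLS1) sketch using the NIPALS recursion with a leave-one-out MSEP. It does not implement the authors' censored-time imputation or gene selection steps, which the abstract does not specify; the function names `pls1_fit`, `pls1_predict`, and `loo_msep` are hypothetical.

```python
# Minimal PLS1 (one response) via NIPALS, standard library only.
# Illustrative sketch; NOT the modified method described in the abstract.

def pls1_fit(X, y, n_components):
    """Fit PLS1. X is a list of rows (samples), y a list of responses."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [yi - y_mean for yi in y]                              # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # residual y is already uncorrelated with residual X
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}


def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps on the new row."""
    e = [xj - mj for xj, mj in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in model["components"]:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


def loo_msep(X, y, n_components):
    """Leave-one-out mean square error of prediction for a given dimension."""
    n = len(X)
    total = 0.0
    for i in range(n):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        total += (y[i] - pls1_predict(model, X[i])) ** 2
    return total / n
```

One plausible way to use this is to evaluate `loo_msep` for 1, 2, 3, ... components and keep the dimension with the smallest value.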

Hybrid Linear Analysis Based on the Net Analyte Signal in Spectral Response with Orthogonal Signal Correction

  • Park, Kwang-Su;Jun, Chi-Hyuck
    • Near Infrared Analysis / Vol. 1, No. 2 / pp.1-8 / 2000
  • Using the net analyte signal, hybrid linear analysis was proposed to predict chemical concentration. In this paper, we select a sample from the training set and apply orthogonal signal correction to obtain an improved pseudo unit spectrum for hybrid linear analysis. Using the mean spectrum of a calibration training set, we first show that calibration by hybrid linear analysis is effective for predicting not only chemical concentrations but also physical property variables. Then, a pseudo unit spectrum from a training set is also tested with and without orthogonal signal correction. We use two data sets, one including five chemical concentrations and the other including ten physical property variables, to compare the performance of partial least squares and modified hybrid linear analysis calibration methods. The results show that hybrid linear analysis with a selected training spectrum, instead of a well-measured pure spectrum, still performs well and is slightly better than partial least squares.
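
For readers unfamiliar with the net analyte signal mentioned above: it is commonly defined as the part of an analyte spectrum that is orthogonal to the space spanned by the interfering spectra. The sketch below computes that projection with Gram-Schmidt orthogonalization. It illustrates the general definition only, not the hybrid linear analysis or orthogonal signal correction procedure of the paper; `net_analyte_signal` and its arguments are hypothetical names.

```python
# Net analyte signal as an orthogonal projection, standard library only.
# Assumption: spectra are equal-length lists of absorbance values.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def net_analyte_signal(analyte, interferents, tol=1e-12):
    """Remove from `analyte` everything lying in the span of `interferents`."""
    basis = []  # orthonormal basis of the interferent space
    for spectrum in interferents:
        v = list(spectrum)
        for q in basis:
            c = dot(v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip spectra that add no new direction
            basis.append([vi / norm for vi in v])
    nas = list(analyte)
    for q in basis:
        c = dot(nas, q)
        nas = [ni - c * qi for ni, qi in zip(nas, q)]
    return nas
```

The returned vector is orthogonal to every interferent spectrum, so its inner product with a mixture spectrum responds only to the analyte contribution under a linear mixing assumption.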

Modified partial least squares method implementing mixed-effect model

  • Kyunga Kim;Shin-Jae Lee;Soo-Heang Eo;HyungJun Cho;Jae Won Lee
    • Communications for Statistical Applications and Methods / Vol. 30, No. 1 / pp.65-73 / 2023
  • Contemporary biomedical data often involve an ill-posed problem owing to small sample size and a large number of multi-collinear variables. The partial least squares (PLS) method can be a plausible alternative to ill-conditioned ordinary least squares. However, for a PLS model that includes a random effect, how to deal with random or mixed effects remains an open question worth further investigation. In the present study, we propose a modified multivariate PLS method implementing a mixed-effect model (PLSM). The advantage of PLSM is its versatility in handling serial longitudinal data and its ability to take a random effect into account. We conduct simulations to investigate statistical properties of PLSM, and showcase its real clinical application to predict treatment outcome of esthetic surgical procedures of human faces. The proposed PLSM seemed to be particularly beneficial when 1) the random effect is conspicuous; 2) the number of predictors is relatively large compared to the sample size; 3) the multicollinearity is weak or moderate; and/or 4) the random error is considerable.

Determination of Protein Content in Pea by Near Infrared Spectroscopy

  • Lee, Jin-Hwan;Choung, Myoung-Gun
    • Food Science and Biotechnology / Vol. 18, No. 1 / pp.60-65 / 2009
  • Near infrared reflectance spectroscopy (NIRS) was used as a rapid and non-destructive method to determine the protein content in intact and ground seeds of pea (Pisum sativum L.) germplasms grown in Korea. A total of 115 samples were scanned in the reflectance mode of a scanning monochromator in intact seed and flour condition, and the reference values for the protein content were measured by an auto-Kjeldahl system. In the developed ground and intact NIRS equations for analysis of protein, the most accurate equations were obtained with the 2, 8, 6, 1 math treatment, standard normal variate and detrend scatter correction, and the entire spectrum (400-2,500 nm), using modified partial least squares regression (n=78). External validation (n=34) of these NIRS equations showed significant correlation between reference values and NIRS estimated values based on the standard error of prediction (SEP), $R^2$, and the ratio of standard deviation of reference data to SEP. Therefore, these ground and intact NIRS equations are applicable and reliable for determination of protein content in pea seeds, and the non-destructive NIRS method could be used as a mass analysis technique for selection of high protein pea in breeding programs and for quality control in the food industry.
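
The validation statistics named in this abstract (SEP, $R^2$, and the ratio of the standard deviation of reference data to SEP, often abbreviated RPD) can be computed from paired reference and predicted values. The sketch below assumes common conventions that the abstract does not state: SEP as the bias-corrected standard deviation of the residuals with an n-1 denominator, and $R^2$ as the squared Pearson correlation. The function name `validation_stats` is hypothetical.

```python
import math

# Validation statistics for a calibration equation, standard library only.
# Assumed conventions (not confirmed by the abstract):
#   SEP = sqrt(sum((e_i - bias)^2) / (n - 1)), with e_i = predicted - reference
#   R^2 = squared Pearson correlation between reference and predicted values
#   RPD = SD(reference) / SEP

def validation_stats(reference, predicted):
    n = len(reference)
    residuals = [p - r for r, p in zip(reference, predicted)]
    bias = sum(residuals) / n
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))

    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    ss_ref = sum((r - mean_ref) ** 2 for r in reference)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cross = sum((r - mean_ref) * (p - mean_pred)
                for r, p in zip(reference, predicted))
    r2 = cross ** 2 / (ss_ref * ss_pred)

    sd_ref = math.sqrt(ss_ref / (n - 1))
    # Note: raises ZeroDivisionError if SEP is exactly zero (perfect fit up to bias).
    return {"bias": bias, "sep": sep, "r2": r2, "rpd": sd_ref / sep}
```

For example, `validation_stats([20.0, 22.0, 24.0, 26.0], [21.0, 21.0, 25.0, 25.0])` gives a bias of 0 and an $R^2$ of 0.8 under these conventions.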

Prediction of Chemical Compositions for On-line Quality Measurement of Red Pepper Powder Using Near Infrared Reflectance Spectroscopy (NIRS)

  • Lee, Sun-Mee;Kim, Su-Na;Park, Jae-Bok;Hwang, In-Kyeong
    • Food Science and Biotechnology / Vol. 14, No. 2 / pp.280-285 / 2005
  • The applicability of near infrared reflectance spectroscopy (NIRS) was examined for quality control of red pepper powder in milling factories. Prediction of chemical composition was performed using modified partial least squares (MPLS) techniques. Analysis of a total of 51 and 21 red pepper powder samples by conventional methods for calibration and validation, respectively, revealed that the standard error of prediction (SEP) and correlation coefficient ($R^2$) of moisture content, ASTA color value, capsaicinoid content, and total sugar content were 0.55 and 0.90, 8.58 and 0.96, 31.60 and 0.65, and 1.82 and 0.86, respectively; SEP and $R^2$ were low and high, respectively, except for capsaicinoid content. The results indicate that, with slight improvement, on-line quality measurement of red pepper powder with NIRS could be applied in red pepper milling factories.

Quantification of Tocopherol and Tocotrienol Content in Rice Bran by Near Infrared Reflectance Spectroscopy

  • 김용호;강창성;이영상
    • 한국작물학회지 / Vol. 49, No. 3 / pp.211-215 / 2004
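  • To estimate the tocopherol and tocotrienol content of rice bran rapidly and non-destructively, an analytical method using NIRS (near infrared reflectance spectroscopy) was examined. Using rice bran from 80 rice germplasm lines, the tocopherol and tocotrienol contents determined by HPLC were applied to the NIRS spectra to develop calibration equations. A comparison of several calibration approaches showed that using the second-derivative spectra in a regression based on MPLS (Modified Partial Least Squares) was the most suitable. The correlation coefficients between the HPLC-determined contents of the germplasm and the NIRS calibration equations were 0.992 and 0.953 for tocopherol and tocotrienol, respectively. These calibration equations also showed high correlations of 0.846 and 0.956 on the validation file, so it was concluded that tocopherol and tocotrienol contents can be analyzed rapidly in rice bran using NIRS.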

Quantification of Icariin Contents in Epimedium koreanum N. by Using a Near Infrared Reflectance Spectroscopy

  • 김용호;최병열;백흠영;이영상
    • 한국약용작물학회지 / Vol. 10, No. 5 / pp.340-343 / 2002
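  • To rapidly estimate the icariin content of Epimedium koreanum, an analytical method using NIRS (near infrared reflectance spectroscopy) was examined. Icariin contents of 150 Epimedium koreanum germplasm lines, determined by HPLC, were applied to the NIRS spectra, and the samples were divided into a calibration set of 42 and a validation set of 26. A comparison of several calibration approaches showed that using the second-derivative spectra in a regression based on MPLS (Modified Partial Least Squares) was the most suitable. The icariin content of the germplasm determined by HPLC averaged 0.424% (range 0.12 to 0.67%), and the correlation coefficient with the NIRS calibration equation was 0.951. It was therefore concluded that the icariin content of Epimedium koreanum can be analyzed rapidly and conveniently using NIRS.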

Discrimination of Korean Domestic and Foreign Soybeans using Near Infrared Reflectance Spectroscopy

  • 안형균;김용호
    • 한국작물학회지 / Vol. 57, No. 3 / pp.296-300 / 2012
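  • This experiment was carried out to achieve faster and more accurate identification by introducing NIRS to the discrimination of Korean domestic and imported soybeans. Spectra of soybean powder were measured over the 400 to 2,500 nm range using NIRS, and the measured spectra were subjected to math treatment and regression analysis using the WINISI II program. For developing the calibration equation, the most suitable math treatment was the first derivative with a 4 nm gap, and modified partial least squares (MPLS) regression performed best. In the MPLS regression, the loading value for origin discrimination was set to '100' for domestic soybeans and '1' for imported soybeans. When the calibration equation was developed and its suitability verified, the calibration equation derived with 10 factors showed a correlation of 0.98 and a cross-validation correlation of 0.94, indicating a high degree of correlation. It was therefore concluded that discrimination of domestic and imported soybeans using NIRS is feasible.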
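
Several abstracts in this list report that first- or second-derivative spectra over a gap worked best as a math treatment. The sketch below shows one simple form of a gap derivative, a finite difference between points a fixed number of data points apart. It is an assumption for illustration and is not claimed to match the exact algorithm in the WINISI II software; `gap_derivative`, `gap_points`, and `step_nm` are hypothetical names.

```python
# Simple gap (finite-difference) derivative of a spectrum, standard library only.
# Assumption: the spectrum is sampled at a constant wavelength step `step_nm`.

def gap_derivative(spectrum, gap_points, step_nm=1.0):
    """Return (x[i + gap] - x[i]) / (gap * step) for every valid index i."""
    if gap_points < 1 or gap_points >= len(spectrum):
        raise ValueError("gap_points must be between 1 and len(spectrum) - 1")
    width = gap_points * step_nm
    return [
        (spectrum[i + gap_points] - spectrum[i]) / width
        for i in range(len(spectrum) - gap_points)
    ]
```

Applying the function twice yields a second derivative in the same sense. With a 2 nm sampling step, a 4 nm gap would correspond to `gap_points=2`; the sampling step of the instrument in the study is not given in the abstract.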

A Study on the Structural Relationship of Perceived Value, Price Sensitivity, and Satisfaction between Brand Image and Purchase Intention in Overseas Direct Purchase

  • 정분도;김지훈
    • 무역학회지 / Vol. 44, No. 6 / pp.169-185 / 2019
  • The purpose of this study is to analyze the structural relationships of perceived value, price sensitivity, and satisfaction between brand image and purchase intention of consumers who have experience of overseas direct purchase. Questionnaires were collected to analyze these structural relationships. Using the R package plspm, we analyzed the PLS (partial least squares) structural equation model. In order to examine the relationship between perceived value and price sensitivity, the research model was modified and analyzed. As a result, the research hypotheses were adopted, the goodness of fit was higher than before the research model was modified, and the relationship between perceived value and price sensitivity was further verified. The modified research model has higher academic value, so it should be selected as the final proposed model.

RAPID PREDICTION OF ENERGY CONTENT IN CEREAL FOOD PRODUCTS WITH NIRS.

  • Kays, Sandra E.;Barton, Franklin E.
    • 한국근적외분광분석학회: Conference Proceedings / 한국근적외분광분석학회 2001, NIR-2001 / pp.1511-1511 / 2001
  • Energy content, expressed as calories per gram, is an important part of the evaluation and marketing of foods in developed countries. Currently accepted methods of measurement of energy by U.S. food labeling legislation include measurement of gross calories by bomb calorimetry with an adjustment for undigested protein and by calculation using specific factors for the energy values of protein, carbohydrate less the amount of insoluble dietary fiber, and total fat. The ability of NIRS to predict the energy value of diverse, processed and unprocessed cereal food products was investigated. NIR spectra of cereal products were obtained with an NIR Systems monochromator and the wavelength range used for analysis was 1104-2494 nm. Gross energy of the foods was measured by oxygen bomb calorimetry (Parr Manual No. 120) and expressed as calories per gram (CPG1, range 4.05-5.49 cal/g). Energy value was adjusted for undigested protein (CPG2, range 3.99-5.38 cal/g) and for undigested protein and insoluble dietary fiber (CPG3, range 2.42-5.35 cal/g). Using a multivariate analysis software package (ISI International, Inc.), partial least squares models were developed for the prediction of energy content. The standard error of cross validation and multiple coefficient of determination for CPG1 using modified partial least squares regression (n=127) were 0.060 cal/g and 0.95, respectively, and the standard error of performance, coefficient of determination, bias and slope using an independent validation set (n=59) were 0.057 cal/g, 0.98, -0.027 cal/g and 1.05, respectively. The PLS loading for factor 1 (Pearson correlation coefficient 0.92) had significant absorption peaks correlated to C-H stretch groups in lipid at 1722/1764 nm and 2304/2346 nm and O-H groups in carbohydrate at 1434 and 2076 nm. Thus the model appeared to be predominantly influenced by lipid and carbohydrate. Models for CPG2 and CPG3 showed similar trends with standard errors of performance, using the independent validation set, of 0.058 and 0.088 cal/g, respectively, and coefficients of determination of 0.96. Thus NIRS provides a rapid and efficient method of predicting energy content of diverse cereal foods.
