• Title/Summary/Keyword: PLS Regression

Shrinkage Structure of Ridge Partial Least Squares Regression

  • Kim, Jong-Duk
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.327-344 / 2007
  • Ridge partial least squares regression (RPLS) is a regression method obtained by combining ridge regression and partial least squares regression; it is intended to provide better predictive ability and to be less sensitive to overfitting. In this paper, explicit expressions for the shrinkage factor of RPLS are developed. The structure of the shrinkage factor is explored and compared, using a near-infrared data set, with those of other biased regression methods such as ridge regression, principal component regression, ridge principal component regression, and partial least squares regression.
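
As a point of reference for the comparison this abstract describes, the textbook shrinkage factors of ridge regression and principal component regression along the principal directions of X can be sketched as below. This is not the paper's RPLS expression, which the abstract does not give; the synthetic data, the ridge parameter `k`, and the function names are illustrative assumptions.

```python
import numpy as np

def ridge_shrinkage(singular_values, k):
    """Ridge shrinkage factor d_i^2 / (d_i^2 + k) for each principal direction."""
    d2 = np.asarray(singular_values, dtype=float) ** 2
    return d2 / (d2 + k)

def pcr_shrinkage(n_directions, n_components):
    """PCR keeps the first n_components directions (factor 1) and drops the rest (factor 0)."""
    factors = np.zeros(n_directions)
    factors[:n_components] = 1.0
    return factors

# Synthetic, column-centered predictor matrix (assumption: stands in for real spectra).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
X -= X.mean(axis=0)

d = np.linalg.svd(X, compute_uv=False)   # singular values, largest first
ridge_f = ridge_shrinkage(d, k=1.0)      # smooth shrinkage, strongest on small d_i
pcr_f = pcr_shrinkage(len(d), n_components=2)  # hard 0/1 truncation
```

Ridge shrinks every direction smoothly and hits low-variance directions hardest, whereas PCR truncates abruptly; the shrinkage behaviour of PLS-type methods is the subject of the paper itself.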

Discrimination of Cultivars and Cultivation Origins from the Sepals of Dry Persimmon Using FT-IR Spectroscopy Combined with Multivariate Analysis (FT-IR 스펙트럼 데이터의 다변량 통계분석을 이용한 곶감의 원산지 및 품종 식별)

  • Hur, Suel Hye;Kim, Suk Weon;Min, Byung Whan
    • Korean Journal of Food Science and Technology / v.47 no.1 / pp.20-26 / 2015
  • This study aimed to establish a rapid system for discriminating the cultivation origins and cultivars of dry persimmons, using metabolite fingerprinting by Fourier transform infrared (FT-IR) spectroscopy combined with multivariate analysis. Whole-cell extracts from the sepals of four Korean cultivars and two different Chinese dry persimmons were subjected to FT-IR spectroscopy. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) of the FT-IR spectral data successfully discriminated the six dry persimmons into two groups depending on their cultivation origins. Principal component loading values showed that the 1750-1420 and 1190-950 $cm^{-1}$ regions of the FT-IR spectra were significantly important for the discrimination of cultivation origins. The accuracy of prediction of the cultivation origins and cultivars by PLS regression was 100% (p<0.01) and 85.9% (p<0.05), respectively. These results clearly show that metabolic fingerprinting of FT-IR spectra can be applied for rapid discrimination of the cultivation origins and cultivars of commercial dry persimmons.

Non-Destructive Prediction of Head Rice Ratios using NIR Spectra of Hulled Rice (정조 상태에서 백미에 대한 완전미율의 비파괴 예측)

  • Kwon, Young-Rip;Cho, Seung-Hyun;Lee, Jae-Heung;Seo, Kyoung-Won;Choi, Dong-Chil
    • KOREAN JOURNAL OF CROP SCIENCE / v.53 no.3 / pp.244-250 / 2008
  • The purpose of this study was to measure fundamental data required for the prediction of milling ratios, and to develop regression models to predict the head rice ratio of milled rice using NIR spectra of hulled rice. A total of 81 rice samples used in this study were collected from Jeongeup, Jeonbuk province in 2006. NIR spectra were measured in a single mode, reflectance. The reflectance spectra were measured in the wavelength region of 400-2500 nm with an NIR spectrophotometer "NIRSystems 6500" (Foss, Silver Spring, USA). Calibration equations were developed by modified partial least squares (MPLS), partial least squares (PLS), and principal components regression (PCR). Math treatments were 1-4-4-1, 1-10-10-1, 2-4-4-1, and 2-10-10-1. The software used was WinISI (Infrasoft International, State College, USA). The automatic head rice production and quality checking system used was "SY2000-AHRPQCS" (Ssangyong, Korea). When the calibration was made with the first derivative at an 8 nm spectral interval, the determination coefficients of head rice ratios were 0.8353, 0.8416 and 0.5277 for MPLS, PLS and PCR, respectively. Those obtained with a 20 nm interval were 0.8144, 0.8354 and 0.6908 for MPLS, PLS and PCR, respectively. When the calibration was made with the second derivative at an 8 nm interval, the determination coefficients of head rice ratios were 0.7994, 0.8017 and 0.4473 for MPLS, PLS and PCR, respectively. Those with a 20 nm interval were 0.8004, 0.8493 and 0.6609 for MPLS, PLS and PCR, respectively. These results indicate that the determination coefficients of MPLS and PLS are higher than that of PCR.

MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.1245-1245 / 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods have been investigated to facilitate the detection of adulterated or mis-labelled foods & food ingredients but most of these require sophisticated equipment, highly-qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000). Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? Is the red meat beef or lamb? Is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.

MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.1152-1152 / 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods have been investigated to facilitate the detection of adulterated or mis-labelled foods & food ingredients but most of these require sophisticated equipment, highly-qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000). Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? Is the red meat beef or lamb? Is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.

PREPROCESSING EFFECTS ON ON-LINE SSC MEASUREMENT OF FUJI APPLE BY NIR SPECTROSCOPY

  • Ryu, D.S.;Noh, S.H.;Hwang, I.G.
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2000.11c / pp.560-568 / 2000
  • The aims of this research were to investigate the effect of spectral preprocessing on prediction performance and to develop a robust model to predict soluble solids content (SSC) in intact apples. Spectrum data of 320 Fuji apples were measured with the on-line transmittance measurement system in the wavelength range of 550-1100 nm. The preprocessing methods adopted for the tests were Savitzky-Golay smoothing, MSC, SNV, first derivative and OSC. Several combinations of those methods were applied to the raw spectrum data set to investigate the relative effect of each method on the performance of the calibration model. The PLS method was used to regress the SSCs of the samples on the preprocessed data set, and cross-validation was used to select the optimal number of PLS factors. Smoothing and scatter correction were essential in increasing the prediction performance of the PLS regression model, and OSC contributed to a reduction in the number of PLS factors. The first derivative had an unfavorable effect on the prediction performance. MSC and SNV showed similar effects. A robust calibration model could be developed by the preprocessing combination of Savitzky-Golay smoothing, MSC and OSC, which resulted in SEP = 0.507, bias = 0.032 and $R^2$ = 0.8823.
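
Two of the scatter corrections named in this abstract, SNV and MSC, and the SEP and bias statistics it reports can be sketched with their commonly used definitions. The abstract does not state the exact formulas the authors used, so the definitions below (for example the n - 1 divisor in SEP), the synthetic spectra, and the function names are assumptions for illustration.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, x in enumerate(spectra):
        slope, intercept = np.polyfit(ref, x, 1)  # fit x ~ intercept + slope * ref
        corrected[i] = (x - intercept) / slope
    return corrected

def bias_and_sep(y_ref, y_pred):
    """Bias = mean prediction error; SEP = bias-corrected standard deviation of the errors."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_ref, dtype=float)
    bias = err.mean()
    sep = np.sqrt(np.sum((err - bias) ** 2) / (len(err) - 1))
    return bias, sep

# Synthetic spectra: one underlying band with different offsets and gains (pure scatter).
wavelengths = np.linspace(550, 1100, 56)
base = np.exp(-((wavelengths - 800) / 120) ** 2)
raw = np.array([0.1 + 1.0 * base, 0.3 + 1.5 * base, -0.2 + 0.7 * base])

raw_snv = snv(raw)
raw_msc = msc(raw)
```

On this synthetic set the three spectra differ only by offset and gain, so both corrections collapse them onto a single curve, which is consistent with the abstract's observation that MSC and SNV behave similarly.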

A Comparison Analysis among Structural Equation Modeling (AMOS, LISREL and PLS) using the Same Data (동일 데이터를 이용한 구조방정식(AMOS, LISREL and PLS) 툴 간의 비교분석)

  • Nam, Soo-tai;Kim, Do-goan;Jin, Chan-yong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.05a / pp.131-134 / 2018
  • Structural equation modeling refers to statistical procedures that simultaneously perform path analysis and confirmatory factor analysis. Today, this statistical procedure is an essential tool for researchers in the social sciences. AMOS, LISREL and PLS are representative tools that can perform structural equation modeling analysis. AMOS provides a convenient graphical user interface for beginners. PLS has the advantage of not requiring normally distributed data, and it also offers a graphical user interface. Therefore, we compared and analyzed these three tools, the most commonly used in the social sciences. This study suggests practical and theoretical implications based on the results.

Multivariate Statistical Analysis Approach to Predict the Reactor Properties and the Product Quality of a Direct Esterification Reactor for PET Synthesis (다변량 통계분석법을 이용한 PET 중합공정 중 직접 에스테르화 반응기의 거동 및 생산제품 예측)

  • Kim Sung Young;Chung Chang Bock;Choi Soo Hyoung;Lee Bomsock;Lee Bomsock
    • Journal of Institute of Control, Robotics and Systems / v.11 no.6 / pp.550-557 / 2005
  • Two multivariate statistical analysis methods, multiple linear regression (MLR) and partial least squares (PLS), have been applied to predict the reactor properties and the product quality of a direct esterification reactor for polyethylene terephthalate (PET) synthesis. The two methods were applied to a data set including the flow rate of water vapor, the flow rate of EG vapor, the concentration of acid end groups of the product, and other operating conditions such as temperature, pressure, reaction time and feed monomer mole ratio. Their regression and prediction abilities have also been compared. The reliability of the predictions is critically compared against actual plant data and against results from other mathematical models. This paper shows that the PLS method can be used for reasonably accurate prediction of the product quality of a direct esterification reactor in the PET synthesis process.
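
The MLR versus PLS comparison in this abstract can be illustrated with a minimal sketch: ordinary least squares for MLR and a single-response PLS (PLS1) fitted with the standard NIPALS iterations. The data here are synthetic rather than plant data, and `mlr_fit`, `pls1_fit`, and the variable names are hypothetical helpers, not the authors' implementation.

```python
import numpy as np

def mlr_fit(X, y):
    """Multiple linear regression by least squares; returns (coefficients, intercept)."""
    A = np.column_stack([np.ones(len(X)), X])
    sol = np.linalg.lstsq(A, y, rcond=None)[0]
    return sol[1:], sol[0]

def pls1_fit(X, y, n_components):
    """PLS1 via NIPALS on centered data; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight vector: direction of max covariance with y
        t = Xk @ w                    # score vector
        tt = t @ t
        p = Xk.T @ t / tt             # X loading
        q = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q * t               # deflate y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    coef = W @ np.linalg.solve(P.T @ W, Q)
    return coef, y_mean - x_mean @ coef

# Synthetic operating data (assumption): 40 observations, 4 predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 3.0 + 0.1 * rng.normal(size=40)

mlr_coef, mlr_intercept = mlr_fit(X, y)
pls_coef, pls_intercept = pls1_fit(X, y, n_components=4)   # full rank: matches MLR
pls2_coef, pls2_intercept = pls1_fit(X, y, n_components=2) # fewer factors: regularized
```

With as many factors as predictors, PLS1 reproduces the MLR solution; using fewer factors trades some fit for stability, which matters most when process variables are strongly correlated.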

A Case Study of Housing Regeneration Projects in Yonnam-dong and Buk Gajwa-dong, Seoul: The Determinants of Satisfaction of Elderly Residents (정비예정구역 해제지역 재생사업의 정비요소와 고령거주자의 사업 만족도 간의 영향관계 사례연구 - 서울시 연남동, 북가좌동 시범사업지를 중심으로 -)

  • Kim, Ah-Leum;Koo, Ja-Hoon
    • Journal of the Korean housing association / v.27 no.5 / pp.11-23 / 2016
  • The purpose of this study is to establish the determinants of satisfaction with the results of housing regeneration projects among their elderly residents, and to suggest policy implications. The survey included questionnaires about satisfaction levels with the projects' physical and non-physical maintenance factors. The results were statistically analyzed by correlation analysis and PLS regression analysis. The study found, firstly, that the physical factors had a larger effect on elderly residents' satisfaction than the non-physical factors (such as home improvement and management support, community support, the economic foundations and professional support). Secondly, among the non-physical factors, economic factors such as job offers for seniors were found to be highly influential in both regions, Yonnam-dong and Bukgajwa-dong. Finally, factors with a large visual effect on landscape improvement, such as electrical maintenance work, tree planting, a "Green" parking plan, and the installation of refuse bins, as well as maintenance that contributes to the greenery of the area, were judged to be important.

Selecting Significant Wavelengths to Predict Chlorophyll Content of Grafted Cucumber Seedlings Using Hyperspectral Images

  • Jang, Sung Hyuk;Hwang, Yong Kee;Lee, Ho Jun;Lee, Jae Su;Kim, Yong Hyeon
    • Korean Journal of Remote Sensing / v.34 no.4 / pp.681-692 / 2018
  • This study was performed to select the significant wavelengths for predicting the chlorophyll content of grafted cucumber seedlings using hyperspectral images. The visible and near-infrared (VNIR) images and the short-wave infrared images of cucumber cotyledon samples were measured by two hyperspectral cameras. A correlation coefficient spectrum (CCS), stepwise multiple linear regression (SMLR), and partial least squares (PLS) regression were used to determine significant wavelengths. Wavelengths at 501, 505, 510, 543, 548, 619, 718, 723, and 727 nm were selected by CCS, SMLR, and PLS as significant for estimating chlorophyll content. The calibration models built by SMLR and PLS showed a fair relationship between measured and predicted chlorophyll concentration. It was concluded that the hyperspectral imaging technique in the VNIR region is effective for non-destructively estimating the chlorophyll content of grafted cucumber leaves.
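
A correlation coefficient spectrum of the kind mentioned in this abstract can be sketched as the Pearson correlation between the measured response and the reflectance at each wavelength, followed by picking the wavelengths with the largest absolute correlation. The abstract does not describe the authors' exact procedure, so this is an assumed minimal version on synthetic data; the informative band at 720 nm and the function names are hypothetical.

```python
import numpy as np

def correlation_spectrum(spectra, target):
    """Pearson correlation between the target and each wavelength column."""
    Xc = spectra - spectra.mean(axis=0)
    yc = target - target.mean()
    numerator = Xc.T @ yc
    denominator = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return numerator / denominator

def top_wavelengths(wavelengths, corr, n):
    """Return the n wavelengths with the largest |correlation|, sorted ascending."""
    order = np.argsort(-np.abs(corr))[:n]
    return np.sort(wavelengths[order])

# Synthetic data (assumption): 50 samples, noise everywhere except one informative band.
rng = np.random.default_rng(2)
wavelengths = np.arange(400, 1001, 10)
chlorophyll = rng.uniform(20, 50, size=50)
spectra = rng.normal(size=(50, wavelengths.size))
band = np.where(wavelengths == 720)[0][0]      # hypothetical informative band
spectra[:, band] = 0.02 * chlorophyll + 0.01 * rng.normal(size=50)

ccs = correlation_spectrum(spectra, chlorophyll)
selected = top_wavelengths(wavelengths, ccs, n=1)
```

A screening like this ranks wavelengths one at a time; stepwise regression and PLS, also used in the study, account for combinations of wavelengths instead.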