• Title/Summary/Keyword: Partial least squares regression (부분최소제곱회귀, PLSR)


Partial least squares regression theory and application in spectroscopic diagnosis of total hemoglobin in whole blood (부분최소제곱회귀(Partial Least Squares Regression) 이론과 분광학적 혈중 헤모글로빈 진단에의 응용)

  • 김선우;김연주;김종원;윤길원
    • The Korean Journal of Applied Statistics / v.10 no.2 / pp.227-239 / 1997
  • PLSR is a powerful multivariate statistical tool that has been successfully applied to the quantitative analysis of data in spectroscopy, chemistry, and industrial process control. Spectroscopic data are represented by a spectrum matrix measured at many wavelengths. Noise of many kinds and intercorrelation between wavelengths are common in such data. PLSR uses the whole data set measured at many wavelengths in the analysis and handles these problems through data compression. We investigated PLSR theory and applied the method to data for the spectroscopic diagnosis of total hemoglobin in whole blood.
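The data compression described in the abstract above can be illustrated with a minimal one-component PLS1 sketch. This is not the authors' implementation: the function names (`fit_pls1_one_component`, `predict`) and the toy spectra are hypothetical, and a practical model would extract several components and choose their number by cross-validation.

```python
# Minimal one-component PLS1 sketch in pure Python (illustrative only).
# X: rows are samples (spectra), columns are wavelengths; y: reference values.

def mean(values):
    return sum(values) / len(values)

def fit_pls1_one_component(X, y):
    n, p = len(X), len(X[0])
    # Centre each wavelength column and the response.
    x_mean = [mean([row[j] for row in X]) for j in range(p)]
    y_mean = mean(y)
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the direction in wavelength space whose scores
    # have maximal covariance with y (proportional to Xc^T yc).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores: every spectrum is compressed to a single latent value.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centred response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return {"x_mean": x_mean, "y_mean": y_mean, "w": w, "q": q}

def predict(model, spectrum):
    t = sum((x - m) * w
            for x, m, w in zip(spectrum, model["x_mean"], model["w"]))
    return model["y_mean"] + model["q"] * t

# Made-up toy data: three perfectly intercorrelated "wavelengths".
X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
y = [10.0, 20.0, 30.0, 40.0]
model = fit_pls1_one_component(X, y)
prediction = predict(model, [2.5, 5.0, 7.5])
```

Because the three toy columns are collinear, ordinary least squares would be ill-posed here, whereas the single latent score captures all the variation relevant to `y`.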


Development of Virtual Metrology Models in Semiconductor Manufacturing Using Genetic Algorithm and Kernel Partial Least Squares Regression (유전알고리즘과 커널 부분최소제곱회귀를 이용한 반도체 공정의 가상계측 모델 개발)

  • Kim, Bo-Keon;Yum, Bong-Jin
    • IE interfaces / v.23 no.3 / pp.229-238 / 2010
  • Virtual metrology (VM), a critical component of semiconductor manufacturing, is an efficient way of assessing the quality of wafers that are not actually measured. It relies on a model relating equipment sensor data (obtained for all wafers) to the quality characteristics of the wafers that are actually measured. This paper considers principal component regression (PCR), partial least squares regression (PLSR), kernel PCR (KPCR), and kernel PLSR (KPLSR) as VM models. For each regression model, two cases are considered: one uses all explanatory variables in developing the model, and the other selects significant variables using the genetic algorithm (GA). The prediction performances of the resulting 8 regression models are compared on short- and long-term etch process data. Among other findings, the GA-KPLSR model performs best for both types of data. In particular, its prediction ability is within the requirement for the short-term data, implying that it can be used to implement VM for real etch processes.

Estimated Soft Information based Most Probable Classification Scheme for Sorting Metal Scraps with Laser-induced Breakdown Spectroscopy (레이저유도 플라즈마 분광법을 이용한 폐금속 분류를 위한 추정 연성정보 기반의 최빈 분류 기술)

  • Kim, Eden;Jang, Hyemin;Shin, Sungho;Jeong, Sungho;Hwang, Euiseok
    • Resources Recycling / v.27 no.1 / pp.84-91 / 2018
  • In this study, a novel soft-information-based most probable classification scheme is proposed for sorting recyclable metal alloys with laser-induced breakdown spectroscopy (LIBS). Regression analysis of LIBS spectra to estimate the concentrations of common elements can be efficient for classifying unknown, arbitrary metal alloys, even when a particular alloy was not included in training. Therefore, partial least squares regression (PLSR) is employed in the proposed scheme, with spectra of certified reference materials (CRMs) used for training. With the PLSR model, the concentrations for a test spectrum are estimated independently and compared with those of the CRMs to find the most probable class. Joint soft information can then be obtained by assuming a multivariate normal (MVN) distribution, which makes it possible to account for a probability measure or prior information and improves classification performance. To evaluate the proposed scheme, MVN soft information is computed from PLSR of LIBS spectra of 9 metal CRMs and tested on the classification of unknown metal alloys. Furthermore, the likelihood is displayed on a radar chart to visualize and search for the most probable class among the candidates. In leave-one-out cross-validation tests, the proposed scheme not only shows improved classification accuracy but also helps adaptive post-processing correct misclassifications.

Discrimination of Internally Browned Apples Utilizing Near-Infrared Non-Destructive Fruit Sorting System (근적외선 비파괴 과일 선별 시스템을 활용한 내부 갈변 사과의 판별)

  • Kim, Bal Geum;Lim, Jong Guk
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.1 / pp.208-213 / 2021
  • There is a lack of studies comparing the internal quality of fruit with its external quality. However, internal quality issues such as internal browning are important. We propose a method of classifying normal apples and internally browned apples using a near-infrared (NIR) non-destructive system. Specifically, we found the optimal wavelength and spectral characteristics for determining the internal browning of Fuji apples. The NIR spectra of apples were obtained in the wavelength range of 470-1150 nm. A group of normal apples and a group of internally browned apples were identified using principal component analysis (PCA), and partial least squares regression (PLSR) was performed to develop and evaluate the discriminant model. The PCA revealed a clear difference between the normal and internally browned apples. From the PLSR, the correlation coefficient of the predictive model without pretreatment was 0.902 with an RMSE of 0.157. The correlation coefficient of the predictive model with pretreatment was 0.906 with an RMSE of 0.154. The results show that this model is suitable for classifying normal and internally browned apples and that it can be applied to the sorting and evaluation of agricultural products for internal and external defects.

A Study on the Performance Characteristics of Portable Analyzer for Determination of Sugar Content in Citrus Unshiu using Near Infrared Spectroscopy (근적외선 분광기술을 이용한 휴대용 감귤 당도 선과기 성능특성에 관한 연구)

  • Yoon, Sung-Un;Ma, Sang-Dong;Kim, Myung-Yun;Kim, Jae-Yeol
    • Transactions of the Korean Society of Machine Tool Engineers / v.15 no.5 / pp.1-6 / 2006
  • The purpose of this study is to develop a portable near-infrared analyzer that measures the sugar content of fruit on the tree before harvest. The portable near-infrared system consists of a tungsten lamp, a coaxial optical fiber bundle, and a multi-channel detector, which has 256 pixels and a concave transmission grating. Reflectance NIR spectra of oranges were recorded using the coaxial optical fiber bundle. The spectra were collected over the spectral range 400-1100 nm. Partial least squares regression (PLSR) was applied for calibration and validation in determining sugar content. The multiple correlation coefficient was 0.99 and the standard error of calibration (SEC) was 0.069 brix. The calibration model predicted the sugar content of the validation set with a standard error of prediction (SEP) of 0.092 brix. The sugar content of the fruit was successfully quantified using the portable near-infrared analyzer.

Determination of Sugar Content in Pears Using a Portable Sugar Content Sensor (휴대형 당도판정센서를 이용한 배의 당도 판정)

  • 이강진;최규홍;강석원;최완규;손재룡
    • Proceedings of the Korean Society of Postharvest Science and Technology of Agricultural Products Conference / 2003.04a / pp.120-121 / 2003
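  • Pears grown in orchards vary in quality depending on factors such as position within the orchard, fertilization, and soil. Because sugar content and ripeness vary widely, growers rely on experience to harvest the pears they judge to be at the right maturity. Harvesting practices that are not scientifically grounded, however, erode consumer confidence in the pears sold on the market, which leads to lower consumption and lower farm income. Non-destructive sorters that use near-infrared light to determine the internal sugar content, acidity, and defects of fruit in real time have recently been introduced at fruit and vegetable distribution centers in producing regions nationwide, but they are intended for post-harvest sorting and standardized distribution. In contrast, this study developed a portable sensor that measures visible and near-infrared reflectance spectra of pears still hanging on the tree and uses them to determine sugar content and ripeness, so that the optimal harvest time can be judged before harvest, that is, at the cultivation stage. The feasibility of sugar content determination was tested with the resulting prototype. The portable sugar content sensor consists of a light source, a fiber-optic probe, a photodetector unit, a sugar content determination unit, and a power supply unit. A halogen lamp (6 V) was used as the light source. The fiber-optic probe is concentric: the outer fibers deliver light from the source to the sample, and the inner fibers carry the diffusely reflected light to the photodetector. The power supply unit consists of a portable, rechargeable battery (12 V, 2 Ah) and a circuit built to deliver a constant current from the battery to the light source. Reflectance spectra in the 518 nm to 1046 nm wavelength band were used for sugar content determination, and a white Teflon sphere was fabricated and used as the reference. Shingo (신고) pears from the 2002 harvest were purchased at the Suwon agricultural wholesale market, and reflectance spectra of a total of 113 pears were measured with the prototype. Sugar content was then measured with a refractometer, and a partial least squares regression (PLSR) model was developed to predict sugar content from the reflectance spectra. The precision of the model was verified by cross-validation. Because the measured reflectance spectra varied considerably owing to uneven contact between the sample surface and the fiber-optic probe, drift of the light source over time, and differences in fruit shape, the spectra were corrected by multiplicative scatter correction before use. The resulting PLSR model for sugar content prediction had a coefficient of determination ($R^2$) of 0.67 and an SEC of $\pm$0.4 brix. In predicting unknown samples by cross-validation, the coefficient of determination for the 113 samples was 0.57 and the SEP was $\pm$0.46 brix, which was judged sufficient for use in the field. In future work, we plan to reduce the volume and weight of the whole system and to minimize the power consumption of each component.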
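The scatter correction mentioned in the abstract above can be sketched as follows, assuming it refers to standard multiplicative scatter correction (MSC): each spectrum is regressed on the mean spectrum, and the fitted offset and slope are then removed. The function name `msc` and the toy spectra are hypothetical and are not taken from the paper.

```python
# Multiplicative scatter correction (MSC) sketch in pure Python.
# Assumption: the reference spectrum is the mean of all input spectra.

def msc(spectra):
    n, p = len(spectra), len(spectra[0])
    # Reference spectrum: mean reflectance at each wavelength.
    ref = [sum(s[j] for s in spectra) / n for j in range(p)]
    ref_mean = sum(ref) / p
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / p
        # Least-squares fit of s = offset + slope * ref.
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(ref, s)) / ref_ss
        offset = s_mean - slope * ref_mean
        # Remove the additive and multiplicative scatter effects.
        corrected.append([(v - offset) / slope for v in s])
    return corrected, ref

# Made-up toy spectra: the same underlying shape with different
# offsets and gains, mimicking probe-contact and lamp-drift variation.
base = [1.0, 2.0, 4.0]
spectra = [
    base,
    [2.0 * v + 1.0 for v in base],
    [0.5 * v + 0.5 for v in base],
]
corrected, ref = msc(spectra)
```

In this toy case every spectrum is an exact affine copy of the same shape, so all corrected spectra collapse onto the reference; real spectra would retain the chemical differences that the regression model then uses.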


Study of Prediction Model Improvement for Apple Soluble Solids Content Using a Ground-based Hyperspectral Scanner (지상용 초분광 스캐너를 활용한 사과의 당도예측 모델의 성능향상을 위한 연구)

  • Song, Ahram;Jeon, Woohyun;Kim, Yongil
    • Korean Journal of Remote Sensing / v.33 no.5_1 / pp.559-570 / 2017
  • A partial least squares regression (PLSR) model was developed to map the internal soluble solids content (SSC) of apples using a ground-based hyperspectral scanner that can acquire data outdoors and capture images of large quantities of apples at once. We evaluated the applicability of various preprocessing techniques for constructing an optimal prediction model and identified the optimal bands using the variable importance in projection (VIP) score. From the 515 bands of hyperspectral images extracted at wavelengths of 360-1019 nm, 70 reflectance spectra of apples were extracted, and the SSC (°Brix) was measured using a digital photometer. The optimal prediction model was selected considering the root-mean-square error of cross-validation (RMSECV), the root-mean-square error of prediction (RMSEP), and the coefficient of determination of prediction ($r_p^2$). Multiplicative scatter correction (MSC)-based preprocessing methods performed better than the others. For example, when a combination of MSC and standard normal variate (SNV) was used, RMSECV and RMSEP were the lowest at 0.8551 and 0.8561, and $r_c^2$ and $r_p^2$ were the highest at 0.8533 and 0.6546; wavelength ranges of 360-380, 546-690, 760, 915, 931-939, 942, 953, 971, 978, 981, 988, and 992-1019 nm were the most influential for SSC determination. The PLSR model built on the spectral values of these regions showed that RMSEP decreased to 0.6841 and $r_p^2$ increased to 0.7795 compared with the values for the entire wavelength band. This study confirmed the feasibility of using hyperspectral scanner images obtained outdoors for SSC measurement of apples. These results indicate that the application of field data and sensors could expand in the future.
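The standard normal variate (SNV) step named in the abstract above can be sketched as below. This is a generic illustration, not the authors' pipeline: the function name `snv` and the example values are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumed convention.

```python
import statistics

# Standard normal variate (SNV) sketch: each spectrum is centred on its
# own mean and scaled by its own standard deviation, independently of
# every other spectrum.

def snv(spectrum):
    mu = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(v - mu) / sd for v in spectrum]

# Made-up reflectance values and a copy with a different gain and offset.
raw = [0.2, 0.4, 0.6, 0.8]
scaled_copy = [2.0 * v + 0.1 for v in raw]

snv_raw = snv(raw)
snv_scaled = snv(scaled_copy)
```

The two transformed spectra coincide up to floating-point error, which is the point of SNV: a per-spectrum gain and offset are removed without needing a reference spectrum, unlike MSC.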

Establishment of discrimination system using multivariate analysis of FT-IR spectroscopy data from different species of artichoke (Cynara cardunculus var. scolymus L.) (FT-IR 스펙트럼 데이터 기반 다변량통계분석기법을 이용한 아티초크의 대사체 수준 품종 분류)

  • Kim, Chun Hwan;Seong, Ki-Cheol;Jung, Young Bin;Lim, Chan Kyu;Moon, Doo Gyung;Song, Seung Yeob
    • Horticultural Science & Technology / v.34 no.2 / pp.324-330 / 2016
  • To determine whether multivariate analysis of FT-IR spectra from whole-cell extracts can be used to discriminate between artichoke (Cynara cardunculus var. scolymus L.) plants at the metabolic level, leaves of ten artichoke plants were subjected to Fourier transform infrared (FT-IR) spectroscopy. FT-IR spectral data from the leaves were analyzed by principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and hierarchical clustering analysis (HCA). The FT-IR spectra showed typical spectral differences in the frequency regions of 1,700-1,500, 1,500-1,300, and 1,100-950 cm$^{-1}$. These spectral regions reflect quantitative and qualitative variations in amide I and II from amino acids and proteins (1,700-1,500 cm$^{-1}$), phosphodiester groups from nucleic acids and phospholipids (1,500-1,300 cm$^{-1}$), and carbohydrate compounds (1,100-950 cm$^{-1}$). PCA revealed separate clusters that corresponded to the species relationships. Thus, PCA could be used to distinguish between artichoke species with different metabolite contents. PLS-DA showed a similar species classification of artichoke. Furthermore, these metabolic discrimination systems could be used for the rapid selection and classification of useful artichoke cultivars.

Estimation of Moisture Content in Cucumber and Watermelon Seedlings Using Hyperspectral Imagery (초분광영상 이용 오이 및 수박 묘의 수분함량 추정)

  • Kim, Seong-Heon;Kang, Jeong-Gyun;Ryu, Chan-Seok;Kang, Ye-Seong;Sarkar, Tapash Kumar;Kang, Dong Hyeon;Ku, Yang-Gyu;Kim, Dong-Eok
    • Journal of Bio-Environment Control / v.27 no.1 / pp.34-39 / 2018
  • This research was conducted to estimate the moisture content of Cucurbitaceae seedlings, namely cucumber and watermelon, using hyperspectral imagery. Using a hyperspectral image acquisition system, the reflectance of the leaf area of cucumber and watermelon seedlings was calculated after water stress was applied. The moisture content of each seedling was then measured using a drying oven. Finally, moisture content estimation models were developed from the reflectance and moisture content by PLSR analysis. The cucumber model achieved an $R^2$ of 0.73, an RMSE of 1.45%, and an RE of 1.58%. The watermelon model achieved an $R^2$ of 0.66, an RMSE of 1.06%, and an RE of 1.14%. The cucumber model performed slightly better after one sample was removed from the cucumber seedlings as an outlier: the new model achieved an $R^2$ of 0.79, an RMSE of 1.10%, and an RE of 1.20%. The model built from all samples combined achieved an $R^2$ of 0.67, an RMSE of 1.26%, and an RE of 1.36%. The cucumber model performed better than the watermelon model, which is attributed to the wider spread of the cucumber variables. Further, the accuracy and precision of the cucumber model increased when an insignificant sample was eliminated from the dataset. Both models are considered usable for estimating moisture content, as the slopes of their trend lines are almost the same and the lines intersect. The accuracy and precision of the estimation models could possibly be improved if they are constructed from variables with widely distributed variation. The improved models will be used as the basis for developing low-priced sensors.
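The performance figures quoted in the abstract above can be related to data through a short sketch of two of the metrics. The definitions are assumptions: $R^2$ is computed here as 1 - SS_res / SS_tot, and RMSE as the root of the mean squared residual. The abstract does not define RE, so it is omitted. The function names and the numbers are hypothetical, not values from the paper.

```python
# Sketch of two common model-performance metrics in pure Python.

def rmse(observed, predicted):
    # Root-mean-square error between observed and predicted values.
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5

def r_squared(observed, predicted):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up values standing in for oven-measured and model-estimated
# moisture contents.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

rmse_value = rmse(observed, predicted)
r2_value = r_squared(observed, predicted)
```

Since SS_tot grows with the spread of the observed values, a data set with a wider range can yield a higher $R^2$ for the same residual size, which is consistent with the abstract's remark about widely distributed variables.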