• Title/Summary/Keyword: least squares cross-validation

Search results: 88 (processing time: 0.025 s)

Prediction of Chemical Composition and Fermentation Parameters in Forage Sorghum and Sudangrass Silage using Near Infrared Spectroscopy

  • Park, Hyung-Soo;Lee, Sang-Hoon;Choi, Ki-Choon;Kim, Ji-Hye;So, Min-Jeong;Kim, Hyeon-Seop
    • Journal of The Korean Society of Grassland and Forage Science
    • /
    • Vol. 35, No. 3
    • /
    • pp.257-263
    • /
    • 2015
  • This study was conducted to assess the potential of near infrared spectroscopy (NIRS) to accurately determine the chemical composition and fermentation parameters of fresh coarse sorghum and sudangrass silage. NIRS has been increasingly used as a rapid and accurate method to analyze the quality of cereals and dried animal forage. However, analyzing dried and ground silage samples by NIRS is of limited use in farm-scale applications because the fermentative products are lost during the drying process. Fresh coarse silage samples were scanned at 1 nm intervals over the wavelength range of 680~2500 nm, and the optical data were obtained as log 1/Reflectance (log 1/R). The spectral data were regressed using partial least squares (PLS) multivariate analysis in conjunction with first- and second-order derivatization, with a scatter correction procedure (standard normal variate and detrend (SNV&D)) to reduce the effect of extraneous noise. The optimum calibrations were selected on the basis of minimizing the standard error of cross validation (SECV). The results showed that NIRS predicted the chemical constituents with a high degree of accuracy (the correlation coefficient of cross validation, $R^2_{cv}$, ranged from 0.86 to 0.96), except for crude ash, which had an $R^2_{cv}$ of 0.68. Comparison of the mathematical treatments for raw spectra showed that the second-order derivatization procedure produced the best result for all parameters except neutral detergent fiber (NDF). The best mathematical treatment for moisture, acid detergent fiber (ADF), crude protein (CP) and pH was 2,16,16, while the best mathematical treatment for crude ash, lactic acid and total acid was 2,8,8. Among the fermentation products, calibrations were poorer for acetic and butyric acid (RPD < 2.5). The pH, lactic acid and total acids were predicted with considerable accuracy ($R^2_{cv}$ 0.72~0.77). This study indicated that NIRS calibrations based on fresh coarse sorghum and sudangrass silage spectra have the capability of assessing forage quality.
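The standard normal variate (SNV) part of the scatter correction named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the code below assumes the common per-spectrum form (subtract the spectrum's own mean, divide by its own standard deviation) and omits the detrending step. The spectra are hypothetical values, not data from the study.

```python
import math

def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it to unit standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 in the denominator), computed per spectrum.
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]

# Two hypothetical log(1/R) spectra that differ only by an additive offset
# and a multiplicative scale, as scatter effects are often modeled.
base = [0.42, 0.45, 0.51, 0.60, 0.58, 0.49]
shifted = [2.0 * x + 0.3 for x in base]

snv_base = snv(base)
snv_shifted = snv(shifted)

# After SNV the two spectra coincide, because offset and scale are removed.
max_difference = max(abs(a - b) for a, b in zip(snv_base, snv_shifted))
```

Because SNV is applied to each spectrum independently, it needs no reference spectrum, which is one reason it is often paired with derivative treatments before PLS regression.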

Prediction of the Chemical Composition and Fermentation Parameters of Fresh Coarse Italian Ryegrass Haylage using Near Infrared Spectroscopy

  • Kim, Ji Hye;Park, Hyung Soo;Choi, Ki Choon;Lee, Sang Hoon;Lee, Ki-Won
    • Journal of The Korean Society of Grassland and Forage Science
    • /
    • Vol. 37, No. 4
    • /
    • pp.350-357
    • /
    • 2017
  • Near infrared spectroscopy (NIRS) is a rapid and accurate method for analyzing the quality of cereals and dried animal forage. However, one limitation of this method is its inability to measure fermentation parameters in dried and ground samples, because the fermentation products are volatile and are therefore lost during the drying process. To overcome this limitation, this study used fresh coarse haylage to test the potential of NIRS to accurately determine chemical composition and fermentation parameters. Fresh coarse Italian ryegrass haylage samples were scanned at 1 nm intervals over a wavelength range of 680 to 2500 nm, and optical data were recorded as log 1/reflectance. Spectral data, together with first- and second-order derivatives, were analyzed using partial least squares (PLS) multivariate regressions; scatter correction procedures (standard normal variate and detrend) were used to reduce the effect of extraneous noise. Optimum calibrations were selected based on their low standard error of cross validation (SECV) values. Further, the ratio of performance deviation (RPD), obtained by dividing the standard deviation of reference values by SECV values, was used to evaluate the reliability of predictive models. Our results showed that the NIRS method can predict chemical constituents accurately (correlation coefficient of cross validation, $R_{cv}^2$, ranged from 0.76 to 0.97); the exception to this result was crude ash ($R_{cv}^2=0.49$ and RPD = 2.09). Comparison of mathematical treatments for raw spectra showed that second-order derivatives yielded better predictions than first-order derivatives. The best mathematical treatment for DM, ADF, and NDF was 2, 16, 16, whereas the best mathematical treatment for CP and crude ash was 2, 8, 8. The calibration models for fermentation parameters had low predictive accuracy for acetic, propionic, and butyric acids (RPD < 2.5). However, pH, lactic acid, and total acids were predicted with considerable accuracy ($R_{cv}^2$ 0.73 to 0.78; RPD values exceeded 2.5), and the best mathematical treatment for them was 1, 8, 8. Our findings show that, when fresh haylage is used, NIRS-based calibrations are reliable for the prediction of haylage characteristics and are therefore useful for the assessment of forage quality.
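The abstract above defines RPD as the standard deviation of the reference values divided by SECV. A minimal sketch of that arithmetic follows. It assumes SECV is taken as the root mean square of the cross-validation residuals (some software uses a bias-corrected form), and the reference and predicted values are hypothetical numbers chosen for easy checking.

```python
import math

def secv(reference, cv_predicted):
    """Root mean square of cross-validation residuals (assumed SECV definition)."""
    residuals = [y - p for y, p in zip(reference, cv_predicted)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def sample_sd(values):
    """Sample standard deviation of the reference values."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def rpd(reference, cv_predicted):
    """RPD = SD of reference values / SECV, as described in the abstract."""
    return sample_sd(reference) / secv(reference, cv_predicted)

# Hypothetical laboratory values and their cross-validated predictions.
reference = [10.0, 12.0, 14.0, 16.0, 18.0]
cv_predicted = [11.0, 11.0, 15.0, 15.0, 19.0]

secv_value = secv(reference, cv_predicted)  # every residual is +1 or -1
rpd_value = rpd(reference, cv_predicted)    # SD of reference is sqrt(10)
meets_threshold = rpd_value > 2.5           # 2.5 is the cut-off quoted in the abstract
```

With these numbers the SECV is 1.0 and the RPD is about 3.16, so this toy calibration would fall on the "considerable accuracy" side of the 2.5 threshold used in the abstract.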

A Method of Feature Extraction on Motor Imagery EEG Using FLD and PCA Based on Sub-Band CSP (서브 밴드 CSP기반 FLD 및 PCA를 이용한 동작 상상 EEG 특징 추출 방법 연구)

  • Park, Sang-Hoon;Lee, Sang-Goog
    • Journal of KIISE
    • /
    • Vol. 42, No. 12
    • /
    • pp.1535-1543
    • /
    • 2015
  • A brain-computer interface (BCI) acquires a user's electroencephalogram (EEG) as an alternative communication channel for people with disabilities, so that the user can control machines simply by thinking instead of using hands or feet. In this paper, we propose a feature extraction method based on non-selected filters from sub-band CSP (SBCSP) to classify motor imagery EEG. First, we divide the frequency range (4~40 Hz) into 4-Hz units and apply CSP to each unit. Second, we obtain the FLD score vector by combining the FLD results. Finally, the FLD score vector is projected onto the optimal plane for classification using PCA. We use BCI Competition III dataset IVa, and the extracted features are used as input for LS-SVM. The classification accuracy of the proposed method was evaluated using $10 \times 10$ fold cross-validation. For subjects 'aa', 'al', 'av', 'aw', and 'ay', the results were $85.29 \pm 0.93\%$, $95.43 \pm 0.57\%$, $72.57 \pm 2.37\%$, $91.82 \pm 1.38\%$, and $93.50 \pm 0.69\%$, respectively.
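The evaluation protocol in this abstract can be sketched as follows, assuming that "10×10 fold cross-validation" means 10 repetitions of 10-fold cross-validation with the accuracy reported as mean ± standard deviation over repetitions. The abstract does not spell this out. The classifier here is a simple nearest-class-mean rule on one hypothetical feature, used only as a stand-in; it is not the LS-SVM of the paper, and the data are synthetic.

```python
import random
import statistics

def kfold_indices(n_samples, k, rng):
    """Shuffle sample indices and deal them into k folds."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nearest_mean_accuracy(features, labels, train, test):
    """Stand-in classifier: assign each test sample to the class with the closest training mean."""
    means = {}
    for c in set(labels[i] for i in train):
        vals = [features[i] for i in train if labels[i] == c]
        means[c] = sum(vals) / len(vals)
    correct = 0
    for i in test:
        predicted = min(means, key=lambda c: abs(features[i] - means[c]))
        correct += predicted == labels[i]
    return correct / len(test)

def repeated_kfold(features, labels, k=10, repeats=10, seed=0):
    """Return mean and standard deviation of accuracy over the repetitions."""
    rng = random.Random(seed)
    per_repeat = []
    for _ in range(repeats):
        folds = kfold_indices(len(features), k, rng)
        fold_acc = []
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            fold_acc.append(nearest_mean_accuracy(features, labels, train, test))
        per_repeat.append(statistics.mean(fold_acc))
    return statistics.mean(per_repeat), statistics.stdev(per_repeat)

# Synthetic two-class data: one feature, class means 0 and 3 (hypothetical).
data_rng = random.Random(1)
features = [data_rng.gauss(0.0, 1.0) for _ in range(100)] + [data_rng.gauss(3.0, 1.0) for _ in range(100)]
labels = [0] * 100 + [1] * 100

mean_acc, sd_acc = repeated_kfold(features, labels)
```

Repeating the k-fold split with different shuffles gives a spread across repetitions, which is one plausible source of the ± values quoted for each subject.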

Nonparametic Kernel Regression model for Rating curve (수위-유량곡선을 위한 비매개 변수적 Kernel 회귀모형)

  • Moon, Young-Il;Cho, Sung-Jin;Chun, Si-Young
    • Journal of Korea Water Resources Association
    • /
    • Vol. 36, No. 6
    • /
    • pp.1025-1033
    • /
    • 2003
  • Like workers in other fields, scientists and engineers in hydrology relate one variable to two or more other variables for purposes of prediction, optimization, and control. Statistical methods have been developed to establish such relationships. Regression is the most commonly used statistical technique in hydrologic fields; one example is the relationship between the monitored stage and the corresponding discharge (the rating curve). Regression methods expressed as mathematical equations with parameters are called parametric methods. Sometimes, the estimation of these parameters is complicated and uncertain. Many nonparametric regression methods, which have no such parameters, have been proposed and studied. The most popular of these is the kernel regression method. Kernel regression offers a way of estimating the regression function without specifying a parametric model. This paper compares several bandwidth selection methods that use least squares and cross-validation.
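Bandwidth selection by least-squares cross-validation can be sketched as below. The sketch assumes a Nadaraya-Watson estimator with a Gaussian kernel and a leave-one-out criterion; the abstract does not state which kernel or which variants were compared. The stage and discharge values are synthetic, and the candidate bandwidths are arbitrary.

```python
import math

def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the weighted average."""
    return math.exp(-0.5 * u * u)

def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0, optionally leaving out observation `skip`."""
    num = 0.0
    den = 0.0
    for j, (x, y) in enumerate(zip(xs, ys)):
        if j == skip:
            continue
        w = gaussian_kernel((x0 - x) / h)
        num += w * y
        den += w
    # If every weight underflows to zero, return infinity so this h is never selected.
    return num / den if den > 0.0 else math.inf

def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(xs)
    return sum((ys[i] - nw_estimate(xs[i], xs, ys, h, skip=i)) ** 2 for i in range(n)) / n

def select_bandwidth(xs, ys, candidates):
    """Pick the candidate bandwidth with the smallest cross-validation score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))

# Synthetic rating-curve data: discharge grows as a power of stage, plus a small wiggle.
stage = [0.5 + 0.25 * i for i in range(20)]
discharge = [5.0 * s ** 1.8 + 0.5 * math.sin(7 * i) for i, s in enumerate(stage)]

candidates = [0.1, 0.2, 0.4, 0.8, 1.6]
best_h = select_bandwidth(stage, discharge, candidates)
estimate_at_2m = nw_estimate(2.0, stage, discharge, best_h)
```

Leaving each observation out when predicting it is what keeps the criterion from always favoring the smallest bandwidth, which would otherwise reproduce the data exactly.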

Wavelength selection by loading vector analysis in determining total protein in human serum using near-infrared spectroscopy and Partial Least Squares Regression

  • Kim, Yoen-Joo;Yoon, Gil-Won
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • The Korean Society of Near Infrared Spectroscopy, 2001, NIR-2001
    • /
    • pp.4102-4102
    • /
    • 2001
  • In multivariate analysis, an absorbance spectrum is measured over a band of wavelengths. One does not often pay attention to the size of this wavelength band. However, it is desirable that the spectrum be measured only at the necessary wavelengths, as long as acceptable prediction accuracy can be met. In this paper, a method of selecting an optimal band of wavelengths based on loading vector analysis was proposed and applied to determining total protein in human serum using near-infrared transmission spectroscopy and PLSR. Loading vectors in the full-spectrum PLSR were used as the reference in selecting wavelengths, but only the first loading vector was used since it explains the spectrum best. Absorbance spectra of sera from 97 outpatients were measured at 1530∼1850 nm with an interval of 2 nm. Total protein concentrations of the sera ranged from 5.1 to 7.7 g/dL. Spectra were measured with a Cary 5E spectrophotometer (Varian, Australia). Serum in a 5 mm-pathlength cuvette was placed in the sample beam and air in the reference beam. Full-spectrum PLSR was applied to determine total protein from the sera. Next, the wavelength region of 1672∼1754 nm was selected based on the first loading vector analysis. The Standard Error of Cross Validation (SECV) of full-spectrum PLSR (1530∼1850 nm) and selected-wavelength PLSR (1672∼1754 nm) was 0.28 and 0.27 g/dL, respectively. The two bands thus gave nearly equal prediction accuracy. Wavelength selection based on the loading vector in PLSR seemed to be simple and robust in comparison with other methods based on the correlation plot, the regression vector, and the genetic algorithm. As a reference for wavelength selection in PLSR, the loading vector has an advantage over the correlation plot, since the former is based on a multivariate model whereas the latter is based on a univariate model. Wavelength selection by first loading vector analysis requires shorter computation time than selection by genetic algorithm and does not need smoothing.

Discrimination of cultivation ages and cultivars of ginseng leaves using Fourier transform infrared spectroscopy combined with multivariate analysis

  • Kwon, Yong-Kook;Ahn, Myung Suk;Park, Jong Suk;Liu, Jang Ryol;In, Dong Su;Min, Byung Whan;Kim, Suk Weon
    • Journal of Ginseng Research
    • /
    • Vol. 38, No. 1
    • /
    • pp.52-58
    • /
    • 2014
  • To determine whether Fourier transform (FT)-IR spectral analysis combined with multivariate analysis of whole-cell extracts from ginseng leaves can be applied as a high-throughput discrimination system for cultivation ages and cultivars, a total of 480 leaf samples belonging to 12 categories, corresponding to four different cultivars (Yunpung, Kumpung, Chunpung, and an open-pollinated variety) and three different cultivation ages (1 yr, 2 yr, and 3 yr), were subjected to FT-IR. The spectral data were analyzed by principal component analysis and partial least squares-discriminant analysis. A dendrogram based on hierarchical clustering analysis of the FT-IR spectral data on ginseng leaves showed that leaf samples were initially segregated into three groups in a cultivation age-dependent manner. Then, within the same cultivation age group, leaf samples were clustered into four subgroups in a cultivar-dependent manner. The overall prediction accuracy for discrimination of cultivars and cultivation ages was 94.8% in a cross-validation test. These results clearly show that FT-IR spectra from ginseng leaves combined with multivariate analysis can be applied as an alternative tool for discriminating ginseng cultivars and cultivation ages. Therefore, we suggest that this result could be used as a rapid and reliable F1 hybrid seed-screening tool for accelerating the conventional breeding of ginseng.

Attenuated total reflection Fourier transform infrared as a primary screening method for cancer in canine serum

  • Macotpet, Arayaporn;Pattarapanwichien, Ekkachai;Chio-Srichan, Sirinart;Daduang, Jureerut;Boonsiri, Patcharee
    • Journal of Veterinary Science
    • /
    • Vol. 21, No. 1
    • /
    • pp.16.1-16.10
    • /
    • 2020
  • Cancer is a major cause of death in dogs worldwide, and the incidence of cancer in dogs is increasing. The attenuated total reflection Fourier transform infrared spectroscopic (ATR-FTIR) technique is a powerful tool for the diagnosis of several diseases. This method enables samples to be examined directly without pre-preparation. In this study, we evaluated the diagnostic value of ATR-FTIR for the detection of cancer in dogs. Cancer-bearing dogs (n = 30) diagnosed by pathologists and clinically healthy dogs (n = 40) were enrolled in this study. Peripheral blood was collected for clinicopathological diagnosis. ATR-FTIR spectra were acquired, and principal component analysis was performed on the full wave number spectra (4,000-650 cm$^{-1}$). The leave-one-out cross validation technique and partial least squares regression analysis were used to predict normal and cancer spectra. Red blood cell counts, hemoglobin levels and white blood cell counts were significantly lower in cancer-bearing dogs than in clinically healthy dogs (p < 0.01, p < 0.01 and p = 0.03, respectively). ATR-FTIR spectra showed significant differences between the clinically healthy and cancer-bearing groups. This finding demonstrates that ATR-FTIR can be applied as a screening technique to distinguish between cancer-bearing dogs and healthy dogs.

Prediction of the Digestibility and Energy Value of Corn Silage by Near Infrared Reflectance Spectroscopy (근적외선분광법을 이용한 옥수수 사일리지의 소화율 및 에너지 평가)

  • Park Hyung-Soo;Lee Jong-Kyung;Lee Hyo-Won;Kim Su-Gon;Ha Jong-Kyu
    • Journal of The Korean Society of Grassland and Forage Science
    • /
    • Vol. 26, No. 1
    • /
    • pp.45-52
    • /
    • 2006
  • This study was carried out to explore the accuracy of Near Infrared Reflectance Spectroscopy (NIRS) for the prediction of the digestibility and energy value of corn silages. The spectral data were regressed against a range of digestibility and energy parameters using modified partial least squares (MPLS) multivariate analysis in conjunction with first- and second-order derivatization, with a scatter correction procedure (SNV-Detrend) to reduce the effect of extraneous noise. Calibration models for NIRS measurements gave multivariate correlation coefficients of determination ($R^2$) and standard errors of cross validation of 0.92 (SECV 1.73), 0.91 (SECV 1.13) and 0.93 (SECV 1.74) for in vitro dry matter digestibility (IVDMD), in vitro true digestibility (IVTD), and cellulase dry matter digestibility (CDMD), respectively. The standard error of prediction (SEP) and the multiple correlation coefficient of validation ($R^2_v$) on the validation set (n=39) were used in comparing the prediction accuracy. The SEP values were 0.30 (TDN), 0.01 (NEL), and 0.01 (ME). The relative ability of NIRS to predict digestibility and energy value was very good for CDMD, total digestible nutrients (TDN), net energy for lactation (NEL) and metabolizable energy (ME). This paper shows the potential of NIRS to predict the digestibility and energy value of corn silage as a routine method in feeding programmes and for giving advice to farmers.

Study of Prediction Model Improvement for Apple Soluble Solids Content Using a Ground-based Hyperspectral Scanner (지상용 초분광 스캐너를 활용한 사과의 당도예측 모델의 성능향상을 위한 연구)

  • Song, Ahram;Jeon, Woohyun;Kim, Yongil
    • Korean Journal of Remote Sensing
    • /
    • Vol. 33, No. 5_1
    • /
    • pp.559-570
    • /
    • 2017
  • A partial least squares regression (PLSR) model was developed to map the internal soluble solids content (SSC) of apples using a ground-based hyperspectral scanner that could simultaneously acquire outdoor data and capture images of large quantities of apples. We evaluated the applicability of various preprocessing techniques to construct an optimal prediction model and calculated the optimal band through a variable importance in projection (VIP) score. From the 515 bands of hyperspectral images extracted at wavelengths of 360-1019 nm, 70 reflectance spectra of apples were extracted, and the SSC (°Brix) was measured using a digital photometer. The optimal prediction model was selected considering the root-mean-square error of cross-validation (RMSECV), the root-mean-square error of prediction (RMSEP) and the coefficient of determination of prediction ($r_p^2$). As a result, multiplicative scatter correction (MSC)-based preprocessing methods were better than the others. For example, when a combination of MSC and standard normal variate (SNV) was used, RMSECV and RMSEP were the lowest at 0.8551 and 0.8561, and $r_c^2$ and $r_p^2$ were the highest at 0.8533 and 0.6546; wavelength ranges of 360-380, 546-690, 760, 915, 931-939, 942, 953, 971, 978, 981, 988, and 992-1019 nm were most influential for SSC determination. The PLSR model built with the spectral values of the corresponding region confirmed that the RMSEP decreased to 0.6841 and $r_p^2$ increased to 0.7795 compared with the values for the entire wavelength band. In this study, we confirmed the feasibility of using a hyperspectral scanner image obtained outdoors for the SSC measurement of apples. These results indicate that the application of field data and sensors could possibly expand in the future.

Mathematical Transformation Influencing Accuracy of Near Infrared Spectroscopy (NIRS) Calibrations for the Prediction of Chemical Composition and Fermentation Parameters in Corn Silage (수 처리 방법이 근적외선분광법을 이용한 옥수수 사일리지의 화학적 조성분 및 발효품질의 예측 정확성에 미치는 영향)

  • Park, Hyung-Soo;Kim, Ji-Hye;Choi, Ki-Choon;Kim, Hyeon-Seop
    • Journal of The Korean Society of Grassland and Forage Science
    • /
    • Vol. 36, No. 1
    • /
    • pp.50-57
    • /
    • 2016
  • This study was conducted to determine the effect of mathematical transformation on near infrared spectroscopy (NIRS) calibrations for the prediction of chemical composition and fermentation parameters in corn silage. Corn silage samples (n=407) were collected from cattle farms and feed companies in Korea between 2014 and 2015. Samples of silage were scanned in intact fresh condition at 1 nm intervals over the wavelength range of 680~2,500 nm, and the optical data were recorded as log 1/Reflectance (log 1/R). The spectral data were regressed against a range of chemical parameters using partial least squares (PLS) multivariate analysis in conjunction with several spectral math treatments to reduce the effect of extraneous noise. The optimum calibrations were selected based on the highest coefficients of determination in cross validation ($R^2_{cv}$) and the lowest standard error of cross validation (SECV). Results of this study revealed that the NIRS method could be used to predict chemical constituents accurately (correlation coefficient of cross validation, $R^2_{cv}$, ranging from 0.77 to 0.91). The best mathematical treatment for moisture and crude protein (CP) was first-order derivatives (1, 16, 16 and 1, 4, 4, respectively), whereas the best mathematical treatment for neutral detergent fiber (NDF) and acid detergent fiber (ADF) was 2, 16, 16. The calibration models for fermentation parameters had lower predictive accuracy than those for chemical constituents. However, pH and lactic acid were predicted with considerable accuracy ($R^2_{cv}$ 0.74 to 0.77). The best mathematical treatment for them was 1, 8, 8 and 2, 16, 16, respectively. Results of this experiment demonstrate that it is possible to use the NIRS method to predict the chemical composition and fermentation quality of fresh corn silages as a routine analysis method for feeding value evaluation and for giving advice to farmers.