• Title/Summary/Keyword: Validation data set

Use of Near-Infrared Spectroscopy for Estimating Fatty Acid Composition in Intact Seeds of Rapeseed

  • Kim, Kwan-Su;Park, Si-Hyung;Choung, Myoung-Gun;Jang, Young-Seok
    • Journal of Crop Science and Biotechnology
    • /
    • v.10 no.1
    • /
    • pp.13-18
    • /
    • 2007
  • Near-infrared spectroscopy (NIRS) was used as a rapid and nondestructive method to determine the fatty acid composition of intact rapeseed (Brassica napus L.) seed samples. A total of 349 samples (about 2 g of intact seeds each) were scanned in the reflectance mode of a scanning monochromator, and reference values for fatty acid composition were measured by gas-liquid chromatography. Calibration equations for individual fatty acids were developed by modified partial least-squares regression with internal cross-validation (n=249). The equations had low SECV (standard error of cross-validation) and high $R^2$ (coefficient of determination in calibration) values (>0.8), except for palmitic and eicosenoic acid. Prediction of an external validation set (n=100) showed significant correlation between reference values and NIRS-estimated values, based on the SEP (standard error of prediction), $r^2$ (coefficient of determination in prediction), and the ratio of the standard deviation (SD) of the reference data to the SEP. The models had relatively high SD/SEP(C) and $r^2$ values (>3.0 and >0.9, respectively) for oleic, linoleic, and erucic acid, indicating that those equations carry good quantitative information. The results indicate that NIRS could be used to rapidly determine the fatty acid composition of rapeseed seeds in breeding programs for high-quality rapeseed oil.

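The external-validation statistics named in the abstract above (SEP, $r^2$, and SD/SEP) can be sketched as follows. This is a minimal illustration, not the authors' software: the function name `validation_stats` is hypothetical, and it assumes the bias-corrected form SEP(C) with an n-1 denominator, which is a common convention in NIRS work.

```python
import math

def validation_stats(reference, predicted):
    """Return (sep_c, r2, sd_over_sep) for an external validation set.

    Assumption: SEP(C) is the standard deviation of the prediction
    errors after removing the mean error (bias).
    """
    n = len(reference)
    errors = [p - r for p, r in zip(predicted, reference)]
    bias = sum(errors) / n
    sep_c = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))

    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    sxx = sum((r - mean_ref) ** 2 for r in reference)
    syy = sum((p - mean_pred) ** 2 for p in predicted)
    sxy = sum((r - mean_ref) * (p - mean_pred)
              for r, p in zip(reference, predicted))

    r2 = sxy ** 2 / (sxx * syy)        # squared Pearson correlation
    sd_ref = math.sqrt(sxx / (n - 1))  # SD of the reference values
    return sep_c, r2, sd_ref / sep_c

# Made-up reference and NIRS-predicted values, for illustration only.
sep_c, r2, ratio = validation_stats([10, 20, 30, 40, 50],
                                    [11, 19, 31, 39, 51])
```

With these made-up numbers the ratio is well above 3.0, the threshold the abstract associates with good quantitative information.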

Simultaneous Analysis of three Marker Components in Hwangryunhaedok-tang by HPLC-DAD (황련해독탕 중 3종 생리활성 물질의 HPLC-DAD 동시 정량분석법 확립)

  • Yang, Hye-Jin;Weon, Jin-Bae;Ma, Jin-Yeul;Ma, Choong-Je
    • YAKHAK HOEJI
    • /
    • v.55 no.1
    • /
    • pp.64-68
    • /
    • 2011
  • In this study, a high-performance liquid chromatography-diode array detector (HPLC-DAD) method was established for the simultaneous determination of three compounds, berberine, palmatine, and geniposide, in Hwangryunhaedok-tang. To develop and validate the method, a $C_{18}$ column (5 ${\mu}m$, 4.6 mm${\times}$250 mm) was used with a gradient mobile phase of water containing 0.1% trifluoroacetic acid (TFA) and MeOH at a column temperature of $30^{\circ}C$. The UV detection wavelengths were set at 230 and 280 nm. The method was validated by linearity, precision, and accuracy tests. The calibration curves of the standard components showed good linearity ($R^2$ > 0.9999). The limits of detection (LOD) and limits of quantification (LOQ) ranged from 0.05 to 0.17 ${\mu}g/ml$ and from 0.15 to 0.53 ${\mu}g/ml$, respectively. The relative standard deviations (RSDs) of the intra-day and inter-day tests were less than 2.99% and 1.90%, respectively. The results of the accuracy test ranged from 98.36 to 102.52%, with RSDs of 0.32 to 1.98%. The validation results indicate that this method is an accurate and sensitive assay.

A Melon Fruit Grading Machine Using a Miniature VIS/NIR Spectrometer: 1. Calibration Models for the Prediction of Soluble Solids Content and Firmness

  • Suh, Sang-Ryong;Lee, Kyeong-Hwan;Yu, Seung-Hwa;Shin, Hwa-Sun;Choi, Young-Soo;Yoo, Soo-Nam
    • Journal of Biosystems Engineering
    • /
    • v.37 no.3
    • /
    • pp.166-176
    • /
    • 2012
  • Purpose: This study investigated the potential of the interactance mode of NIR spectroscopy for estimating the soluble solids content (SSC) and firmness of muskmelons. Methods: Melon samples were taken from local greenhouses in three different harvesting seasons (experiments 1, 2, and 3). The fruit attributes were measured at 6 points on the equator of each sample, where the spectral data were collected. Prediction models were developed using the original spectral data and the spectral data sets preprocessed by 20 methods, and the performance of the models was compared. Results: In the prediction of SSC, the highest cross-validation coefficient of determination ($R_{cv}{^2}$) was 0.755 (standard error of prediction, SEP=$0.89^{\circ}Brix$), obtained with range normalization preprocessing in experiment 1. The highest coefficient of determination in the robustness tests, $R_{rt}{^2}$=0.650 (SEP=$1.03^{\circ}Brix$), was found when the best model of experiment 3 was evaluated with the data set of experiment 2. The best $R_{cv}{^2}$ for the prediction of firmness was 0.715 (SEP=3.63 N), obtained with no preprocessing in experiment 1. The highest $R_{rt}{^2}$ was 0.404 (SEP=5.30 N), found when the best model of experiment 3 was applied to the data set of experiment 1. Conclusions: The test results indicate that the interactance mode of VIS/NIR spectroscopy has great potential for measuring the SSC and firmness of thick-skinned muskmelons.

Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.3
    • /
    • pp.830-860
    • /
    • 2022
  • [Introduction] Many companies are shifting their businesses online because customers increasingly prefer to shop online. [Problem] Users share a vast amount of information about products, which makes it difficult for end-users to reach a decision. [Motivation] Therefore, we need a mechanism that automatically analyzes end-user opinions, thoughts, and feelings about products on social media platforms, which might help customers make or change their purchasing decisions. [Proposed Solution] For this purpose, we propose an automated SentiDeceptive approach, which classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to help users in decision-making. [Methodology] We first collected 11781 end-user comments from the Amazon store and the Flipkart web application covering distinct products, such as watches, mobiles, shoes, clothes, and perfumes. Next, we developed a coding guideline used as a base for the comment annotation process. We then applied the content analysis approach and the existing VADER library to annotate the end-user comments in the data set with the identified codes, which resulted in a labelled data set used as input to the machine learning classifiers. Finally, we applied the sentiment analysis approach to identify end-user opinions and overcome deceptive rating information on social media platforms by preprocessing the input data to remove irrelevant data (stop words, special characters, etc.), employing two standard resampling approaches to balance the data set (oversampling and under-sampling), extracting different features (TF-IDF and BOW) from the textual data, and then training and testing the machine learning algorithms with standard cross-validation approaches (KFold and Shuffle Split). [Results/Outcomes] Furthermore, to support our research study, we developed an automated tool that analyzes each customer's feedback and displays the collective sentiments of customers about a specific product as a graph, which helps customers reach a decision. In a nutshell, our proposed approach produces good results when identifying customer sentiments from online user feedback: it obtained an average of 94.01% precision, 93.69% recall, and 93.81% F-measure for classifying positive sentiments.
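The cross-validation and scoring steps mentioned in the abstract above can be illustrated with a small standard-library sketch. It is not the authors' pipeline: the helper names `k_fold_indices` and `precision_recall_f1` are hypothetical, the folds here are contiguous (no shuffling), and the counts in the example are made up.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall, and F-measure from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Every sample lands in exactly one test fold.
folds = list(k_fold_indices(10, 3))

# Hypothetical counts for the "positive" class in one fold.
p, r, f = precision_recall_f1(tp=90, fp=10, fn=30)
```

In a k-fold run, these three scores would be computed on each test fold and then averaged, which is one way to arrive at average figures like those the abstract reports.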

Prediction of ultimate load capacity of concrete-filled steel tube columns using multivariate adaptive regression splines (MARS)

  • Avci-Karatas, Cigdem
    • Steel and Composite Structures
    • /
    • v.33 no.4
    • /
    • pp.583-594
    • /
    • 2019
  • In areas highly exposed to earthquakes, concrete-filled steel tube columns (CFSTCs) are known to provide superior structural properties such as (i) high strength for good seismic performance, (ii) high ductility, (iii) enhanced energy absorption, (iv) confining pressure on the concrete, and (v) high section modulus. Numerous studies have reported on the behavior of CFSTCs under axial compression loading. This paper presents an analytical model that uses multivariate adaptive regression splines (MARS) to predict the ultimate load capacity of CFSTCs with circular sections under axial load. MARS is a nonlinear, non-parametric regression methodology. After a careful study of the literature, 150 experimental data points from previous studies were examined to prepare a data set, and the dependent variables, such as the geometrical and mechanical properties of the circular CFST system, were identified. In essence, a MARS model establishes a relation between predictors and dependent variables. Separate regression lines are formed through a divide-and-conquer strategy. About 70% of the consolidated data was used to develop the model, and the rest was used to validate it. Care was taken that the input data cover the full range of each variable. The predicted ultimate axial load capacity of CFSTCs was found to match the corresponding experimental observations in the literature.
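The "separate regression lines" idea can be made concrete with hinge basis functions, the standard building block of MARS. The sketch below is only an illustration: the knot, coefficients, and function names are hypothetical and are not taken from the paper's fitted model.

```python
def hinge(x, knot, direction=1):
    """MARS hinge basis: max(0, x - knot) or, with direction=-1, max(0, knot - x)."""
    return max(0.0, direction * (x - knot))

def mars_like_predict(x, intercept, terms):
    """Evaluate a one-variable additive model built from hinge terms.

    terms: iterable of (coefficient, knot, direction) tuples.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)

# Hypothetical model with one knot at x = 5: slope 1 below it, slope 2 above it.
terms = [(2.0, 5.0, +1), (-1.0, 5.0, -1)]
below = mars_like_predict(3.0, 10.0, terms)  # 10 - 1 * (5 - 3) = 8
above = mars_like_predict(7.0, 10.0, terms)  # 10 + 2 * (7 - 5) = 14
```

Each pair of mirrored hinges splits the predictor range at a knot and fits a different line on each side, which is the divide-and-conquer behavior the abstract describes.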

Application of Bayesian network for farmed eel safety inspection in the production stage (양식뱀장어 생산단계 안전성 조사를 위한 베이지안 네트워크 모델의 적용)

  • Seung Yong Cho
    • Food Science and Preservation
    • /
    • v.30 no.3
    • /
    • pp.459-471
    • /
    • 2023
  • A Bayesian network (BN) model was applied to analyze the characteristic variables that affect compliance with safety inspections of farmed eel during the production stage, using data from 30,063 eel aquafarm safety inspections recorded in the Integrated Food Safety Information Network (IFSIN) from 2012 to 2021. The dataset for establishing the BN model included 77 non-conforming cases. Relevant HACCP data, geographic information about the aquafarms, and environmental data were collected and mapped to the IFSIN data to derive explanatory variables for nonconformity. Aquafarm HACCP certification, detection history of harmful substances during the last 5 y, history of nonconformity during the last 5 y, and the suitability of the aquatic environment as determined by the levels of total coliform bacteria and total organic carbon were selected as the explanatory variables. The highest eel aquafarm noncompliance rate achievable by manipulating the derived explanatory variables was 24.5%, which was 94 times higher than the overall farmed eel noncompliance rate reported in IFSIN between 2017 and 2021. The established BN model was validated using the IFSIN eel aquafarm inspection results from January to August 2022. The noncompliance rate in the validation set was 0.22% (15 nonconformances out of 6,785 cases). The precision of the BN model prediction was 0.1579, which was 71.4 times higher than the noncompliance rate of the validation set.
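The validation-set arithmetic quoted in the abstract above can be checked directly. This snippet only reproduces the ratio from the figures given there; the function name `lift_over_base_rate` is hypothetical.

```python
def lift_over_base_rate(precision, positives, total):
    """Return (base_rate, lift): how many times precision exceeds the base rate."""
    base_rate = positives / total
    return base_rate, precision / base_rate

# Figures from the abstract: 15 nonconformances in 6,785 validation cases,
# model precision 0.1579.
base_rate, lift = lift_over_base_rate(0.1579, 15, 6785)
print(f"base rate = {base_rate:.2%}, lift = {lift:.1f}x")
# prints "base rate = 0.22%, lift = 71.4x"
```

When positives are this rare, a precision of about 16% still means that inspections targeted by the model would find nonconformities far more often than untargeted inspections.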

A Study of AI-based Monitoring Techniques for Land-based Debris in Stream (AI기반 하천 부유쓰레기 모니터링 기술 연구)

  • Kyungsu Lee;Haein Yoon;Jonghwa Won;Sang Hwa Jung
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.137-137
    • /
    • 2023
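  • Marine debris not only degrades the aesthetic value of coasts but also causes social and environmental problems such as ecosystem destruction and fishery damage from ghost fishing. More than 70% of it originates on land, and unlike overseas, where plastics and other waste predominate, debris in Korea contains a large amount of vegetation. Given the limitations of existing estimates of marine debris quantities for diverse floating debris, and to make the collection of river and estuary debris more efficient, effective measures are needed to prevent floating debris from entering the ocean. To make the collection of floating debris trapped at stream barrier facilities before it reaches the ocean more efficient, and to build sustainable marine debris data, this study used AI-based techniques: object detection to analyze the composition of floating debris and semantic segmentation to analyze the trapped quantity. To collect realistic data, the types of floating debris were selected on the basis of marine debris classification criteria and statistics, and training data were collected in several stream environments (still-water tank, small stream, steep channel) under experimental conditions including turbidity (algal bloom, sediment), light level, debris shape, vegetation content, weather (small stream), and flow velocity (steep channel). Labeling (bounding box or polygon) was performed according to the training objective, and the models were refined through transfer learning for each technique in the order Phase 1 (still-water tank), Phase 2 (small stream), and Phase 3 (steep channel). For composition analysis, YOLO v4 was used with training and test data sets split 9:1; training and evaluation were compared using the mAP and loss values at each iteration, and the test-set mAP for each debris type rose as the model was refined across phases. For quantity analysis, U-Net was used with training, test, and validation data sets split 8.5:1:0.5; comparing the IoU (intersection over union), F1-score, and loss values per epoch showed the highest performance in Phase 3 in both qualitative and quantitative evaluation. Going forward, the models are expected to become more usable as they are refined by identifying the main influencing factors through factor-by-factor analysis in stream environments and by optimizing hyperparameters.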

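The segmentation metrics named in the abstract above, IoU and F1-score, can be computed from a predicted and a ground-truth binary mask as sketched below. This is a plain-Python illustration with made-up masks, not the study's evaluation code; the function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
def iou_and_f1(pred_mask, true_mask):
    """Return (IoU, F1) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1   # pixel labeled debris in both masks
            elif p and not t:
                fp += 1   # predicted debris, actually background
            elif t and not p:
                fn += 1   # missed debris pixel
    if tp + fp + fn == 0:
        return 1.0, 1.0   # assumption: two empty masks count as a perfect match
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1

# Made-up 2x3 masks: 2 overlapping pixels, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
iou, f1 = iou_and_f1(pred, true)  # IoU = 2/4, F1 = 4/6
```

F1 is never lower than IoU for the same masks, so the two values are best tracked together per epoch rather than compared against each other.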

Measurement of Quality Parameters of Honey by Reflectance Spectra

  • Park, Chang-Hyun;Yang, Won-Jun;Sohn, Jae-Hyung;Kim, Jong-Hoon
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1530-1530
    • /
    • 2001
  • The objective of this study was to develop models to predict quality parameters of Korean bee honeys by visible and NIR spectroscopy. Two kinds of bee honey, from acacia and polyflower sources, were tested. The honeys were harvested in the spring of 2000 and kept in a storage facility at 20$^{\circ}C$ during the experiments. A total of 394 honey samples were analyzed. Reflectance spectra, moisture content, ash, invert sugar, sucrose, F/G (fructose/glucose) ratio, HMF (hydroxymethyl furfural), and C12/C13 ratio were measured. The average values for the tested honeys were 19.9% moisture content, 0.12% ash, 68.4% invert sugar, 5.7% sucrose, an F/G ratio of 1.27, 14.4 mg/kg HMF, and a C12/C13 ratio of -19.1. A spectrophotometer equipped with a single-beam scanning monochromator (NIR Systems, Model 6500, USA) and a horizontal setup module was used to collect reflectance data from the honey. The reflectance spectra were measured over the wavelength range of 400∼2,498 nm at 2 nm intervals. Thirty-two repetitive scans were averaged, transformed to log(1/Reflectance), and stored in a microcomputer file, forming one spectrum per measurement. A sample cell and reflectance plate were made to hold the honey samples consistently. The spectra of the honey samples were divided into a calibration set and a validation set. The calibration set was used during model development, and the validation set was used to predict quality parameters from unknown spectra. PLS (partial least squares) models were developed to predict the quality parameters of the honeys. The first and second derivatives of the raw spectra were also used to develop the models, with an appropriate smoothing gap. MSC (multiplicative scatter correction) and SNV & Dtr. (standard normal variate and detrending) preprocessing were applied to all spectra to minimize sample-to-sample differences in light scatter. The PLS models showed good relationships between predicted and measured quality parameters in the wavelength range of 1100∼2200 nm. However, the PLS analysis was not good enough to predict the HMF of the honeys.

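Two of the spectral transformations mentioned in the abstract above, log(1/Reflectance) and standard normal variate (SNV), can be sketched in a few lines. This is a minimal illustration and not the instrument software: the function names are hypothetical, a base-10 logarithm is assumed, and the detrending step that usually accompanies SNV is omitted.

```python
import math

def to_absorbance(reflectance):
    """Convert reflectance values (0 < R <= 1) to log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]

def snv(spectrum):
    """Standard normal variate: center and scale a single spectrum.

    Each spectrum is normalized by its own mean and standard deviation,
    which reduces sample-to-sample scatter differences.
    """
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]

# Made-up three-point "spectrum" for illustration.
absorbance = to_absorbance([1.0, 0.1, 0.01])  # approximately [0, 1, 2]
scaled = snv(absorbance)                      # approximately [-1, 0, 1]
```

Because SNV works on one spectrum at a time, it can be applied to calibration and validation spectra independently without leaking information between the two sets.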

Automatic Detection of Type II Solar Radio Burst by Using 1-D Convolution Neutral Network

  • Kyung-Suk Cho;Junyoung Kim;Rok-Soon Kim;Eunsu Park;Yuki Kubo;Kazumasa Iwai
    • Journal of The Korean Astronomical Society
    • /
    • v.56 no.2
    • /
    • pp.213-224
    • /
    • 2023
  • Type II solar radio bursts show frequency drifts from high to low over time. They are known as a signature of coronal shocks associated with coronal mass ejections (CMEs) and/or flares, which cause abrupt changes in the space environment near the Earth (space weather). Early detection of type II bursts is therefore important for space weather forecasting. In this study, we develop a deep-learning (DL) model for the automatic detection of type II bursts. For this purpose, we adopted a 1-D convolutional neural network (CNN), as it is well suited to processing the spatiotemporal information in the applied data set. To recognize type II bursts, we used a total of 286 radio burst spectrum images obtained by the Hiraiso Radio Spectrograph (HiRAS) from 1991 to 2012, along with 231 spectrum images without bursts from 2009 to 2015. The burst types were labeled manually in an answer table according to their spectral features. We then applied the 1-D CNN technique to the spectrum images using two filter windows of different sizes along the time axis. To develop the DL model, we randomly selected 412 spectrum images (80%) for training and validation. The training history shows that both the training and validation losses dropped rapidly, while the training and validation accuracies increased, within approximately 100 epochs. To evaluate the model's performance, we used 105 test images (20%) and a contingency table. The false alarm ratio (FAR) and critical success index (CSI) were 0.14 and 0.83, respectively. Furthermore, we confirmed this result with five-fold cross-validation, in which we randomly re-sampled five groups. The estimated mean FAR and CSI of the five groups were 0.05 and 0.87, respectively. For experimental purposes, we applied the proposed model to 85 HiRAS type II radio bursts listed in the NGDC catalogue from 2009 to 2016 and to 184 quiet (no burst) spectrum images taken before and after the type II bursts. Our model successfully detected 79 (93%) of the type II events. This result demonstrates, for the first time, that the 1-D CNN algorithm is useful for detecting type II bursts.
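The contingency-table scores used in the abstract above can be computed as sketched below. The definitions are the usual forecast-verification ones (FAR = false alarms / (hits + false alarms), CSI = hits / (hits + misses + false alarms)); the function name is hypothetical and the counts in the first example are made up, since the abstract does not give the individual table cells.

```python
def contingency_scores(hits, misses, false_alarms):
    """Return (FAR, CSI, POD) from a 2x2 contingency table."""
    far = false_alarms / (hits + false_alarms)   # false alarm ratio
    csi = hits / (hits + misses + false_alarms)  # critical success index
    pod = hits / (hits + misses)                 # probability of detection
    return far, csi, pod

# Hypothetical counts for illustration only.
far, csi, pod = contingency_scores(hits=50, misses=5, false_alarms=5)

# The detection rate quoted in the abstract: 79 of 85 catalogued events.
detection_rate = 79 / 85  # about 0.93
```

Correct rejections (quiet images correctly classified as quiet) do not enter either FAR or CSI, so neither score is inflated by a large number of easy negative cases.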

Predicting nutrient excretion from dairy cows on smallholder farms in Indonesia using readily available farm data

  • Al Zahra, Windi;van Middelaar, Corina E.;de Boer, Imke J.M;Oosting, Simon J.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • v.33 no.12
    • /
    • pp.2039-2049
    • /
    • 2020
  • Objective: This study was conducted to provide models that accurately predict nitrogen (N) and phosphorus (P) excretion of dairy cows on smallholder farms in Indonesia based on readily available farm data. Methods: The generic model in this study is based on the principles of the Lucas equation, which describes the relation between dry matter intake (DMI) and faecal N excretion, to predict the quantity of faecal N (QFN). Excretion of urinary N and faecal P was calculated based on National Research Council recommendations for dairy cows. A farm survey was conducted to collect input parameters for the models. The data set was used to calibrate the model to predict QFN for the specific case. The model was validated by comparing the predicted quantity of faecal N with the actual quantity of faecal N (QFNACT) based on measurements, and the calibrated model was compared to the Lucas equation. The models were used to predict N and P excretion of all 144 dairy cows in the data set. Results: Our estimate of true N digestibility equalled the standard value of 92% in the original Lucas equation, whereas our estimate of metabolic faecal N was -0.60 g/100 g DMI, with the standard value being -0.61 g/100 g DMI. Results of the model validation showed that the $R^2$ was 0.63, the MAE was 15 g/animal/d (17% of QFNACT), and the RMSE was 20 g/animal/d (22% of QFNACT). We predicted that the total N excretion of dairy cows in Indonesia was on average 197 g/animal/d, whereas P excretion was on average 56 g/animal/d. Conclusion: The proposed models can be used with reasonable accuracy to predict N and P excretion of dairy cattle on smallholder farms in Indonesia, which can contribute to improving manure management and reducing environmental issues related to nutrient losses.
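The validation errors reported in the abstract above (MAE and RMSE, each also expressed as a percentage of the measured quantity) can be sketched as follows. The function name and the sample values are hypothetical, and expressing the errors relative to the mean observed value is an assumption about how the percentages were derived.

```python
import math

def mae_rmse(observed, predicted):
    """Return (mae, rmse, mae_pct, rmse_pct).

    The percentages are taken relative to the mean observed value,
    which is an assumption about the reporting convention.
    """
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_obs = sum(observed) / n
    return mae, rmse, 100 * mae / mean_obs, 100 * rmse / mean_obs

# Made-up faecal N values in g/animal/d, for illustration only.
mae, rmse, mae_pct, rmse_pct = mae_rmse([80, 100, 90, 110],
                                        [90, 90, 100, 90])
```

RMSE is always at least as large as MAE, and a wide gap between them points to a few large individual errors rather than a uniform offset.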