• Title/Summary/Keyword: cross-validation

Nondestructive Prediction of Fatty Acid Composition in Sesame Seeds by Near Infrared Reflectance Spectroscopy

  • Kim, Kwan-Su;Park, Si-Hyung;Choung, Myoung-Gun;Kim, Sun-Lim
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.51 no.spc1
    • /
    • pp.304-309
    • /
    • 2006
  • Near infrared reflectance spectroscopy (NIRS) was used to develop a rapid and nondestructive method for the determination of fatty acid composition in sesame (Sesamum indicum L.) seed oil. A total of ninety-three samples of intact seeds were scanned in the reflectance mode of a scanning monochromator, and reference values for fatty acid composition were measured by gas-liquid chromatography. Calibration equations were developed using modified partial least squares regression with internal cross-validation (n=63). The equations obtained had low standard errors of cross-validation and moderate $R^2$ (coefficient of determination in calibration). Prediction of an external validation set (n=30) showed significant correlation between reference values and NIRS-estimated values based on the SEP (standard error of prediction), $r^2$ (coefficient of determination in prediction), and the ratio of the standard deviation (SD) of reference data to SEP. The models developed in this study had relatively high SD/SEP(C) values (more than 2.0) for oleic and linoleic acid, indicating good correlation between reference and NIRS-estimated values. The results indicated that NIRS, a nondestructive screening method, could be used to rapidly determine fatty acid composition in sesame seeds in breeding programs for high-quality sesame oil.
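
The SD/SEP(C) ratio mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common definition of SEP(C) as the bias-corrected standard deviation of the prediction residuals; the abstract does not give the formula, so this may differ from what the authors computed. The function names and the sample values are hypothetical.

```python
import statistics

def sep_c(reference, predicted):
    """Bias-corrected standard error of prediction, SEP(C).

    Assumed definition: sample standard deviation (n - 1 denominator)
    of the residuals, which removes the mean bias automatically.
    """
    residuals = [p - r for r, p in zip(reference, predicted)]
    return statistics.stdev(residuals)

def sd_to_sep_ratio(reference, predicted):
    """Ratio of the SD of the reference data to SEP(C)."""
    return statistics.stdev(reference) / sep_c(reference, predicted)

# Hypothetical reference (e.g. gas chromatography) and NIRS-estimated values.
reference = [40.0, 42.0, 44.0, 46.0, 48.0]
predicted = [41.0, 41.5, 44.5, 45.0, 48.5]

ratio = sd_to_sep_ratio(reference, predicted)
print(f"SEP(C) = {sep_c(reference, predicted):.3f}, SD/SEP(C) = {ratio:.2f}")
```

A ratio above 2.0, the threshold the abstract cites, means the spread of the reference data is more than twice the typical prediction error.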

An Availability of Low Cost Sensors for Machine Fault Diagnosis

  • SON, JONG-DUK
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2012.10a
    • /
    • pp.394-399
    • /
    • 2012
  • In recent years, MEMS sensors have attracted considerable interest in machine condition monitoring because of their advantages in power consumption, size, cost, mobility, and flexibility. They can be integrated with smart sensors and, because they are batch-produced, they are inexpensive. Their suitability for condition monitoring is investigated here through an experimental study. This paper presents a comparative study and performance test of MEMS sensors for machine fault classification using three intelligent classifiers. The accuracy and reliability of the MEMS sensor signals are validated, and the performance of the classifiers is compared. A MEMS accelerometer and MEMS current sensors are used in the experiments. In addition, simple feature extraction and cross-validation methods were applied to confirm the usability of the MEMS sensors. The results indicate that the sensors are suitable for fault classification.

Feasibility study of deep learning based radiosensitivity prediction model of National Cancer Institute-60 cell lines using gene expression

  • Kim, Euidam;Chung, Yoonsun
    • Nuclear Engineering and Technology
    • /
    • v.54 no.4
    • /
    • pp.1439-1448
    • /
    • 2022
  • Background: We investigated the feasibility of in vitro radiosensitivity prediction with gene expression using deep learning. Methods: Microarray gene expression data for the National Cancer Institute-60 (NCI-60) panel were acquired from the Gene Expression Omnibus. The clonogenic surviving fractions at an absorbed dose of 2 Gy (SF2) from previous publications were used to measure in vitro radiosensitivity. The radiosensitivity prediction model was based on a convolutional neural network. Six-fold cross-validation (CV) was applied to train and validate the model. Then, leave-one-out cross-validation (LOOCV) was applied, using the samples with large errors as the validation set, to determine whether the error arose from the high bias of the folded CV. A prediction was defined as correct if its absolute error was <0.01 or its relative error was <10%. Results: Of the 174 triplicated samples of NCI-60, 171 samples were correctly predicted with the folded CV. Through an additional LOOCV, one more sample was correctly predicted, representing a prediction accuracy of 98.85% (172 out of 174 samples). The average relative and absolute errors of the 172 correctly predicted samples were 1.351±1.875% and 0.00596±0.00638, respectively. Conclusion: We demonstrated the feasibility of deep learning-based in vitro radiosensitivity prediction using gene expression.
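
The correctness criterion and the reported accuracy in the abstract above can be checked with a short sketch. It assumes the relative error is taken with respect to the reference SF2 value, which the abstract does not state explicitly. The function names and the example SF2 values are hypothetical.

```python
def is_correct(true_sf2, predicted_sf2, abs_tol=0.01, rel_tol=0.10):
    """Return True if the absolute error is < abs_tol OR the relative
    error is < rel_tol (thresholds taken from the abstract).

    Assumption: relative error = |prediction - reference| / |reference|,
    with a nonzero reference value.
    """
    abs_err = abs(predicted_sf2 - true_sf2)
    rel_err = abs_err / abs(true_sf2)
    return abs_err < abs_tol or rel_err < rel_tol

def accuracy_percent(n_correct, n_total):
    """Share of correctly predicted samples, in percent."""
    return 100.0 * n_correct / n_total

# Hypothetical SF2 pairs (reference, prediction).
print(is_correct(0.50, 0.54))   # relative error 8%: passes on the relative test
print(is_correct(0.05, 0.058))  # absolute error 0.008: passes on the absolute test
print(is_correct(0.50, 0.60))   # fails both tests

# Counts reported in the abstract: 172 of 174 samples.
print(round(accuracy_percent(172, 174), 2))
```

Because the two conditions are joined by "or", low-SF2 samples can pass on the absolute threshold even when their relative error is large.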

Exploring Machine Learning Classifiers for Breast Cancer Classification

  • Inayatul Haq;Tehseen Mazhar;Hinna Hafeez;Najib Ullah;Fatma Mallek;Habib Hamam
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.4
    • /
    • pp.860-880
    • /
    • 2024
  • Breast cancer is a major health concern affecting women and men globally. Early detection and accurate classification of breast cancer are vital for effective treatment and patient survival. This study addresses the challenge of accurately classifying breast tumors using machine learning classifiers such as MLP, AdaBoostM1, LogitBoost, BayesNet, and the J48 decision tree. The research uses a dataset publicly available on GitHub to assess the classifiers' performance in differentiating between the occurrence and non-occurrence of breast cancer. The study compares the effectiveness of 10-fold and 5-fold cross-validation, showing that 10-fold cross-validation provides superior results. It also examines the impact of varying split percentages, with a 66% split yielding the best performance. This shows the importance of selecting appropriate validation techniques for machine learning-based breast tumor classification. The results also indicate that the J48 decision tree is the most accurate classifier, providing valuable insights for developing predictive models for cancer diagnosis and advancing computational medical research.
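
The difference between 5-fold and 10-fold cross-validation discussed in the abstract above comes down to how the data are partitioned. The following is a minimal, library-free sketch of a k-fold splitter. It is not the tooling used in the study, and the function name, seed, and sample count are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once, dealt into k folds, and each fold serves
    as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# With 100 hypothetical samples, 5-fold trains on 80 per round and
# 10-fold trains on 90 per round.
for k in (5, 10):
    train, test = next(k_fold_indices(100, k))
    print(f"{k}-fold: train={len(train)}, test={len(test)}")
```

A larger k gives each model more training data per round at the cost of more model fits, which is one plausible reason the two settings can score differently.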

Doubly penalized kernel method for heteroscedastic autoregressive data

  • Cho, Dae-Hyeon;Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.21 no.1
    • /
    • pp.155-162
    • /
    • 2010
  • In this paper we propose a doubly penalized kernel method which estimates both the mean function and the variance function simultaneously by kernel machines for heteroscedastic autoregressive data. We also present a model selection method which employs cross-validation techniques for choosing the hyper-parameters that affect the performance of the proposed method. Simulated examples are provided to indicate the usefulness of the proposed method for the estimation of mean and variance functions.

Estimation and variable selection in censored regression model with smoothly clipped absolute deviation penalty

  • Shim, Jooyong;Bae, Jongsig;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.6
    • /
    • pp.1653-1660
    • /
    • 2016
  • The smoothly clipped absolute deviation (SCAD) penalty is known to satisfy desirable properties for penalty functions, such as unbiasedness, sparsity, and continuity. In this paper, we deal with regression function estimation and variable selection based on the SCAD-penalized censored regression model. We use the local linear approximation and the iteratively reweighted least squares algorithm to optimize the SCAD-penalized log-likelihood function. The proposed method provides an efficient approach to variable selection and regression function estimation. A generalized cross-validation function is presented for model selection. Applications of the proposed method are illustrated through simulated and real examples.
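
For readers unfamiliar with the SCAD penalty named in the abstract above, the sketch below evaluates its standard piecewise form (linear near zero, quadratic in the middle, constant beyond a·λ). The default a = 3.7 is the value conventionally recommended in the SCAD literature and is an assumption here, since the abstract does not state the value used. The function name is hypothetical.

```python
def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty for a single coefficient (requires a > 2).

    theta = |beta|:
      theta <= lam        -> lam * theta                       (lasso-like)
      lam < theta <= a*lam -> (2*a*lam*theta - theta^2 - lam^2) / (2*(a-1))
      theta > a*lam       -> lam^2 * (a + 1) / 2               (constant)
    """
    theta = abs(beta)
    if theta <= lam:
        return lam * theta
    if theta <= a * lam:
        return (2 * a * lam * theta - theta ** 2 - lam ** 2) / (2 * (a - 1))
    return lam ** 2 * (a + 1) / 2

# The penalty grows like the lasso for small coefficients and then levels
# off, so large coefficients are not shrunk further.
for b in (0.5, 2.0, 10.0):
    print(b, scad_penalty(b, lam=1.0))
```

The flat region for large coefficients is what gives the near-unbiasedness property the abstract refers to, while the lasso-like region near zero gives sparsity.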

Censored varying coefficient regression model using Buckley-James method

  • Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.5
    • /
    • pp.1167-1177
    • /
    • 2017
  • Censored regression using the pseudo-response variable proposed by Buckley and James has been one of the most well-known models. Recently, the varying coefficient regression model has received a great deal of attention as an important modeling tool. In this paper we propose a censored varying coefficient regression model using the Buckley-James method to handle situations where the regression coefficients of the model are not constant but change as the smoothing variables change. By using the formulation of the least squares support vector machine (LS-SVM), the coefficient estimators of the proposed model can be easily obtained from simple linear equations. Furthermore, a generalized cross-validation function can be easily derived. We evaluate the proposed method and demonstrate its adequacy through simulated data sets and real data sets.
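
As background for the generalized cross-validation (GCV) function mentioned in the abstract above, the sketch below computes the generic GCV score for a linear smoother, where the fitted values are a hat matrix H applied to y. This is the textbook form, not the censored-data version derived in the paper, which the abstract does not spell out. The function name and the toy data are hypothetical.

```python
def gcv_score(y, fitted, trace_hat):
    """Generic GCV score for a linear smoother:

        GCV = (RSS / n) / (1 - tr(H) / n)^2

    where RSS is the residual sum of squares and tr(H) is the trace of
    the hat matrix (the effective number of parameters).
    """
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return (rss / n) / (1.0 - trace_hat / n) ** 2

# Toy example: fit the mean of each pair of points. The hat matrix is
# block-diagonal with 0.5 entries, so its trace is 2.
y = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]
print(gcv_score(y, fitted, trace_hat=2.0))
```

In model selection, this score would be evaluated over a grid of hyperparameter values and the minimizer chosen, avoiding the repeated refits that ordinary cross-validation requires.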

Spatial-Temporal Modelling of Road Traffic Data in Seoul City

  • Lee, Sang-Yeol;Ahn, Soo-Han;Park, Chang-Yi;Jeon, Jong-Woo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.13 no.2
    • /
    • pp.261-270
    • /
    • 2002
  • Recently, the demand for Intelligent Transportation Systems (ITS) has increased to a large extent, and real-time traffic information services based on the internet have become very important. When ITS companies carry out real-time traffic services, they find some traffic data missing and use the conventional method of reconstructing missing values by calculating the average time trend. However, this method has been found unsatisfactory, so we develop a new method based on spatial and spatial-temporal models. A cross-validation technique shows that the spatial-temporal model outperforms the others.

Semiparametric Kernel Poisson Regression for Longitudinal Count Data

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Communications for Statistical Applications and Methods
    • /
    • v.15 no.6
    • /
    • pp.1003-1011
    • /
    • 2008
  • Mixed-effect Poisson regression models are widely used for the analysis of correlated count data such as those found in longitudinal studies. In this paper, we consider kernel extensions with semiparametric fixed effects and parametric random effects. Estimation is carried out through the penalized likelihood method based on the kernel trick, and our focus is on efficient computation and effective hyperparameter selection. For the selection of hyperparameters, cross-validation techniques are employed. Examples illustrating the usage and features of the proposed method are provided.

Logistic Regression Method in Interval-Censored Data

  • Yun, Eun-Young;Kim, Jin-Mi;Ki, Choong-Rak
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.5
    • /
    • pp.871-881
    • /
    • 2011
  • In this paper we propose a logistic regression method to estimate the survival function and the median survival time in interval-censored data. The proposed method is motivated by the data augmentation technique with no sacrifice in augmenting data. In addition, we develop a cross validation criterion to determine the size of data augmentation. We compare the proposed estimator with other existing methods such as the parametric method, the single point imputation method, and the nonparametric maximum likelihood estimator through extensive numerical studies to show that the proposed estimator performs better than others in the sense of the mean squared error. An illustrative example based on a real data set is given.