• Title/Summary/Keyword: cross-validation

Improvement of Neural Network Performance for Estimating Defect Size of Steam Generator Tube using Multifold Cross-Validation (다중겹 교차검증 기법을 이용한 증기세관 결함크기 예측을 위한 신경회로망 성능 향상)

  • Kim, Nam-Jin;Jee, Su-Jung;Jo, Nam-Hoon
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.26 no.9
    • /
    • pp.73-79
    • /
    • 2012
  • In this paper, we study how to determine the number of hidden-layer neurons in a neural network for predicting the defect size of steam generator tubes. It has been reported in the literature that the number of hidden-layer neurons can be determined efficiently with the help of cross-validation. Although cross-validation provides decent estimation performance in most cases, the performance depends on the selection of the validation set and can be rather poor in some cases. To avoid this problem, we propose using multifold cross-validation. A simulation study shows that the estimation performance for defect width and defect depth attains 94% and 99.4%, respectively, of the best performance achievable among the considered neuron numbers.
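
The selection idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: a one-dimensional k-nearest-neighbour regressor stands in for the neural network, its neighbour count stands in for the number of hidden-layer neurons, and the data are synthetic. All function names (`kfold_indices`, `multifold_cv_score`, `select_by_cv`, `make_knn_fit`, `knn_predict`) are hypothetical.

```python
import random
from statistics import mean

def kfold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def multifold_cv_score(xs, ys, k, fit, predict):
    """Average validation MSE over k folds; each fold is the validation set once."""
    folds = kfold_indices(len(xs), k)
    scores = []
    for i, val in enumerate(folds):
        train = [j for f_i, f in enumerate(folds) if f_i != i for j in f]
        model = fit([xs[j] for j in train], [ys[j] for j in train])
        scores.append(mean((predict(model, xs[j]) - ys[j]) ** 2 for j in val))
    return mean(scores)

def select_by_cv(xs, ys, candidates, k, make_fit, predict):
    """Return the candidate with the lowest k-fold score, plus all scores."""
    scores = {c: multifold_cv_score(xs, ys, k, make_fit(c), predict)
              for c in candidates}
    return min(scores, key=scores.get), scores

# Stand-in model (assumption): k-NN regression instead of a neural network.
def make_knn_fit(n_neighbors):
    def fit(train_x, train_y):
        return (n_neighbors, list(zip(train_x, train_y)))
    return fit

def knn_predict(model, x):
    n_neighbors, data = model
    nearest = sorted(data, key=lambda p: abs(p[0] - x))[:n_neighbors]
    return mean(y for _, y in nearest)

# Synthetic data (assumption): noisy quadratic.
rng = random.Random(1)
xs = [i / 50 for i in range(100)]
ys = [x * x + rng.gauss(0, 0.05) for x in xs]

best, scores = select_by_cv(xs, ys, [1, 3, 5, 15], k=5,
                            make_fit=make_knn_fit, predict=knn_predict)
print("selected complexity:", best)
```

Because every sample is used for validation exactly once, the averaged score depends less on one particular choice of validation set than a single hold-out split does, which is the weakness the abstract describes.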

Region of Interest (ROI) Selection of Land Cover Using SVM Cross Validation (SVM 교차검증을 활용한 토지피복 ROI 선정)

  • Jeong, Jong-Chul;Youn, Hyoung-Jin
    • Journal of Cadastre & Land InformatiX
    • /
    • v.50 no.1
    • /
    • pp.75-85
    • /
    • 2020
  • This study examines the use of machine-learning cross-validation to create ROIs for land cover classification. The study area is located in Sejong, and one KOMPSAT-3A image dated October 28, 2019 was used in the analysis. We used four bands (red, green, blue, near-infrared) for the cross-validation learning process. We used the K-fold method for cross-validation and used the SVM kernel type together with the cross-validation result. In addition, we used four SVM kernels (linear, polynomial, RBF, sigmoid) to produce a supervised land cover classification map from the extracted ROI. During the cross-validation process, 1,813 samples were extracted from 3,500, and about 60% of the building, road, and grass class data were removed. Based on this, the supervised linear SVM showed the highest classification accuracy, 91.77%, among the kernel methods. The producer's accuracy for grass was 79.43%, and a large misclassification was identified in forests. According to the results, ROI extraction using cross-validation may be effective in forest, water, and agricultural areas, but the distinction of built-up, grass, and bare-soil areas needs improvement.

Cross-cultural Validation of Instruments Measuring Health Beliefs about Colorectal Cancer Screening among Korean Americans

  • Lee, Shin-Young;Lee, Eunice E.
    • Journal of Korean Academy of Nursing
    • /
    • v.45 no.1
    • /
    • pp.129-138
    • /
    • 2015
  • Purpose: The purpose of this study was to report the instrument modification and validation processes to make existing health belief model scales culturally appropriate for Korean Americans (KAs) regarding colorectal cancer (CRC) screening utilization. Methods: Instrument translation, individual interviews using cognitive interviewing, and expert reviews were conducted during the instrument modification phase, and a pilot test and a cross-sectional survey were conducted during the instrument validation phase. Data analyses of the cross-sectional survey included internal consistency and construct validity using exploratory and confirmatory factor analysis. Results: The main issues identified during the instrument modification phase were (a) cultural and linguistic translation issues and (b) newly developed items reflecting Korean cultural barriers. Cross-sectional survey analyses during the instrument validation phase revealed that all scales demonstrate good internal consistency reliability (Cronbach's alpha=.72~.88). Exploratory factor analysis showed that susceptibility and severity loaded on the same factor, which may indicate a threat variable. Items with low factor loadings in the confirmatory factor analysis may relate to (a) lack of knowledge about fecal occult blood testing and (b) multiple dimensions of the subscales. Conclusion: Methodological, sequential processes of instrument modification and validation, including translation, individual interviews, expert reviews, pilot testing and a cross-sectional survey, were provided in this study. The findings indicate that existing instruments need to be examined for CRC screening research involving KAs.

Penalized Likelihood Regression: Fast Computation and Direct Cross-Validation

  • Kim, Young-Ju;Gu, Chong
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2005.05a
    • /
    • pp.215-219
    • /
    • 2005
  • We consider penalized likelihood regression with exponential family responses. Parallel to recent developments in Gaussian regression, fast computation through asymptotically efficient low-dimensional approximations is explored, yielding an algorithm that scales much better than the $O(n^3)$ algorithm for the exact solution. Customizations of the direct cross-validation strategy for smoothing parameter selection in various distribution families are also explored and evaluated.

Diagnostic In Spline Regression Model With Heteroscedasticity

  • Lee, In-Suk;Jung, Won-Tae;Jeong, Hye-Jeong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.6 no.1
    • /
    • pp.63-71
    • /
    • 1995
  • We study local influence for smoothing parameter estimates in a spline regression model with heteroscedasticity. In practice, generalized cross-validation does not work well in the presence of heteroscedasticity. We therefore propose a local influence measure for generalized cross-validation estimates when errors are heteroscedastic, and we examine the effects of the diagnostics based on these measures using hyperinflation data.

Robust Cross Validation Score

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods
    • /
    • v.12 no.2
    • /
    • pp.413-423
    • /
    • 2005
  • Consider the problem of estimating the underlying regression function from a set of noisy data contaminated by a long-tailed error distribution. Several robust smoothing techniques exist, and these have turned out to be very useful for reducing the influence of outlying observations. However, whichever robust smoother we use, we must still choose the smoothing parameter, and relatively little attention has been paid to robust bandwidth selection methods. In this paper, we adopt the idea of robust location parameter estimation and propose robust cross-validation score functions.

GLOBAL GENERALIZED CROSS VALIDATION IN THE PRECONDITIONED GL-LSQR

  • Chung, Seiyoung;Oh, SeYoung;Kwon, SunJoo
    • Journal of the Chungcheong Mathematical Society
    • /
    • v.32 no.1
    • /
    • pp.149-156
    • /
    • 2019
  • This paper presents global generalized cross-validation as an appropriate choice of the regularization parameter in the preconditioned Gl-LSQR method for solving image deblurring problems. The regularization parameter chosen by global generalized cross-validation, used with the preconditioned Gl-LSQR method, can give better reconstructions of the true image than the other parameters considered in this study.

Cross-Validation method for Science and Technology Research Paper considering Interdisciplinary Approach (다학제적 접근을 고려한 과학기술논문 상호검증 방법)

  • Han, Young-shin
    • Journal of Engineering Education Research
    • /
    • v.18 no.5
    • /
    • pp.3-10
    • /
    • 2015
  • Researchers in science and technology have broadened the scope of their research in order to solve complex problems, and academic exchange has also been actively carried out. If a paper, which is a means of interdisciplinary exchange, is restricted to field-specific terms and formulas, it can act as a barrier to access for many researchers in other fields. This paper proposes a cross-validation method for eliminating such documentary barriers, based on discrete event system formalism. We expect that the proposed method will improve cross-validation that takes researchers in other fields into account.

Estimating Prediction Errors in Binary Classification Problem: Cross-Validation versus Bootstrap

  • Kim Ji-Hyun;Cha Eun-Song
    • Communications for Statistical Applications and Methods
    • /
    • v.13 no.1
    • /
    • pp.151-165
    • /
    • 2006
  • It is important to estimate the true misclassification rate of a given classifier when an independent set of test data is not available. Cross-validation and the bootstrap are two possible approaches in this case. In the related literature, bootstrap estimators of the true misclassification rate have been asserted to perform better for small samples than cross-validation estimators. We compare the two estimators empirically when the classification rule is so adaptive to the training data that its apparent misclassification rate is close to zero. We confirm that bootstrap estimators perform better for small samples because of their small variance, and we also find that their bias tends to be significant even for moderate to large samples, in which case cross-validation estimators perform better with less computation.
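
The two kinds of estimator compared in this abstract can be sketched as follows. The abstract does not say which bootstrap variant or classifier the paper uses, so this sketch makes its own assumptions: the out-of-bag (leave-one-out bootstrap) error serves as the bootstrap estimator, a one-dimensional 1-nearest-neighbour rule serves as a classifier whose apparent error is zero, and the data are synthetic. All function names are hypothetical.

```python
import random

def nn1_predict(train, x):
    """1-nearest-neighbour label for x; train is a list of (feature, label)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def error_rate(train, test):
    """Fraction of test points misclassified by 1-NN fitted on train."""
    return sum(nn1_predict(train, x) != y for x, y in test) / len(test)

def loo_cv_error(data):
    """Leave-one-out cross-validation estimate of the misclassification rate."""
    return sum(
        nn1_predict(data[:i] + data[i + 1:], x) != y
        for i, (x, y) in enumerate(data)
    ) / len(data)

def oob_bootstrap_error(data, n_boot=200, seed=0):
    """Average error on the points left out of each bootstrap resample."""
    rng = random.Random(seed)
    n = len(data)
    errs = []
    for _ in range(n_boot):
        picked = [rng.randrange(n) for _ in range(n)]
        chosen = set(picked)
        oob = [data[i] for i in range(n) if i not in chosen]
        if oob:  # skip the rare resample that contains every point
            errs.append(error_rate([data[i] for i in picked], oob))
    return sum(errs) / len(errs)

# Synthetic two-class sample (assumption): overlapping Gaussians, n = 30.
rng = random.Random(42)
data = ([(rng.gauss(0.0, 1.0), 0) for _ in range(15)]
        + [(rng.gauss(1.5, 1.0), 1) for _ in range(15)])

apparent = error_rate(data, data)   # each point is its own nearest neighbour
cv_est = loo_cv_error(data)
boot_est = oob_bootstrap_error(data)
print(apparent, cv_est, boot_est)
```

The apparent error is zero here because each training point is its own nearest neighbour, which is the highly adaptive regime the abstract studies. The bootstrap estimate requires `n_boot` refits, whereas leave-one-out needs one pass over the sample.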

Multiclass LS-SVM ensemble for large data

  • Hwang, Hyungtae
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.6
    • /
    • pp.1557-1563
    • /
    • 2015
  • Multiclass classification is typically performed using a voting scheme that combines binary classifications. In this paper, we propose a multiclass classification method for large data, which can be regarded as a revised one-vs-all method. The multiclass classification is performed using the hat matrix of a least squares support vector machine (LS-SVM) ensemble, which is obtained by aggregating individual LS-SVMs trained on subsets of the whole data set. A cross-validation function is defined to select the optimal values of the hyperparameters that affect the performance of the proposed multiclass LS-SVM. We derive a generalized cross-validation function to reduce the computational burden of the cross-validation function. Experimental results are then presented that indicate the performance of the proposed method.
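
For reference, for a linear smoother $\hat{y} = H(\lambda)\,y$ with hat matrix $H(\lambda)$ and $n$ observations, the leave-one-out cross-validation function and its generalized approximation are commonly written as below. This is the textbook form, given only as a sketch; the symbols $\lambda$, $H$, and $h_{ii}$ are notation assumed here, and the functions defined in the paper for the LS-SVM ensemble may differ in detail.

$$
\mathrm{CV}(\lambda) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_i - \hat{y}_i}{1 - h_{ii}(\lambda)}\right)^{2},
\qquad
\mathrm{GCV}(\lambda) = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\left(1 - \operatorname{tr} H(\lambda)/n\right)^{2}}
$$

GCV replaces each diagonal element $h_{ii}(\lambda)$ with the average $\operatorname{tr} H(\lambda)/n$, so only the trace of the hat matrix is needed rather than every diagonal entry, which is where the computational saving comes from.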