• Title/Summary/Keyword: Cross-Validation

CROSS-VALIDATION OF ARTIFICIAL NEURAL NETWORK FOR LANDSLIDE SUSCEPTIBILITY ANALYSIS: A CASE STUDY OF KOREA

  • LEE SARO;LEE MOUNG-JIN;WON JOONG-SUN
    • Proceedings of the KSRS Conference / 2004.10a / pp.298-301 / 2004
  • The aim of this study is to cross-validate a spatial probability model, an artificial neural network, at Boun, Korea, using a Geographic Information System (GIS). Landslide locations in the Boun, Janghung and Youngin areas were identified from interpretation of aerial photographs and field surveys, and maps of the topography, soil type, forest cover and land use were compiled into spatial data sets. The factors that influence landslide occurrence, such as slope, aspect and curvature of topography, were calculated from the topographic database. Topographic type, texture, material, drainage and effective soil thickness were extracted from the soil database, and type, diameter, age and density of forest were extracted from the forest database. Lithology was extracted from the geological database, and land use was classified from the Landsat TM satellite image. Landslide susceptibility was analyzed from the landslide-occurrence factors with the artificial neural network model. For validation and cross-validation, the result of the analysis was applied to each study area. The validation and cross-validation results showed satisfactory agreement between the susceptibility map and the existing data on landslide locations.

A Cross-Validation of Seismic Vulnerability Assessment Model: Application to Earthquake of 9.12 Gyeongju and 2017 Pohang (지진 취약성 평가 모델 교차검증: 경주(2016)와 포항(2017) 지진을 대상으로)

  • Han, Jihye;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.37 no.3 / pp.649-655 / 2021
  • This study aims to cross-validate the optimal seismic vulnerability assessment model, developed in previous studies of Gyeongju, by applying it to another region. The test area was Pohang City, the site of the 2017 Pohang Earthquake, and the dataset was built from the same influencing factors and earthquake-damaged buildings as in the previous studies. The validation dataset was built via random sampling, and the prediction accuracy was derived by applying it to the random forest (RF) model of Gyeongju. The success and prediction accuracies of the model in Gyeongju were 100% and 94.9%, respectively, whereas the prediction accuracy on the Pohang validation dataset was 70.4%.

Traffic Classification Using Machine Learning Algorithms in Practical Network Monitoring Environments (실제 네트워크 모니터링 환경에서의 ML 알고리즘을 이용한 트래픽 분류)

  • Jung, Kwang-Bon;Choi, Mi-Jung;Kim, Myung-Sup;Won, Young-J.;Hong, James W.
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.8B / pp.707-718 / 2008
  • The methodology for classifying traffic is shifting from payload-based or port-based approaches to machine learning in order to cope with dynamic changes in application characteristics. However, current work on traffic classification using machine learning (ML) algorithms is carried out in offline environments. Specifically, most current studies report traffic classification results using cross validation as the test method, and they report results per traffic flow. Such results are not useful in practical network traffic monitoring environments. This paper compares classification results obtained with cross validation against those obtained with split validation as the test method. It also compares flow-based classification results with byte-based ones. We classify network traffic using various feature sets and machine learning algorithms such as J48, REPTree, RBFNetwork, Multilayer Perceptron, BayesNet, and NaiveBayes, and we identify the best feature sets and the best ML algorithm for classifying traffic under split validation.
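
The contrast between cross validation and split validation drawn in this abstract can be sketched as follows. This is only an illustration, not the paper's experimental setup: "split validation" is assumed here to mean a single hold-out split taken in arrival order, the classifier is a trivial majority-class baseline rather than any of the listed algorithms, and the `labels` trace is hypothetical.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists for shuffled k-fold cross validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def split_indices(n, train_size):
    """Split validation (assumed): the first train_size samples train, the rest test."""
    return list(range(train_size)), list(range(train_size, n))

def majority_accuracy(labels, train, test):
    """Fit a majority-class baseline on `train` and score it on `test`."""
    train_labels = [labels[i] for i in train]
    guess = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == guess for i in test) / len(test)

def cv_accuracy(labels, k):
    scores = [majority_accuracy(labels, tr, te)
              for tr, te in k_fold_indices(len(labels), k)]
    return sum(scores) / len(scores)

def split_accuracy(labels, train_size):
    train, test = split_indices(len(labels), train_size)
    return majority_accuracy(labels, train, test)

# Hypothetical trace in arrival order: the application mix shifts late in the capture.
labels = ["web"] * 70 + ["p2p"] * 30
print(cv_accuracy(labels, 5))      # shuffled folds mix early and late traffic
print(split_accuracy(labels, 60))  # the model is tested only on later traffic
```

On this toy trace the shuffled folds report a noticeably higher accuracy than the time-ordered split, which is one plausible reason cross-validated results can overstate performance in a live monitoring setting.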

Validation Technique using variance and confidence interval of metamodel (근사모델의 분산과 신뢰구간을 이용한 모델의 정확도 평가법)

  • Han, In-Sik;Lee, Yong-Bin;Choi, Dong-Hoon
    • Proceedings of the KSME Conference / 2008.11a / pp.1169-1175 / 2008
  • Validation techniques fall into two classes according to whether they require additional experimental points. Methods that require additional experimental points, such as RMSE, are practically infeasible in engineering. Therefore, only methods that use the existing experimental points, such as cross validation, are available. However, cross validation not only incurs considerable computational cost, because the metamodel is regenerated at each iteration, but also cannot quantify the fidelity of the metamodel. In this research we propose a new validation technique for representative metamodels that uses the variance of the metamodel and confidence interval information. The proposed technique computes confidence intervals from the variance information of the metamodel. It is expected to help in choosing an accurate metamodel, constructing ensembles of metamodels, and making sequential sampling more effective.

Mean-Variance-Validation Technique for Sequential Kriging Metamodels (순차적 크리깅모델의 평균-분산 정확도 검증기법)

  • Lee, Tae-Hee;Kim, Ho-Sung
    • Transactions of the Korean Society of Mechanical Engineers A / v.34 no.5 / pp.541-547 / 2010
  • The rigorous validation of the accuracy of metamodels is an important topic in research on metamodel techniques. A leave-k-out cross-validation technique involves a considerably high computational cost, yet it cannot be used to measure the fidelity of metamodels. Recently, the mean$_0$ validation technique has been proposed to quantitatively determine the accuracy of metamodels. However, the mean$_0$ validation criterion may lead to premature termination of a sampling process even if the kriging model is inaccurate. In this study, we propose a new validation technique based on the mean and variance of the response, evaluated when a sequential sampling method such as maximum entropy sampling is used. The proposed validation technique is more efficient and accurate than the leave-k-out cross-validation technique because, instead of performing numerical integration, the kriging model is explicitly integrated to accurately evaluate the mean and variance of the response. The error in the proposed validation technique resembles a root mean squared error, so it can be used as a stopping criterion for sequential sampling of metamodels.
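
For orientation, the leave-k-out cross-validation baseline that this abstract compares against might look like the sketch below. It is not the authors' procedure: a least-squares line (`fit_line`, a hypothetical helper) stands in for the kriging metamodel, and disjoint consecutive blocks of k points are held out rather than every subset of size k.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = a + b*x; a stand-in for a real metamodel."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def leave_k_out_rmse(xs, ys, k, fit=fit_line):
    """Refit the model once per held-out block and pool the squared errors."""
    sq_errors = []
    for start in range(0, len(xs), k):
        held = range(start, min(start + k, len(xs)))
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)  # one full refit per block
        sq_errors += [(model(xs[i]) - ys[i]) ** 2 for i in held]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

xs = [float(i) for i in range(10)]
print(leave_k_out_rmse(xs, [x * x for x in xs], k=2))
```

Each block requires a complete refit, so the cost grows with the number of blocks times the cost of building the metamodel, which is the computational burden the abstract refers to.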

An Efficient Protocol for the Cross Certification Path Validation (경로기반 상호인증을 위한 효율적 프로토콜)

  • 김홍석;박세현
    • Proceedings of the IEEK Conference / 2000.06a / pp.217-220 / 2000
  • With the expansion of e-commerce, Public Key Infrastructure (PKI) solutions are required to resolve Internet security problems. However, each organization has developed its certification mechanism independently under its own circumstances, so cooperation among heterogeneous certification mechanisms must be carefully taken into account. In this paper, we propose an efficient protocol for cross certification based on path validation. The proposed “cross certification gateway” provides flexibility and convenience through an initial establishment protocol for cross certification among different certification domains.

Developing a Molecular Prognostic Predictor of a Cancer based on a Small Sample

  • Kim Inyoung;Lee Sunho;Rha Sun Young;Kim Byungsoo
    • Proceedings of the Korean Statistical Society Conference / 2004.11a / pp.195-198 / 2004
  • One important problem in a cancer microarray study is to identify a set of genes from which a molecular prognostic indicator can be developed. A parallel problem is to validate the chosen set of genes. In this note we develop a K-fold cross validation procedure that combines a 'pre-validation' technique with a bootstrap resampling procedure in Cox regression. The pre-validation technique derives the microarray predictor of a case without having seen the true class label of that case. It was suggested by Tibshirani and Efron (2002) to avoid possible over-fitting in a regression in which a microarray-based predictor is employed. The bootstrap resampling procedure for Cox regression was proposed by Sauerbrei and Schumacher (1992) as a means of overcoming the instability of a stepwise selection procedure. We apply this K-fold cross validation to microarray data from 92 gastric cancers, for which the experiment was conducted at the Cancer Metastasis Research Center, Yonsei University. We also share some of our experience with 'false positive' results due to information leak.
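
The pre-validation idea described here, scoring each case with a model that never saw that case's outcome, can be sketched as a K-fold loop. This is a simplified illustration only: the nearest-centroid scorer (`fit_centroids`, `centroid_score`) is a hypothetical stand-in for the Cox regression with bootstrap-stabilized gene selection, and class labels are assumed to be coded 0 and 1.

```python
def prevalidated_scores(X, y, k, fit, score):
    """Out-of-fold scores: each case is scored by a model fit without its fold."""
    n = len(X)
    scores = [None] * n
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        # Any gene selection must happen inside fit(), on training cases only,
        # otherwise information about the held-out labels leaks into the score.
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            scores[i] = score(model, X[i])
    return scores

def fit_centroids(X, y):
    """Per-class mean expression vector (stand-in for the real predictor)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def centroid_score(model, x):
    """Positive when x lies closer to the class-1 centroid than to class 0."""
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5
    return dist(model[0]) - dist(model[1])

# Hypothetical one-gene expression values for eight cases.
X = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
print(prevalidated_scores(X, y, 4, fit_centroids, centroid_score))
```

The resulting scores can then enter a downstream regression alongside clinical covariates without the optimistic bias that arises when the same cases are used both to build and to evaluate the predictor.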

APPLICATION AND CROSS-VALIDATION OF SPATIAL LOGISTIC MULTIPLE REGRESSION FOR LANDSLIDE SUSCEPTIBILITY ANALYSIS

  • LEE SARO
    • Proceedings of the KSRS Conference / 2004.10a / pp.302-305 / 2004
  • The aim of this study is to apply and cross-validate a spatial logistic multiple-regression model at Boun, Korea, using a Geographic Information System (GIS). Landslide locations in the Boun area were identified by interpretation of aerial photographs and field surveys. Maps of the topography, soil type, forest cover, geology, and land use were constructed from a spatial database. The factors that influence landslide occurrence, such as slope, aspect, and curvature of topography, were calculated from the topographic database. Texture, material, drainage, and effective soil thickness were extracted from the soil database, and type, diameter, and density of forest were extracted from the forest database. Lithology was extracted from the geological database, and land use was classified from the Landsat TM satellite image. Landslide susceptibility was analyzed from the landslide-occurrence factors by logistic multiple-regression methods. For validation and cross-validation, the result of the analysis was applied both to the study area, Boun, and to another area, Youngin, Korea. The validation and cross-validation results showed satisfactory agreement between the susceptibility map and the existing data on landslide locations. The GIS was used to analyze the vast amount of data efficiently, and statistical programs were used to maintain specificity and accuracy.

On Practical Choice of Smoothing Parameter in Nonparametric Classification (베이즈 리스크를 이용한 커널형 분류에서 평활모수의 선택)

  • Kim, Rae-Sang;Kang, Kee-Hoon
    • Communications for Statistical Applications and Methods / v.15 no.2 / pp.283-292 / 2008
  • The smoothing parameter, or bandwidth, plays a key role in nonparametric classification based on kernel density estimation. We consider choosing the smoothing parameter that optimizes the Bayes risk in nonparametric classification. Hall and Kang (2005) clarified the theoretical properties of the smoothing parameter in terms of minimizing the Bayes risk and derived its optimal order. The bootstrap method was used to explore its numerical properties. We compare cross-validation and the bootstrap method numerically in terms of the optimal order of the bandwidth. Effects on the misclassification rate are also examined. We confirm that the bootstrap method is superior to cross-validation in both cases.
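
The cross-validation side of this comparison can be illustrated with a small sketch that picks the bandwidth minimizing the leave-one-out misclassification rate of a two-class kernel classifier. The details are assumptions made for illustration, not the paper's setup: a one-dimensional Gaussian kernel, a shared bandwidth for both classes, priors proportional to class sizes, and a hypothetical candidate grid. The bootstrap alternative that the abstract favors is not shown.

```python
import math

def kde(x, sample, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    total = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)
    return total / (len(sample) * h * math.sqrt(2 * math.pi))

def loo_error(class0, class1, h):
    """Leave-one-out misclassification rate of the kernel classifier."""
    errors = 0
    for own, other in ((class0, class1), (class1, class0)):
        for i, x in enumerate(own):
            rest = own[:i] + own[i + 1:]  # drop the point being classified
            # Priors are taken proportional to the remaining class sizes.
            p_own = len(rest) * kde(x, rest, h)
            p_other = len(other) * kde(x, other, h)
            errors += int(p_own <= p_other)
    return errors / (len(class0) + len(class1))

def choose_bandwidth(class0, class1, grid):
    """Return the candidate bandwidth with the lowest leave-one-out error."""
    return min(grid, key=lambda h: loo_error(class0, class1, h))

# Hypothetical, well-separated samples from two classes.
class0 = [0.0, 0.2, 0.4, 0.6]
class1 = [3.0, 3.2, 3.4, 3.6]
print(choose_bandwidth(class0, class1, [0.05, 0.5, 2.0]))
```

With a very large bandwidth the two density estimates flatten out and the class-size prior dominates, so the leave-one-out error rises sharply; this is the kind of sensitivity that makes the choice of smoothing parameter matter.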