• Title/Summary/Keyword: 교차검증 (cross-validation)

Application of Time-series Cross Validation in Hyperparameter Tuning of a Predictive Model for 2,3-BDO Distillation Process (시계열 교차검증을 적용한 2,3-BDO 분리공정 온도예측 모델의 초매개변수 최적화)

  • An, Nahyeon;Choi, Yeongryeol;Cho, Hyungtae;Kim, Junghwan
    • Korean Chemical Engineering Research / v.59 no.4 / pp.532-541 / 2021
  • Research on applying artificial intelligence to chemical processes has been increasing rapidly. However, overfitting is a significant problem: it prevents a model from generalizing to unseen test data as well as it fits the observed training data. Cross validation is one way to mitigate overfitting. In this study, time-series cross validation was applied to optimize the number of batches and epochs, two hyperparameters of the prediction model for the 2,3-BDO distillation process, and the results were compared with the commonly used K-fold cross validation. The RMSE of the model tuned with time-series cross validation was 9.06% lower, and its MAPE 0.61% higher, than that of the model tuned with K-fold cross validation. The calculation time was also 198.29 sec shorter than with K-fold cross validation.
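
The difference between the two validation schemes named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `k_fold_splits` and `time_series_splits` and the expanding-window layout are assumptions made for the example. The point it shows is that time-series cross validation only ever validates on samples that come after the training samples.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (training indices, validation indices)


def k_fold_splits(n_samples: int, n_splits: int) -> Iterator[Split]:
    """Plain K-fold: each contiguous block is the validation set exactly once."""
    fold = n_samples // n_splits
    for k in range(n_splits):
        start = k * fold
        # The last fold absorbs any remainder.
        stop = n_samples if k == n_splits - 1 else start + fold
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, val


def time_series_splits(n_samples: int, n_splits: int) -> Iterator[Split]:
    """Expanding window (assumed layout): train on the past, validate on the next block."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = k * fold
        val_end = n_samples if k == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, val_end))


if __name__ == "__main__":
    for train, val in time_series_splits(12, 3):
        print("train", train, "validate", val)
```

In K-fold, some training indices are later in time than the validation indices, which lets information from the future leak into training; the expanding-window variant avoids that by construction.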

Cross-Validated Ensemble Methods in Natural Language Inference (자연어 추론에서의 교차 검증 앙상블 기법)

  • Yang, Kisu;Whang, Taesun;Oh, Dongsuk;Park, Chanjun;Lim, Heuiseok
    • Annual Conference on Human and Language Technology / 2019.10a / pp.8-11 / 2019
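  • Ensemble methods are machine learning techniques that combine multiple models to produce a final decision, and they guarantee improved performance for deep learning models. However, most of these methods require additional models or separate computation solely for the ensemble. We therefore propose a cross-validated ensemble method that combines ensembling with cross validation, reducing the cost of ensemble computation while improving generalization. To demonstrate its effectiveness, we use the MRPC and RTE datasets with BiLSTM, CNN, and BERT models and show improved performance over existing ensemble methods. We also discuss the generalization principle that arises from cross validation and how performance changes with the cross-validation parameters.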
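
One way to read the idea in this abstract is that the K models already trained during K-fold cross validation are reused as the ensemble members, so no extra model is trained just for ensembling. The sketch below illustrates that reading only; the paper's exact procedure may differ. The nearest-centroid classifier is a toy stand-in for BiLSTM, CNN, or BERT, and the names `fit_centroid_model` and `cross_validated_ensemble` are hypothetical.

```python
from statistics import mean
from typing import Callable, List, Sequence

Model = Callable[[float], float]  # maps a 1-D feature to a score for label 1


def fit_centroid_model(xs: Sequence[float], ys: Sequence[int]) -> Model:
    """Toy stand-in for a neural model: pick the class with the nearer mean."""
    c0 = mean(x for x, y in zip(xs, ys) if y == 0)
    c1 = mean(x for x, y in zip(xs, ys) if y == 1)
    return lambda x: 1.0 if abs(x - c1) < abs(x - c0) else 0.0


def cross_validated_ensemble(
    xs: Sequence[float], ys: Sequence[int], n_splits: int
) -> Model:
    """Train one model per fold on the remaining folds, then average their outputs."""
    models: List[Model] = []
    for k in range(n_splits):
        # Fold k (every n_splits-th sample) is held out, as in ordinary K-fold.
        train_idx = [i for i in range(len(xs)) if i % n_splits != k]
        models.append(
            fit_centroid_model([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        )
    return lambda x: mean(m(x) for m in models)


if __name__ == "__main__":
    xs = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.0, 1.0, 0.25]
    ys = [0, 1, 0, 1, 0, 1, 0, 1, 0]
    ensemble = cross_validated_ensemble(xs, ys, n_splits=3)
    print(ensemble(0.05), ensemble(0.95))
```

Each member sees a different subset of the data, which is where the diversity of the ensemble comes from in this sketch.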

Region of Interest (ROI) Selection of Land Cover Using SVM Cross Validation (SVM 교차검증을 활용한 토지피복 ROI 선정)

  • Jeong, Jong-Chul;Youn, Hyoung-Jin
    • Journal of Cadastre & Land InformatiX / v.50 no.1 / pp.75-85 / 2020
  • This study examines the use of machine learning cross-validation to create ROIs for land cover classification. The study area is located in Sejong, and one KOMPSAT-3A image dated October 28, 2019 was used in the analysis. Four bands (red, green, blue, and near-infrared) were used in the cross-validation learning process. We used the K-fold method for cross validation and applied the SVM kernel types to the cross-validation results. In addition, four SVM kernels (linear, polynomial, RBF, and sigmoid) were used to produce supervised land cover classification maps from the extracted ROIs. During cross validation, 1,813 samples were extracted from 3,500, and about 60% of the building, road, and grass class data were removed. Based on this, the supervised linear SVM showed the highest classification accuracy, 91.77%, among the kernel methods. The producer's accuracy for grass was 79.43%, and substantial misclassification was identified in forests. According to the results, ROI extraction using cross validation may be effective in forest, water, and agricultural areas, but the distinction between built-up, grass, and bare-soil areas needs to be improved.

Detecting Errors in Dependency Treebank through XGBoost and Cross Validation (XGBoost와 교차 검증을 이용한 구문분석 말뭉치에서의 오류 탐지)

  • Choi, Min-Seok;Kim, Chang-Hyun;Cheon, Min-Ah;Park, Hyuk-Ro;Kim, Jae-Hoon
    • Annual Conference on Human and Language Technology / 2020.10a / pp.103-107 / 2020
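  • Dependency treebanks are widely used in natural language processing to identify the dependency relations within sentences. Such corpora are generally assumed to be error-free, but in practice they contain a variety of errors, and these errors degrade performance. To mitigate this problem, this paper proposes a method that uses XGBoost and cross validation to detect errors in an already constructed parsed corpus. Because no training corpus annotated with errors exists, errors cannot be detected with an ordinary classifier; instead, we propose detecting errors by analyzing the classifier's output. To assess performance, we analyzed the difference between the error distributions of the sample and the population and found almost no difference, which indicates that the proposed method is valid. We plan to apply the method to semantic role labeled corpora in the future.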

Development of Highway Traffic Information Prediction Models Using the Stacking Ensemble Technique Based on Cross-validation (스태킹 앙상블 기법을 활용한 고속도로 교통정보 예측모델 개발 및 교차검증에 따른 성능 비교)

  • Yoseph Lee;Seok Jin Oh;Yejin Kim;Sung-ho Park;Ilsoo Yun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.6 / pp.1-16 / 2023
  • Accurate traffic information prediction is considered one of the most important aspects of intelligent transport systems (ITS), as it can be used to guide users of transportation facilities away from congested routes. Various deep learning models have been developed for accurate traffic prediction. Recently, ensemble techniques have been used to combine various models, with their respective strengths and weaknesses, in order to improve prediction accuracy and stability. In this study, we therefore developed and evaluated traffic information prediction models using various deep learning models, and evaluated the performance of the developed models when combined as a stacking ensemble. The individual models showed error rates within 10% for traffic volume prediction and within 3% for speed prediction. The ensemble model showed higher accuracy than the other models when no cross-validation was performed, and when cross-validation was performed, it showed a uniform error rate in long-term forecasting.
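
A common way to build a stacking ensemble on top of cross-validation is to train the meta-learner on out-of-fold predictions of the base models. The sketch below shows that general pattern under strong simplifications and is not the authors' pipeline: the base learners `fit_mean` and `fit_line` are toy stand-ins for deep learning models, and the meta-learner is an assumed single blending weight chosen on a coarse grid.

```python
from statistics import mean
from typing import Callable, List, Sequence

Predictor = Callable[[float], float]
Fitter = Callable[[Sequence[float], Sequence[float]], Predictor]


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Base learner 1: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Base learner 2: ordinary least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def out_of_fold_predictions(
    fit: Fitter, xs: Sequence[float], ys: Sequence[float], n_splits: int
) -> List[float]:
    """Predict every sample with a model that never saw it during training."""
    oof = [0.0] * len(xs)
    for k in range(n_splits):
        train = [i for i in range(len(xs)) if i % n_splits != k]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(k, len(xs), n_splits):
            oof[i] = model(xs[i])
    return oof


def fit_stack(xs: Sequence[float], ys: Sequence[float], n_splits: int = 3) -> Predictor:
    """Meta-learner (assumed form): one blending weight fitted on out-of-fold predictions."""
    oof_mean = out_of_fold_predictions(fit_mean, xs, ys, n_splits)
    oof_line = out_of_fold_predictions(fit_line, xs, ys, n_splits)

    def sse(w: float) -> float:
        return sum(
            (w * a + (1 - w) * b - y) ** 2 for a, b, y in zip(oof_mean, oof_line, ys)
        )

    w = min((i / 10 for i in range(11)), key=sse)
    # Refit the base learners on all data for final use.
    base_mean, base_line = fit_mean(xs, ys), fit_line(xs, ys)
    return lambda x: w * base_mean(x) + (1 - w) * base_line(x)
```

Fitting the blending weight on out-of-fold predictions rather than in-sample predictions keeps the meta-learner from rewarding a base model that merely memorized its training data.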

Candidate Points and Representative Cross-Validation Approach for Sequential Sampling (후보점과 대표점 교차검증에 의한 순차적 실험계획)

  • Kim, Seung-Won;Jung, Jae-Jun;Lee, Tae-Hee
    • Transactions of the Korean Society of Mechanical Engineers A / v.31 no.1 s.256 / pp.55-61 / 2007
  • Simulation models have recently become an essential tool for the analysis and design of systems, but they are often expensive and time consuming because they grow complicated in order to achieve reliable results. Therefore, a high-fidelity simulation model needs to be replaced by an approximate model, the so-called metamodel. Metamodeling techniques comprise three components: sampling, the metamodel, and validation. A cross-validation approach has been proposed that sequentially provides new sample points based on the cross-validation error, but it is very expensive because cross-validation must be evaluated at each stage. To enhance cross-validation of the metamodel, this paper proposes a sequential sampling method using candidate points and representative cross-validation. The candidate and representative cross-validation approach to sequential sampling is illustrated for a two-dimensional domain. To verify the performance of the suggested sampling technique, we compare the accuracy of the metamodels for various mathematical functions with that obtained by conventional sequential sampling strategies such as maximum distance, mean squared error, and maximum entropy sequential sampling. Through this research we learn that the proposed approach is computationally inexpensive and provides good prediction performance.
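
The general idea of choosing the next sample from a cross-validation error can be sketched in one dimension. Everything specific here is an assumption for illustration and is not taken from the paper: the metamodel is inverse-distance weighting (`idw_predict`), the error is leave-one-out, and the candidate score (error of the nearest sample times distance to it) is a hypothetical heuristic rather than the authors' candidate and representative scheme.

```python
from typing import List


def idw_predict(xs: List[float], ys: List[float], x: float, power: float = 2.0) -> float:
    """Inverse-distance-weighted interpolation, used here as a toy metamodel."""
    num = den = 0.0
    for xi, yi in zip(xs, ys):
        d = abs(x - xi)
        if d == 0.0:
            return yi  # exact hit on a sample point
        w = 1.0 / d ** power
        num += w * yi
        den += w
    return num / den


def loo_errors(xs: List[float], ys: List[float]) -> List[float]:
    """Leave-one-out error: refit without sample i, then predict sample i."""
    errors = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        errors.append(abs(idw_predict(rest_x, rest_y, xs[i]) - ys[i]))
    return errors


def next_sample(xs: List[float], ys: List[float], candidates: List[float]) -> float:
    """Pick the candidate whose nearest sample has a large error and is far away."""
    errors = loo_errors(xs, ys)

    def score(c: float) -> float:
        nearest = min(range(len(xs)), key=lambda i: abs(c - xs[i]))
        return errors[nearest] * abs(c - xs[nearest])

    return max(candidates, key=score)


if __name__ == "__main__":
    xs, ys = [0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 9.0]
    print(next_sample(xs, ys, [0.5, 1.5, 2.5]))
```

The cost the abstract refers to is visible in `loo_errors`: the metamodel is rebuilt once per sample at every sampling stage, which is cheap for this toy interpolator but expensive for a real metamodel.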

An Intersection Validation and Interference Elimination Algorithm between Weapon Trajectories in Multi-target and Multi-weapon Environments (다표적-다무장 환경에서 무장 궤적 간 교차 검증 및 간섭 배제 알고리즘)

  • Yoon, Moonhyung;Park, Junho;Yi, JeongHoon;Kim, Kapsoo;Koo, BongJoo
    • The Journal of the Korea Contents Association / v.18 no.9 / pp.614-622 / 2018
  • When multiple weapons are fired simultaneously in multi-target and multi-weapon environments, there is always a possibility of collision caused by intersecting weapon trajectories. A collision between weapons not only hinders rapid reaction but also causes the loss of friendly weapon assets, weakening the ability to respond to enemy threats. In this paper, we propose an intersection validation and interference elimination algorithm between weapon trajectories in multi-target and multi-weapon environments. The core of our algorithm is to identify possible interference by analyzing the intersections between weapon trajectories and to eliminate the mutual interference. To show the superiority of our algorithm, we evaluate and verify its performance through simulation and visualization. Our experimental results show that the proposed algorithm eliminates interference effectively regardless of the number of targets and weapon groups, as no cross point remains.
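
The abstract does not describe how intersections are detected, so the following is only a generic sketch of an intersection check between two trajectories. It assumes each trajectory is a 2-D polyline and uses the standard orientation test for line segments; it ignores altitude and timing, both of which a real interference check would presumably need. All names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def orientation(a: Point, b: Point, c: Point) -> float:
    """Cross product sign: > 0 if a->b->c turns left, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def within_box(a: Point, b: Point, c: Point) -> bool:
    """True if c lies inside the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """Standard orientation-based test, including touching and collinear overlap."""
    d1 = orientation(q1, q2, p1)
    d2 = orientation(q1, q2, p2)
    d3 = orientation(p1, p2, q1)
    d4 = orientation(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # proper crossing
    return ((d1 == 0 and within_box(q1, q2, p1))
            or (d2 == 0 and within_box(q1, q2, p2))
            or (d3 == 0 and within_box(p1, p2, q1))
            or (d4 == 0 and within_box(p1, p2, q2)))


def trajectories_cross(a: List[Point], b: List[Point]) -> bool:
    """Check every pair of consecutive segments from the two polylines."""
    return any(
        segments_intersect(a[i], a[i + 1], b[j], b[j + 1])
        for i in range(len(a) - 1)
        for j in range(len(b) - 1)
    )
```

A pairwise check like this is quadratic in the number of segments; it is meant to show what "confirming a cross point" could look like, not how the paper achieves it.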

A Study on Random Selection of Pooling Operations for Regularization and Reduction of Cross Validation (정규화 및 교차검증 횟수 감소를 위한 무작위 풀링 연산 선택에 관한 연구)

  • Ryu, Seo-Hyeon
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.4 / pp.161-166 / 2018
  • In this paper, we propose a method for the random selection of pooling operations for the regularization and reduction of cross validation in convolutional neural networks. The pooling operation in convolutional neural networks is used to reduce the size of the feature map and for its shift invariant properties. In the existing pooling method, one pooling operation is applied in each pooling layer. Because this method fixes the convolution network, the network suffers from overfitting, which means that it excessively fits the models to the training samples. In addition, to find the best combination of pooling operations to maximize the performance, cross validation must be performed. To solve these problems, we introduce the probability concept into the pooling layers. The proposed method does not select one pooling operation in each pooling layer. Instead, we randomly select one pooling operation among multiple pooling operations in each pooling region during training, and for testing purposes, we use probabilistic weighting to produce the expected output. The proposed method can be seen as a technique in which many networks are approximately averaged using a different pooling operation in each pooling region. Therefore, this method avoids the overfitting problem, as well as reducing the amount of cross validation. The experimental results show that the proposed method can achieve better generalization performance and reduce the need for cross validation.
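
The training and test behavior described in this abstract can be sketched on a 1-D feature map. This is a simplified illustration under stated assumptions: only two pooling operations (max and average) are considered, `p_max` is a hypothetical selection probability for max pooling, and the function names are invented for the example.

```python
import random
from typing import List, Sequence


def pool_train(
    x: Sequence[float], size: int, p_max: float, rng: random.Random
) -> List[float]:
    """Training: pick max or average pooling at random for each pooling region."""
    out = []
    for start in range(0, len(x) - size + 1, size):
        region = x[start:start + size]
        if rng.random() < p_max:
            out.append(max(region))
        else:
            out.append(sum(region) / size)
    return out


def pool_test(x: Sequence[float], size: int, p_max: float) -> List[float]:
    """Testing: probability-weighted mix of both operations (the expected output)."""
    out = []
    for start in range(0, len(x) - size + 1, size):
        region = x[start:start + size]
        out.append(p_max * max(region) + (1 - p_max) * sum(region) / size)
    return out


if __name__ == "__main__":
    features = [1.0, 3.0, 2.0, 6.0]
    print(pool_train(features, 2, 0.5, random.Random(0)))
    print(pool_test(features, 2, 0.5))
```

Because the test-time output is the expectation over the random choices made in training, a single forward pass approximates averaging many networks that differ only in their pooling operations.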

Mean-Variance-Validation Technique for Sequential Kriging Metamodels (순차적 크리깅모델의 평균-분산 정확도 검증기법)

  • Lee, Tae-Hee;Kim, Ho-Sung
    • Transactions of the Korean Society of Mechanical Engineers A / v.34 no.5 / pp.541-547 / 2010
  • The rigorous validation of the accuracy of metamodels is an important topic in research on metamodel techniques. Although a leave-k-out cross-validation technique involves a considerably high computational cost, it cannot be used to measure the fidelity of metamodels. Recently, the mean$_0$ validation technique has been proposed to quantitatively determine the accuracy of metamodels. However, the use of the mean$_0$ validation criterion may lead to premature termination of the sampling process even if the kriging model is inaccurate. In this study, we propose a new validation technique based on the mean and variance of the response evaluated when a sequential sampling method, such as maximum entropy sampling, is used. The proposed validation technique is more efficient and accurate than the leave-k-out cross-validation technique because, instead of performing numerical integration, the kriging model is explicitly integrated to accurately evaluate the mean and variance of the response. The error in the proposed validation technique resembles a root mean squared error, so it can be used to determine a stop criterion for sequential sampling of metamodels.

Domestic air demand forecast using cross-validation (교차검증을 이용한 국내선 항공수요예측)

  • Lim, Jae-Hwan;Kim, Young-Rok;Choi, Yun-Chul;Kim, Kwang-Il
    • Journal of the Korean Society for Aviation and Aeronautics / v.27 no.1 / pp.43-50 / 2019
  • Aviation demand forecasting has been actively studied along with the recent growth of the aviation market. In this study, domestic passenger demand and freight demand were estimated using a cross-validation method. The results show that passenger demand is influenced by the private consumption growth rate, oil price, and exchange rate, while freight demand is affected by GDP per capita, the private consumption growth rate, and oil price. In particular, passenger demand is characterized by temporary external shocks, whereas freight demand is affected more by economic variables than by temporary shocks.