• 제목/요약/키워드: paper bagging

검색결과 79건 처리시간 0.049초

다구찌 디자인을 이용한 앙상블 및 군집분석 분류 성능 비교 (Comparing Classification Accuracy of Ensemble and Clustering Algorithms Based on Taguchi Design)

  • 신형원;손소영
    • 대한산업공학회지
    • /
    • 제27권1호
    • /
    • pp.47-53
    • /
    • 2001
  • In this paper, we compare the classification performances of both ensemble and clustering algorithms (Data Bagging, Variable Selection Bagging, Parameter Combining, Clustering) to logistic regression in consideration of various characteristics of input data. Four factors used to simulate the logistic model are (1) correlation among input variables (2) variance of observation (3) training data size and (4) input-output function. In view of the unknown relationship between input and output function, we use a Taguchi design to improve the practicality of our study results by letting it as a noise factor. Experimental study results indicate the following: When the level of the variance is medium, Bagging & Parameter Combining performs worse than Logistic Regression, Variable Selection Bagging and Clustering. However, classification performances of Logistic Regression, Variable Selection Bagging, Bagging and Clustering are not significantly different when the variance of input data is either small or large. When there is strong correlation in input variables, Variable Selection Bagging outperforms both Logistic Regression and Parameter combining. In general, Parameter Combining algorithm appears to be the worst at our disappointment.

  • PDF

Text-independent Speaker Identification by Bagging VQ Classifier

  • Kyung, Youn-Jeong;Park, Bong-Dae;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea
    • /
    • 제20권2E호
    • /
    • pp.17-24
    • /
    • 2001
  • In this paper, we propose the bootstrap and aggregating (bagging) vector quantization (VQ) classifier to improve the performance of the text-independent speaker recognition system. This method generates multiple training data sets by resampling the original training data set, constructs the corresponding VQ classifiers, and then integrates the multiple VQ classifiers into a single classifier by voting. The bagging method has been proven to greatly improve the performance of unstable classifiers. Through two different experiments, this paper shows that the VQ classifier is unstable. In one of these experiments, the bias and variance of a VQ classifier are computed with a waveform database. The variance of the VQ classifier is compared with that of the classification and regression tree (CART) classifier[1]. The variance of the VQ classifier is shown to be as large as that of the CART classifier. The other experiment involves speaker recognition. The speaker recognition rates vary significantly by the minor changes in the training data set. The speaker recognition experiments involving a closed set, text-independent and speaker identification are performed with the TIMIT database to compare the performance of the bagging VQ classifier with that of the conventional VQ classifier. The bagging VQ classifier yields improved performance over the conventional VQ classifier. It also outperforms the conventional VQ classifier in small training data set problems.

  • PDF

과대지용 emulsion 발수제 및 처리기술 (Water Repelling Technology for Fruit bagging paper)

  • 윤경동;변종상;김대근;엄태진
    • 한국펄프종이공학회:학술대회논문집
    • /
    • 한국펄프종이공학회 2006년도 추계학술발표논문집
    • /
    • pp.207-212
    • /
    • 2006
  • The development of practical utilization way of water repelling technology for manufacturing of fruiting bag was main purpose of this study. The water repelling technology was developed by wax emulsion coating The wax emulsion was manufactured at Chungsan chemical(Co) on pilot scale. The coating paper was tried in plant coater of Agro(Co). The water repellency of wax emulsion coated paper was good as fruit bagging paper.

  • PDF

Improving Bagging Predictors

  • Kim, Hyun-Joong;Chung, Dong-Jun
    • 한국통계학회:학술대회논문집
    • /
    • 한국통계학회 2005년도 추계 학술발표회 논문집
    • /
    • pp.141-146
    • /
    • 2005
  • Ensemble method has been known as one of the most powerful classification tools that can improve prediction accuracy. Ensemble method also has been understood as ‘perturb and combine’ strategy. Many studies have tried to develop ensemble methods by improving perturbation. In this paper, we propose two new ensemble methods that improve combining, based on the idea of pattern matching. In the experiment with simulation data and with real dataset, the proposed ensemble methods peformed better than bagging. The proposed ensemble methods give the most accurate prediction when the pruned tree was used as the base learner.

  • PDF

다구찌 디자인을 이용한 데이터 퓨전 및 군집분석 분류 성능 비교 (Comparison Study for Data Fusion and Clustering Classification Performances)

  • 신형원;손소영
    • 한국경영과학회:학술대회논문집
    • /
    • 대한산업공학회/한국경영과학회 2000년도 춘계공동학술대회 논문집
    • /
    • pp.601-604
    • /
    • 2000
  • In this paper, we compare the classification performance of both data fusion and clustering algorithms (Data Bagging, Variable Selection Bagging, Parameter Combining, Clustering) to logistic regression in consideration of various characteristics of input data. Four factors used to simulate the logistic model are (1) correlation among input variables (2) variance of observation (3) training data size and (4) input-output function. Since the relationship between input & output is not typically known, we use Taguchi design to improve the practicality of our study results by letting it as a noise factor. Experimental study results indicate the following: Clustering based logistic regression turns out to provide the highest classification accuracy when input variables are weakly correlated and the variance of data is high. When there is high correlation among input variables, variable bagging performs better than logistic regression. When there is strong correlation among input variables and high variance between observations, bagging appears to be marginally better than logistic regression but was not significant.

  • PDF

Bagging deep convolutional autoencoders trained with a mixture of real data and GAN-generated data

  • Hu, Cong;Wu, Xiao-Jun;Shu, Zhen-Qiu
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • 제13권11호
    • /
    • pp.5427-5445
    • /
    • 2019
  • While deep neural networks have achieved remarkable performance in representation learning, a huge amount of labeled training data are usually required by supervised deep models such as convolutional neural networks. In this paper, we propose a new representation learning method, namely generative adversarial networks (GAN) based bagging deep convolutional autoencoders (GAN-BDCAE), which can map data to diverse hierarchical representations in an unsupervised fashion. To boost the size of training data, to train deep model and to aggregate diverse learning machines are the three principal avenues towards increasing the capabilities of representation learning of neural networks. We focus on combining those three techniques. To this aim, we adopt GAN for realistic unlabeled sample generation and bagging deep convolutional autoencoders (BDCAE) for robust feature learning. The proposed method improves the discriminative ability of learned feature embedding for solving subsequent pattern recognition problems. We evaluate our approach on three standard benchmarks and demonstrate the superiority of the proposed method compared to traditional unsupervised learning methods.

Xgboosting 기법을 이용한 실내 위치 측위 기법 (Indoor positioning system using Xgboosting)

  • 황치곤;윤창표;김대진
    • 한국정보통신학회:학술대회논문집
    • /
    • 한국정보통신학회 2021년도 추계학술대회
    • /
    • pp.492-494
    • /
    • 2021
  • 기계학습에서 분류를 위한 기법으로 의사결정트리 기법을 이용한다. 그러나 의사결정트리는 과적합의 문제로 성능이 저하되는 문제가 있다. 이러한 문제를 해결하기 위해 여러 개의 부트스트랩을 생성하여 각 자료를 모델링하여 학습하는 Bagging기법, 샘플링한 데이터를 모델링하여 가중치를 조정하여 과적합을 감소시키는 Boosting과 같은 기법으로 이를 해결할 수 있다. 또한, 최근에 Xgboost 기법이 등장하였다. 이에 본 논문에서는 실내 측위를 위한 wifi 신호 데이터를 수집하여 기존 방식과 Xgboost에 적용하고, 이를 통한 성능평가를 수행한다.

  • PDF

Anomaly-Based Network Intrusion Detection: An Approach Using Ensemble-Based Machine Learning Algorithm

  • Kashif Gul Chachar;Syed Nadeem Ahsan
    • International Journal of Computer Science & Network Security
    • /
    • 제24권1호
    • /
    • pp.107-118
    • /
    • 2024
  • With the seamless growth of the technology, network usage requirements are expanding day by day. The majority of electronic devices are capable of communication, which strongly requires a secure and reliable network. Network-based intrusion detection systems (NIDS) is a new method for preventing and alerting computers and networks from attacks. Machine Learning is an emerging field that provides a variety of ways to implement effective network intrusion detection systems (NIDS). Bagging and Boosting are two ensemble ML techniques, renowned for better performance in the learning and classification process. In this paper, the study provides a detailed literature review of the past work done and proposed a novel ensemble approach to develop a NIDS system based on the voting method using bagging and boosting ensemble techniques. The test results demonstrate that the ensemble of bagging and boosting through voting exhibits the highest classification accuracy of 99.98% and a minimum false positive rate (FPR) on both datasets. Although the model building time is average which can be a tradeoff by processor speed.

데이터 마이닝에서 배깅, 부스팅, SVM 분류 알고리즘 비교 분석 (An Empirical Comparison of Bagging, Boosting and Support Vector Machine Classifiers in Data Mining)

  • 이영섭;오현정;김미경
    • 응용통계연구
    • /
    • 제18권2호
    • /
    • pp.343-354
    • /
    • 2005
  • 데이터 마이닝에서 데이터를 효율적으로 분류하고자 할 때 많이 사용하고 있는 알고리즘을 실제 자료에 적용시켜 분류성능을 비교하였다. 분류자 생성기법으로는 의사결정나무기법 중의 하나인 CART, 배깅과 부스팅 알고리즘을 CART 모형에 결합한 분류자, 그리고 SVM 분류자를 비교하였다. CART는 결과 해석이 쉬운 장점을 가지고 있지만 데이터에 따라 생성된 분류자가 다양하여 불안정하다는 단점을 가지고 있다. 따라서 이러한 CART의 단점을 보완한 배깅 또는 부스팅 알고리즘과의 결합을 통해 분류자를 생성하고 그 성능에 대해 평가하였다. 또한 최근 들어 분류성능을 인정받고 있는 SVM의 분류성능과도 비교?평가하였다. 각 기법에 의한 분류 결과를 가지고 의사결정나무를 형성하여 자료가 가지는 데이터의 특성에 따른 분류 성능을 알아보았다. 그 결과 데이터의 결측치가 없고 관측값의 수가 적은 경우는 SVM의 분류성능이 뛰어남을 알 수 있었고, 관측값의 수가 많을 때에는 부스팅 알고리즘의 분류성능이 뛰어났으며, 데이터의 결측치가 존재하는 경우는 배깅의 분류성능이 뛰어남을 알 수 있었다.

다중 분류기 시스템을 이용한 자동 문서 분류 (Automatic Document Classification Using Multiple Classifier Systems)

  • 김인철
    • 정보처리학회논문지B
    • /
    • 제11B권5호
    • /
    • pp.545-554
    • /
    • 2004
  • 단일 분류기에 비해 높은 분류성능을 얻기 위해 다수의 분류기들을 결합하여 사용하는 방법은 폭넓게 이용되어 온 기술이다. 하나의 다중 분류기 시스템을 구성하는 일은 다음 두 가지 문제들을 가지고 있다. 첫째는 어떻게 기반 분류기들을 생성하느냐 하는 것이고 둘째는 이들의 예측결과를 어떻게 결합하느냐 하는 것이다. 본 논문에서는 Bagging, Boosting, Stacking 등 기존의 대표적인 다중 분류기 시스템들의 특징을 살펴보고, 문서 분류를 위한 새로운 다중 분류기 시스템들인 Stacked Bagging, Stacked Boosting, Bagged Stacking, Boosted Stacking들을 제안한다. 이들은 Bagging, Boosting, Stacking과 같은 기존 다중 분류기 시스템들의 장점들을 결합한 일종의 혼합형 다중 분류기 시스템들이다. 본 논문에서는 제안된 다중 분류기 시스템들의 성능을 평가하기 위해 MEDLINE, 유즈넷 뉴스, 웹 문서 등의 문서집합을 이용한 문서 분류 실험들을 전개하였다. 그리고 이러한 실험결과를 통해 제안한 혼합형 다중 분류기 시스템들은 전반적으로 기존 시스템들보다 우수한 성능을 보이는 것으로 나타났다.