http://dx.doi.org/10.7465/jkdi.2014.25.6.1371

Comparison of ensemble pruning methods using Lasso-bagging and WAVE-bagging  

Kwak, Seungwoo (Department of Applied Statistics, Yonsei University)
Kim, Hyunjoong (Department of Applied Statistics, Yonsei University)
Publication Information
Journal of the Korean Data and Information Science Society, v.25, no.6, 2014, pp. 1371-1383
Abstract
A classification ensemble is a technique that combines diverse classifiers to improve classification accuracy. An ensemble is known to be successful when the classifiers participating in it are both accurate and diverse. In practice, however, an ensemble often contains less accurate and mutually similar classifiers alongside accurate and diverse ones. Ensemble pruning methods were developed to construct an ensemble by selecting only the accurate and diverse classifiers. In this article, we propose an ensemble pruning method called WAVE-bagging and compare its results with those of an existing pruning method called Lasso-bagging. An extensive empirical comparison on 26 real datasets shows that WAVE-bagging performs better than Lasso-bagging.
Keywords
Bagging; classification; data mining; ensemble; pruning
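The abstract describes both pruning approaches only at a high level. The following is a minimal sketch of the two ideas for a binary problem, assuming scikit-learn is available: an L1-selection step in the spirit of Chen and Jin (2010) for Lasso-bagging, and an instance/classifier reweighting loop loosely following the weight-adjusted voting of Kim et al. (2011) that underlies WAVE-bagging. The parameter values and helper names (the number of trees, alpha, weighted_vote) are illustrative choices, not taken from the paper.

```python
# Hedged sketch of the two pruning/weighting ideas compared in the paper.
# Assumptions: binary labels, scikit-learn available; alpha, the tree count,
# and helper names are illustrative, not values from the paper itself.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Bagging: grow each tree on a bootstrap sample of the training set.
rng = np.random.default_rng(0)
trees = []
for _ in range(50):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    trees.append(DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx]))

P = np.column_stack([t.predict(X_val) for t in trees])  # n_val x n_trees

# Lasso-bagging (after Chen and Jin, 2010): L1-penalized regression of the
# true labels on the member predictions; members whose coefficient shrinks
# to zero are pruned from the ensemble.
coef = Lasso(alpha=0.01, positive=True).fit(P, y_val).coef_
pruned = [t for t, c in zip(trees, coef) if c > 0]
print(f"Lasso-bagging keeps {len(pruned)} of {len(trees)} trees")

# Weight-adjusted voting (loosely after Kim et al., 2011): alternate between
# instance weights that emphasize hard cases and classifier weights that
# reward accuracy on those hard cases, then vote with the weights.
C = (P == y_val[:, None]).astype(float)   # 1 where tree j got case i right
q = np.ones(C.shape[1]) / C.shape[1]      # initial classifier weights
for _ in range(100):
    p = (1.0 - C) @ q                     # put weight on hard instances
    p /= p.sum()
    q = C.T @ p                           # reward accuracy on hard cases
    q /= q.sum()

def weighted_vote(ensemble, weights, X_new):
    votes = np.column_stack([t.predict(X_new) for t in ensemble])
    return (votes @ weights / weights.sum() >= 0.5).astype(int)
```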
References
1 Asuncion, A. and Newman, D. J. (2007). UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences, http://archive.ics.uci.edu/ml.
2 Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123-140.
3 Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.
4 Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and regression trees, Chapman and Hall, New York.
5 Chen, K. and Jin, Y. (2010). An ensemble learning algorithm based on lasso selection. IEEE International Conference on Intelligent Computing and Intelligent Systems (ICIS), 1, 617-620.
6 Dietterich, T. G. (2000). Ensemble methods in machine learning, Springer, Berlin.
7 Freund, Y. and Schapire, R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55, 119-139.
8 Heinz, G., Peterson, L. J., Johnson, R. W. and Kerk, C. J. (2003). Exploring relationships in body dimensions. Journal of Statistics Education, 11, http://www.amstat.org/publications/jse/v11n2/datasets.heinz.html.
9 Kim, A., Kim, J. and Kim, H. (2012). The guideline for choosing the right-size of tree for boosting algorithm. Journal of the Korean Data & Information Science Society, 23, 949-959.
10 Kim, H., Kim, H., Moon, H. and Ahn, H. (2011). A weight-adjusted voting algorithm for ensemble of classifiers. Journal of the Korean Statistical Society, 40, 437-449.
11 Kuncheva, L. (2004). Combining pattern classifiers: Methods and algorithms, Wiley, New Jersey.
12 Kuncheva, L. (2005). Diversity in multiple classifier systems. Information Fusion, 6, 3-4.
13 Loh, W.-Y. (2009). Improving the precision of classification trees. The Annals of Applied Statistics, 3, 1710-1737.
14 Rokach, L. (2009). Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography. Computational Statistics and Data Analysis, 53, 4046-4072.
15 Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society B, 58, 267-288.
16 Zhou, Z. H. and Tang, W. (2003). Selective ensemble of decision trees. Lecture Notes in Computer Science, 2639, 476-483.