
Pruning the Boosting Ensemble of Decision Trees

  • Published: 2006.08.31

Abstract

We propose using variable selection methods based on penalized regression to prune decision tree ensembles. Pruning methods based on the LASSO and SCAD penalties are compared with the cluster pruning method in comparative studies on artificial and real datasets. The results show that the proposed penalized-regression methods reduce the size of boosting ensembles without significantly decreasing accuracy, and that they outperform the cluster pruning method. The proposed pruning methods also mitigate, to some degree, AdaBoost's known sensitivity to classification noise.
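The idea can be illustrated with a short sketch: fit a boosting ensemble, treat each base tree's signed prediction as a covariate, and run a penalized regression of the response on those covariates; trees whose coefficients are shrunk to zero are pruned. The code below is a minimal illustration of the LASSO variant only (SCAD would require a specialized solver), using scikit-learn and AdaBoost with CART-style trees; the dataset, tree depth, and other parameter values are illustrative assumptions, not the paper's experimental settings.

```python
# Minimal sketch of LASSO-based pruning of a boosting ensemble.
# Assumes scikit-learn >= 1.2 (the constructor argument is 'base_estimator'
# in older versions). All data and parameter choices are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 1. Fit the full boosting ensemble of decision trees.
boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=3),
    n_estimators=100,
    random_state=0,
)
boost.fit(X_tr, y_tr)

# 2. Collect each base tree's {-1, +1} prediction as a regression covariate.
def base_predictions(ensemble, X):
    return np.column_stack([2 * t.predict(X) - 1 for t in ensemble.estimators_])

H_tr = base_predictions(boost, X_tr)

# 3. Penalized (L1) regression of the signed response on the tree outputs;
#    trees whose coefficients are shrunk exactly to zero are pruned.
lasso = LassoCV(cv=5, random_state=0).fit(H_tr, 2 * y_tr - 1)
kept = np.flatnonzero(lasso.coef_)
print(f"kept {kept.size} of {len(boost.estimators_)} trees")

# 4. Predict with the pruned, re-weighted ensemble.
H_te = base_predictions(boost, X_te)
score = H_te[:, kept] @ lasso.coef_[kept] + lasso.intercept_
y_hat = (score > 0).astype(int)
print("pruned-ensemble accuracy:", np.mean(y_hat == y_te))
```

The LASSO both selects the subset of trees and re-weights the survivors, so the pruned ensemble is a sparse weighted vote rather than a truncation of the original AdaBoost weights.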


References

  1. Breiman, L. (1996). Bagging predictors. Machine Learning, Vol. 24, 123-140
  2. Breiman, L. (1998). Arcing classifiers (with discussion). Annals of Statistics, Vol. 26, 801-849 https://doi.org/10.1214/aos/1024691079
  3. Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and Regression Trees, Chapman and Hall, New York
  4. Dietterich, T.G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Machine Learning, Vol. 40, 139-157 https://doi.org/10.1023/A:1007607513941
  5. Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, Vol. 96, 1348-1360 https://doi.org/10.1198/016214501753382273
  6. Freund, Y. and Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, Vol. 55, 119-139 https://doi.org/10.1006/jcss.1997.1504
  7. Friedman, J. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, Vol. 29, 1189-1232
  8. Hastie, T., Tibshirani, R. and Friedman, J.H. (2001). The Elements of Statistical Learning. Springer-Verlag, New York
  9. Heskes, T. (1997). Balancing between bagging and bumping. In Mozer, M., Jordan, M. and Petsche, T., editors, Advances in Neural Information Processing Systems, Morgan Kaufmann
  10. Lazarevic, A. and Obradovic, Z. (2001). The effective pruning of neural network ensembles. Proceedings of the 2001 IEEE/INNS International Joint Conference on Neural Networks, 796-801
  11. Margineantu, D.D. and Dietterich, T.G. (1997). Pruning adaptive boosting. Proceedings of the 14th International Conference on Machine Learning, 211-218
  12. Mason, L., Baxter, J., Bartlett, P.L. and Frean, M. (2000). Functional gradient techniques for combining hypotheses. In A. J. Smola, P. L. Bartlett, B. Scholkopf and D. Schuurmans, editors, Advances in Large Margin Classifiers, Cambridge, MA: MIT Press
  13. Merz, C.J. and Murphy, P.M. (1998). UCI Repository of Machine Learning Databases. Available at http://www.ics.uci.edu/~mlearn/MLRepository.html
  14. Quinlan, J.R. (1993). C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA
  15. Quinlan, J.R. (1996). Bagging, boosting, and C4.5. Proceedings of the 13th National Conference on Artificial Intelligence, 725-730
  16. Rosset, S., Zhu, J. and Hastie, T. (2004). Boosting as a regularized path to a maximum margin classifier. Journal of Machine Learning Research, Vol. 5, 941-973
  17. Tamon, C. and Xiang, J. (2000). On the boosting pruning problem. Proceedings of the 11th European Conference on Machine Learning, Lecture Notes in Computer Science, Vol. 1810, 404-412
  18. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, Vol. 58, 267-288
  19. Tibshirani, R. and Knight, K. (1999). Model selection and inference by bootstrap 'bumping'. Journal of Computational and Graphical Statistics, Vol. 8, 671-686 https://doi.org/10.2307/1390820