http://dx.doi.org/10.5351/CKSS.2010.17.4.561

A Study for Improving the Performance of Data Mining Using Ensemble Techniques  

Jung, Yon-Hae (Department of Statistics, Korea University)
Eo, Soo-Heang (Department of Statistics, Korea University)
Moon, Ho-Seok (Department of Computer and Information, Korea Military Academy)
Cho, Hyung-Jun (Department of Statistics, Korea University)
Publication Information
Communications for Statistical Applications and Methods / v.17, no.4, 2010, pp. 561-574
Abstract
We studied the performance of eight data mining algorithms, including decision trees, logistic regression, linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), neural networks, and support vector machines (SVM), together with their combinations with two ensemble techniques, bagging and boosting. In this study, we utilized 13 data sets with binary responses. Sensitivity, specificity, and misclassification error were used as the criteria for comparison.
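The comparison the abstract describes can be reproduced in outline. The following is a minimal sketch, not the authors' code: it pairs a single decision tree with its bagged and boosted versions and scores each by the paper's three criteria. scikit-learn and a synthetic data set are assumptions standing in for the study's eight algorithms and 13 real binary-response data sets.

```python
# Minimal sketch (not the authors' code): a base classifier versus its bagged
# and boosted versions on a binary-response data set, scored by sensitivity,
# specificity, and misclassification error. The synthetic data set is a
# stand-in for the study's 13 real data sets.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

base = DecisionTreeClassifier(random_state=0)
models = {
    "single tree": base,
    "bagging": BaggingClassifier(estimator=base, n_estimators=100, random_state=0),
    "boosting": AdaBoostClassifier(estimator=base, n_estimators=100, random_state=0),
}

for name, model in models.items():
    y_hat = model.fit(X_tr, y_tr).predict(X_te)
    tn, fp, fn, tp = confusion_matrix(y_te, y_hat).ravel()
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    error = (fp + fn) / len(y_te)  # misclassification error
    print(f"{name:12s} sens={sensitivity:.3f} spec={specificity:.3f} err={error:.3f}")
```

On data like this, both ensembles typically reduce the misclassification error of the single unstable tree, which is the kind of effect the study measures across its 13 data sets.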
Keywords
Ensemble; bagging; boosting; data mining