
A study for improving data mining methods for continuous response variables  

Choi, Jin-Soo (Department of Statistics, Korea University)
Lee, Seok-Hyung (Department of Statistics, Korea University)
Cho, Hyung-Jun (Department of Statistics, Korea University)
Publication Information
Journal of the Korean Data and Information Science Society / v.21, no.5, 2010, pp. 917-926
Abstract
It is known that bagging and boosting improve predictive performance in classification problems. Many researchers have demonstrated the high performance of bagging and boosting empirically for categorical responses, but not for continuous responses. We study whether bagging and boosting also improve data mining methods for continuous responses, such as linear regression, decision trees, and neural networks. The analysis of eight real data sets demonstrates the high performance of bagging and boosting empirically.
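The comparison the abstract describes can be sketched as follows, assuming scikit-learn (the paper does not name its software, and the data here are synthetic rather than the eight real data sets): a single regression tree versus its bagged and boosted ensembles, scored by test mean squared error on a continuous response.

```python
# Sketch: bagging and boosting a regression tree on a continuous response.
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor, BaggingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for a real data set.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {
    "single tree": DecisionTreeRegressor(max_depth=4, random_state=0),
    # Bagging: average 50 trees fit on bootstrap resamples (variance reduction).
    "bagging": BaggingRegressor(
        DecisionTreeRegressor(max_depth=4), n_estimators=50, random_state=0
    ),
    # Boosting: fit trees sequentially, reweighting hard-to-predict observations.
    "boosting": AdaBoostRegressor(
        DecisionTreeRegressor(max_depth=4), n_estimators=50, random_state=0
    ),
}

mse = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    mse[name] = mean_squared_error(y_te, model.predict(X_te))
    print(f"{name}: test MSE = {mse[name]:.1f}")
```

A lower test MSE for the ensembles than for the single tree is the pattern the paper reports for its real data sets.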
Keywords
Bagging; boosting; decision tree; ensemble
Citations & Related Records
Times Cited By KSCI: 3
1 Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123-140.
2 Cortez, P. and Morais, A. (2007). A data mining approach to predict forest fires using meteorological data. In Neves, J. M., Santos, M. F. and Machado, J. M. (eds.), New Trends in Artificial Intelligence: Proceedings of the 13th EPIA 2007 - Portuguese Conference on Artificial Intelligence, December, Guimaraes, Portugal, 512-523.
3 석경하 and 류태욱 (2002). The efficiency of boosting on SVM. Journal of the Korean Data and Information Science Society, 13, 55-64.
4 이상복 (2001). Evaluation of predictive models fitted by data mining techniques: Comparing the misclassification rates and training times of four classification models. Journal of the Korean Data and Information Science Society, 12, 113-124.
5 조영준 and 이용구 (2004). A study on methods for optimizing initial values in single-layer perceptron models. Journal of the Korean Data and Information Science Society, 15, 331-337.
6 Loh, W. Y. (2002). Regression trees with unbiased variable selection and interaction detection. Statistica Sinica, 12, 361-386.
7 박희창 and 조광현 (2004). Modeling of environmental survey data with the decision tree technique. Journal of the Korean Data and Information Science Society, 15, 759-771.
8 Quinlan, J. R. (1993). C4.5: Programs for machine learning, San Mateo, CA: Morgan Kaufmann.
9 Shrestha, D. L. and Solomatine, D. P. (2004). AdaBoost.RT: A boosting algorithm for regression problems. In Proceedings of the International Joint Conference on Neural Networks, Budapest, Hungary.
10 Yeh, I.-C. (1999). Design of high-performance concrete mixture using neural networks and nonlinear programming. Journal of Computing in Civil Engineering, 13, 36-42.
11 Dietterich, T. G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting and randomization. Machine Learning, 40, 139-158.
12 Efron, B. and Tibshirani, R. J. (1994). Nonparametric regression and generalized linear models, New York: Chapman and Hall.
13 Berndt, E. (1991). The practice of econometrics: Classic and contemporary, Reading, MA: Addison-Wesley.
14 Ein-Dor, P. and Feldmesser, J. (1987). Attributes of the performance of central processing units: A relative performance prediction model. Communications of the ACM, 30, 308-317.
15 Freund, Y. and Schapire, R. E. (1996). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, 148-156. San Francisco: Morgan Kaufmann.
16 Harrison, D. and Rubinfeld, D. L. (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics & Management, 5, 81-102.
17 Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984). Classification and regression trees, New York: Chapman and Hall.