
Pattern Selection Using the Bias and Variance of Ensemble  

Shin, Hyunjung (Department of Industrial Engineering, Seoul National University)
Cho, Sungzoon (Department of Industrial Engineering, Seoul National University)
Publication Information
Journal of Korean Institute of Industrial Engineers, v.28, no.1, 2002, pp. 112-127
Abstract
A useful pattern is one that contributes much to learning. In a classification problem, the patterns near the class boundary carry more information to the classifier; in a regression problem, those near the estimated surface do. In both cases, usefulness is defined only for patterns with no error or negligible error. Training with only the useful patterns brings several benefits. First, the memory and time complexity of learning is reduced. Second, overfitting is avoided even when the learner is over-sized. Third, learning yields more stable learners. In this paper, we propose a pattern 'utility index' that measures the utility of an individual pattern. The utility index is based on the bias and variance of each pattern's output over a network ensemble. In classification, a pattern with low bias and high variance receives a high score; in regression, one with low bias and low variance does. Based on the distribution of the utility index, the original training set is divided into a high-score group and a low-score group, and only the high-score group is used for training. The proposed method is tested on synthetic and real-world benchmark datasets, where it achieves better or at least comparable performance.
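For a concrete picture of the selection scheme sketched in the abstract, the following Python snippet estimates per-pattern bias and variance from an ensemble's predictions and returns the high-score group. This is a minimal sketch, not the paper's implementation: the array shapes, the `keep` fraction, and the linear score combinations (variance minus bias for classification, negated bias plus variance for regression) are illustrative assumptions that only preserve the scoring directions stated in the abstract.

```python
import numpy as np

def per_pattern_bias_variance(preds, y):
    """Bias and variance of each training pattern over an ensemble.

    preds : (M, N) array -- predictions of M ensemble members for N
            patterns (numeric outputs, e.g. regression values or the
            estimated posterior of the true class).
    y     : (N,) array of targets.
    """
    avg = preds.mean(axis=0)   # ensemble-average output per pattern
    bias = (avg - y) ** 2      # squared deviation from the target
    var = preds.var(axis=0)    # disagreement among ensemble members
    return bias, var

def high_score_group(preds, y, task="classification", keep=0.5):
    """Indices of the high-score group under an illustrative utility index.

    Scoring direction follows the abstract: classification favors low
    bias and HIGH variance (boundary patterns); regression favors low
    bias and LOW variance. The combinations below are stand-ins for
    the paper's actual index.
    """
    bias, var = per_pattern_bias_variance(preds, y)
    utility = (var - bias) if task == "classification" else -(bias + var)
    n_keep = int(keep * len(y))
    return np.argsort(utility)[::-1][:n_keep]  # highest-utility patterns
```

Training would then proceed on the selected subset only, e.g. `idx = high_score_group(preds, y); X_train, y_train = X[idx], y[idx]`.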
Keywords
pattern selection; ensemble network