
Improving Generalization Performance of Neural Networks using Natural Pruning and Bayesian Selection  

Hyunjin Lee (Division of Computer and Information Communication, Korea Cyber University)
Hyeyoung Park (Laboratory for Mathematical Neuroscience, RIKEN, Japan)
Yillbyung Lee (Department of Computer Science, Yonsei University)
Abstract
The objective of neural network design and model selection is to construct an optimal network with good generalization performance. However, training data contain noise and the number of training samples is limited, so the empirical distribution differs from the true probability distribution. This difference causes the learning parameters to fit only the training data and to deviate from the true distribution of the data, a phenomenon called overfitting. An overfitted neural network approximates the training data well but gives poor predictions on new, untrained data, and the problem becomes more severe as the complexity of the network increases. In this paper, taking a statistical viewpoint, we propose an integrated process of neural network design and model selection to improve generalization performance. First, using natural gradient learning with adaptive regularization, we obtain optimal parameters with fast convergence that are not overfitted to the training data. Next, applying natural pruning to the obtained parameters, we generate several candidate network models of different sizes. Finally, we select an optimal model among the candidates based on the Bayesian Information Criterion (BIC). Through computer simulations on benchmark problems, we confirm the generalization and structure optimization performance of the proposed integrated process of learning and model selection.
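To make the three-stage process described above concrete, the following is a minimal Python/NumPy sketch, not the authors' implementation: the function names (natural_gradient_step, prune_by_saliency, bic_score), the diagonal Fisher approximation, the plain weight-decay stand-in for adaptive regularization, and the Gaussian-noise form of BIC are all illustrative assumptions.

import numpy as np

def natural_gradient_step(w, grad, fisher_diag, lr=0.1, weight_decay=1e-4):
    # One natural-gradient update with a weight-decay regularization term.
    # The true natural gradient premultiplies the ordinary gradient by the
    # inverse Fisher information matrix; a diagonal approximation keeps the
    # sketch short (an assumption, not the paper's adaptive method).
    reg_grad = grad + weight_decay * w
    return w - lr * reg_grad / (fisher_diag + 1e-8)

def prune_by_saliency(w, fisher_diag, keep_ratio):
    # Zero out the weights with the smallest saliency. Following OBD-style
    # pruning, the saliency of w_i is taken as F_ii * w_i^2, a second-order
    # estimate of the loss increase when the weight is removed.
    saliency = fisher_diag * w ** 2
    k = max(1, int(len(w) * keep_ratio))
    mask = np.zeros_like(w)
    mask[np.argsort(saliency)[-k:]] = 1.0
    return w * mask

def bic_score(residuals, n_params):
    # BIC for a Gaussian-noise regression model:
    #   BIC = n * ln(RSS / n) + k * ln(n)
    n = len(residuals)
    rss = float(np.sum(residuals ** 2))
    return n * np.log(rss / n) + n_params * np.log(n)

# Candidate models of different sizes come from pruning at several ratios;
# the candidate with the smallest BIC is selected.
rng = np.random.default_rng(0)
w = natural_gradient_step(rng.normal(size=20), rng.normal(size=20),
                          rng.uniform(0.1, 1.0, size=20))
for ratio in (1.0, 0.75, 0.5):
    pruned = prune_by_saliency(w, rng.uniform(0.1, 1.0, size=20), ratio)
    residuals = rng.normal(size=100)  # stand-in residuals for illustration
    print(ratio, bic_score(residuals, int(np.count_nonzero(pruned))))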
Keywords
Adaptive Regularization; Natural Pruning; Generalization; Neural Networks; Design; Model Selection;
References
1 Murphy, P. M., Aha, D. W., UCI Repository of Machine Learning Databases [Machine Readable Data Repository], Univ. of California, Dept. of Information and Computer Science, 1996
2 van der Laar, P., Heskes, T., Pruning Using Parameter and Neuronal Metrics, Neural Computation, 11, 977-993, 1999
3 Hansen, L. K., Pedersen, M. W., Controlled Growth of Cascade Correlation Nets, Proceedings of International Conference on Neural Networks, 797-800, 1994
4 Larsen, J., Svarer, C., Andersen, L. N., Hansen, L. K., Adaptive Regularization in Neural Network Modeling, Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science, 1524, Germany: Springer-Verlag, 113-132, 1998
5 Hintz-Madsen, M., Hansen, L. K., Larsen, J., Pedersen, M. W., Larsen, M., Neural classifier construction using regularization, pruning and test error estimation, Neural Networks, 11, 1659-1670, 1998
6 Lee, H., Jee, T., Park, H., Lee, Y., A Hybrid Approach to Complexity Optimization of Neural Networks, Proceedings of International Conference on Neural Information Processing, 3, 1455-1460, 2001
7 Park, H., Efficient On-line Learning Algorithms Based on Information Geometry for Stochastic Neural Networks, Ph.D. Dissertation, Yonsei University, 2000
8 Qi, M., Zhang, G. P., An investigation of model selection criteria for neural network time series forecasting, European Journal of Operational Research, 132, 666-680, 2001
9 Krogh, A., Hertz, J. A., A Simple Weight Decay Can Improve Generalization, Advances in Neural Information Processing Systems, 4, 950-957, 1992
10 Pedersen, M. W., Hansen, L. K., Larsen, J., Pruning with generalization based weight saliencies: λOBD, λOBS, Advances in Neural Information Processing Systems, 8, 521-527, 1996
11 Amari, S., Natural gradient works efficiently in learning, Neural Computation, 10(2), 251-276, 1998
12 Amari, S., Park, H., Fukumizu, K., Adaptive method of realizing natural gradient learning for multilayer perceptrons, Neural Computation, 12(6), 1399-1409, 2000
13 Andersen, T., Rimer, M., Martinez, T., Optimal Artificial Neural Network Architecture Selection for Bagging, Proceedings of International Joint Conference on Neural Networks, 2, 790-795, 2001
14 Bishop, C. M., Neural Networks for Pattern Recognition, Oxford University Press, 1995
15 Park, H., Practical Consideration on Generalization Property of Natural Gradient Learning, Lecture Notes in Computer Science, 2084, 402-409, 2001
16 Heskes, T., On Natural Learning and Pruning in Multilayered Perceptrons, Neural Computation, 12, 1037-1057, 2000
17 Haykin, S., Neural Networks: A Comprehensive Foundation, Second Edition, Prentice-Hall, 1999
18 Reed, R. D., Marks, R. J., Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks, MIT Press, 1999
19 Ripley, B., Pattern Recognition and Neural Networks, Cambridge: Cambridge University Press, 1996