http://dx.doi.org/10.7737/JKORMS.2013.38.1.139

Tuning the Architecture of Neural Networks for Multi-Class Classification  

Jeong, Chulwoo (Korea Institute for Defense Analyses)
Min, Jae H. (Graduate School of Business, Sogang University)
Abstract
The purpose of this study is to demonstrate the validity of tuning the architecture of neural network models for multi-class classification. A neural network model for multi-class classification is typically constructed by combining a series of neural network models for binary classification. Before building such a model, we must set the values of parameters such as the number of hidden nodes and the weight decay parameter, and these choices deserve special attention because the performance of the model can vary considerably with them. For better performance, it is essential to tune these parameters every time a neural network model is built. Nonetheless, previous studies have neither addressed the necessity of this tuning process nor demonstrated its validity. In this study, we argue that the parameters should be tuned each time a neural network model for multi-class classification is built. Through empirical analysis using wine data, we show that the model with tuned parameters outperforms untuned models.
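The tuning procedure the abstract describes can be sketched as a grid search over the two parameters it names: the number of hidden nodes and the weight decay parameter. A minimal sketch in Python, assuming scikit-learn's `MLPClassifier` and its bundled UCI wine data as a stand-in for the study's wine data; the model, grid values, and train/test split are illustrative assumptions, not the authors' exact setup:

```python
# Illustrative sketch (not the paper's exact method): tune the number of
# hidden nodes and the weight decay parameter (alpha in scikit-learn)
# by cross-validated grid search on scikit-learn's bundled UCI wine data.
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)  # 3 wine classes, 13 features
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, stratify=y, random_state=0)

# Scaling the inputs helps the network converge on this small dataset.
pipe = make_pipeline(StandardScaler(),
                     MLPClassifier(max_iter=2000, random_state=0))

# The grid values below are assumptions chosen for illustration only.
grid = {
    "mlpclassifier__hidden_layer_sizes": [(2,), (5,), (10,)],
    "mlpclassifier__alpha": [1e-4, 1e-2, 1.0],  # alpha = weight decay
}
search = GridSearchCV(pipe, grid, cv=5).fit(X_tr, y_tr)

print("best parameters:", search.best_params_)
print("test accuracy:", round(search.score(X_te, y_te), 3))
```

Re-running the search for each model mirrors the paper's claim that tuning must be repeated every time a network is built, since the best parameter values depend on the data at hand.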
Keywords
Multi-class Classification; Neural Networks; Model Architecture; Tuning Method;
References
1 Zhao, H., A. Sinha, and W. Ge, "Effects of feature construction on classification performance: An empirical study in bank failure prediction," Expert Systems with Applications, Vol.36, No.2(2009), pp.2633-2644.
2 Kim, D., S.H. Min, and I. Han, "Corporate credit evaluation using partitioned neural network models and case-based reasoning," Journal of Information Technology Applications and Management, Vol.14, No.2(2007), pp.151-168.
3 Altman, E.I., G. Marco, and F. Varetto, "Corporate distress diagnosis: Comparisons using discriminant analysis and neural networks," Journal of Banking and Finance, Vol.18, No.3(1994), pp.505-529.
4 Bartlett, P.L., "For valid generalization, the size of the weights is more important than the size of the network," in M.C. Mozer, M.I. Jordan and T. Petsche (Eds.), Advances in Neural Information Processing Systems, Vol.9, The MIT Press, Cambridge, MA, 1997.
5 Berry, M.J.A. and G.S. Linoff, Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Wiley, Indiana, 2004.
6 Bishop, C.M., Neural Networks for Pattern Recognition, Oxford University Press, New York, 1995.
7 Cheng, B. and D.M. Titterington, "Neural Networks: A Review from a Statistical Perspective," Statistical Science, Vol.9, No.1(1994), pp.2-30.
8 Cortez, P., A. Cerdeira, F. Almeida, T. Matos, and J. Reis, "Modeling wine preferences by data mining from physicochemical properties," Decision Support Systems, Vol.47, No.4(2009), pp.547-553.
9 Cybenko, G., "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals, and Systems, Vol.2, No.4(1989), pp.303-314.
10 De Veaux, R.D., J. Schumi, J. Schweinsberg, and L.H. Ungar, "Prediction intervals for neural networks via nonlinear regression," Technometrics, Vol.40, No.4(1998), pp.273-282.
11 De Villiers, J. and E. Barnard, "Backpropagation neural nets with one and two hidden layers," IEEE Transactions on Neural Networks, Vol.4, No.1(1993), pp.136-141.
12 Geman, S., E. Bienenstock, and R. Doursat, "Neural networks and the bias/variance dilemma," Neural Computation, Vol.4, No.1(1992), pp.1-58.
13 Hinton, G.E., "Learning translation invariant recognition in massively parallel networks," In J.W. de Bakker, A.J. Nijman and P.C. Treleaven (Eds.), Proceedings PARLE Conference on Parallel Architectures and Languages Europe, Springer-Verlag, Berlin, 1987.
14 Hush, D.R. and B.G. Horne, "Progress in Supervised Neural Networks," IEEE Signal Processing Magazine, Vol.10, No.1(1993), pp.8-39.
15 Jeong, C., J.H. Min, and M.S. Kim, "A tuning method for the architecture of neural network models incorporating GAM and GA as applied to bankruptcy prediction," Expert Systems with Applications, Vol.39, No.3(2012), pp.3650-3658.
16 Jo, H. and I. Han, "Integration of case-based forecasting, neural network, and discriminant analysis for bankruptcy prediction," Expert Systems with Applications, Vol.11, No.4(1996), pp.415-422.
17 Kim, J., H.R. Weistroffer, and R.T. Redmond, "Expert Systems for Bond Rating: A Comparative Analysis of Statistical, Rule-based and Neural Network Systems," Expert Systems, Vol.10, No.3(1993), pp.167-172.
18 Masters, T., Practical Neural Network Recipes in C++, Academic Press, Boston, 1993.
19 Ou, G. and Y.L. Murphey, "Multi-class pattern classification using neural networks," Pattern Recognition, Vol.40, No.1(2007), pp.4-18.
20 Singleton, J.C. and A.J. Surkan, "Neural Networks for Bond Rating Improved by Multiple Hidden Layers," Proceedings of the IEEE International Conference on Neural Networks, Vol.2(1990), pp.163-168.