Tuning the Architecture of Neural Networks for Multi-Class Classification

  • Received : 2012.10.09
  • Accepted : 2012.11.09
  • Published : 2013.03.31

Abstract

The purpose of this study is to demonstrate the validity of tuning the architecture of neural network models for multi-class classification. Such a model is typically constructed as a series of neural network models for binary classification. Before building a neural network model, we must set the values of parameters such as the number of hidden nodes and the weight decay parameter, and this step deserves special attention because model performance can differ considerably depending on those values. For better performance, it is therefore necessary to tune the parameters every time a neural network model is built. Nonetheless, previous studies have neither noted the necessity of this tuning process nor demonstrated its validity. In this study, we argue that the parameters should be tuned every time a neural network model for multi-class classification is built. Through an empirical analysis of wine data, we show that the model with tuned parameters outperforms untuned models.
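The tuning process described above can be sketched in code. This is a minimal illustration, not the authors' procedure: it uses scikit-learn's bundled wine dataset as a stand-in for the paper's wine data, wraps a multilayer perceptron in a one-vs-rest scheme to mirror the "series of binary classifiers" construction, and grid-searches the two parameters the abstract names, with `MLPClassifier`'s `alpha` standing in for the weight decay parameter. The specific candidate values are arbitrary choices for the example.

```python
# Sketch: tune number of hidden nodes and weight decay via cross-validated
# grid search, rebuilding the model for each candidate setting.
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)  # 3 wine classes, 13 features
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# One binary MLP per class (one-vs-rest), with inputs standardized.
pipe = make_pipeline(
    StandardScaler(),
    OneVsRestClassifier(MLPClassifier(max_iter=2000, random_state=0)),
)

# Candidate architectures: hidden-layer size and weight decay (alpha).
grid = {
    "onevsrestclassifier__estimator__hidden_layer_sizes": [(3,), (10,)],
    "onevsrestclassifier__estimator__alpha": [1e-3, 1e-1, 1.0],
}

search = GridSearchCV(pipe, grid, cv=5).fit(X_tr, y_tr)
print("best parameters:", search.best_params_)
print("held-out accuracy:", round(search.best_estimator_.score(X_te, y_te), 3))
```

The point of the sketch is the workflow, not the numbers: an untuned model fixes one `(hidden nodes, alpha)` pair in advance, whereas the grid search re-fits the model for every candidate pair and selects by cross-validated accuracy before evaluating on held-out data.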
