Acknowledgement
Supported by: Youngsan University
References
- S. Geman, E. Bienenstock, and R. Doursat, "Neural networks and the bias/variance dilemma", Neural Computation, Vol. 4, No. 1, pp. 1-58, Jan. 1992. https://doi.org/10.1162/neco.1992.4.1.1
- S. E. Fahlman and C. Lebiere, "The Cascade-Correlation learning architecture", Advances in Neural Information Processing Systems 5, San Mateo, CA, Morgan Kaufmann Publishers Inc, 1993, pp. 524-532.
- Y. L. Cun, S. D. John, and S. A. Solla, "Second order derivatives for network pruning", Advances in Neural Information Processing Systems 5, San Mateo, CA, Morgan Kaufmann Publishers Inc, 1990, pp. 598-605.
- Y. L. Cun, S. D. John, and S. A. Solla, "Optimal brain damage", Advances in Neural Information Processing Systems 5, San Mateo, CA, Morgan Kaufmann Publishers Inc, 1993, pp. 164-171.
- S. J. Nowlan and G. E. Hinton, "Simplifying neural networks by soft weight-sharing", Neural Computation, Vol. 4, No. 4, pp. 473-493, July 1992. https://doi.org/10.1162/neco.1992.4.4.473
- A. Krogh and J. A. Hertz, "A simple weight decay can improve generalization", Advances in Neural Information Processing Systems 5, San Mateo, CA, Morgan Kaufmann Publishers Inc, 1993, pp. 950-957.
- A. S. Weigend, D. E. Rumelhart, and B. A. Huberman, "Generalization by weight-elimination with application to forecasting", Advances in Neural Information Processing Systems 5, San Mateo, CA, Morgan Kaufmann Publishers Inc, 1993, pp. 875-882.
- N. Morgan and H. Bourlard, "Generalization and parameter estimation in feedforward nets: Some experiments", Advances in Neural Information Processing Systems 5, San Mateo, CA, Morgan Kaufmann Publishers Inc, 1990, pp. 630-637.
- R. Reed, "Pruning algorithms - a survey", IEEE Transactions on Neural Networks, Vol. 4, No. 5, pp. 740-746, Sep. 1993. https://doi.org/10.1109/72.248452
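- Y. Nesterov, "A Method of Solving a Convex Programming Problem with Convergence Rate O(k^(-2))", 1983. http://www.cis.pku.edu.cn/faculty/vision/zlin/1983 A Method of Solving a Convex Programming Problem with Convergence Rate O(k^(-2))_Nesterov.pdf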
- B. Polyak, "Some methods of speeding up the convergence of iteration methods", USSR Computational Mathematics and Mathematical Physics, Vol. 4, Issue 5, pp. 1-17, 1964. https://doi.org/10.1016/0041-5553(64)90137-5
- Y. Nesterov, "Gradient Methods for Minimizing Composite Objective Function", http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.332.931&rep=rep1&type=pdf.
- G. Hinton, "Neural Networks for Machine Learning", http://goo.gl/RsQeis; video: https://goo.gl/XUbIyJ.
- D. P. Kingma and J. Ba, "Adam: A Method for Stochastic Optimization", https://arxiv.org/pdf/1412.6980.pdf.