Initialization by using truncated distributions in artificial neural network
Kim, MinJong
(Department of Applied Statistics, Chung-Ang University)
Cho, Sungchul
(Department of Applied Statistics, Chung-Ang University)
Jeong, Hyerin
(Department of Applied Statistics, Chung-Ang University)
Lee, YungSeop
(Department of Statistics, Dongguk University)
Lim, Changwon
(Department of Applied Statistics, Chung-Ang University)
1. He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1026-1034).
2. Humbird, K. D., Peterson, J. L., and McClarren, R. G. (2018). Deep neural network initialization with decision trees. IEEE Transactions on Neural Networks and Learning Systems, 30, 1286-1295.
3. Hayou, S., Doucet, A., and Rousseau, J. (2018). On the selection of initialization and activation function for deep neural networks. arXiv preprint arXiv:1805.08266.
4. Krahenbuhl, P., Doersch, C., Donahue, J., and Darrell, T. (2015). Data-dependent initializations of convolutional neural networks. arXiv preprint arXiv:1511.06856.
5. Krizhevsky, A. and Hinton, G. (2009). Learning Multiple Layers of Features from Tiny Images (Vol. 1, No. 4, p. 7). Technical report, University of Toronto.
6. LeCun, Y., Bottou, L., Orr, G., and Muller, K. (1998a). Efficient backprop. In Neural Networks: Tricks of the Trade (Orr, G. and Muller, K., eds.), Lecture Notes in Computer Science, 1524(98), 111.
7. LeCun, Y., Cortes, C., and Burges, C. J. (1998b). The MNIST Database of Handwritten Digits.
8. Mishkin, D. and Matas, J. (2015). All you need is a good init. arXiv preprint arXiv:1511.06422.
9. Sutskever, I., Martens, J., Dahl, G., and Hinton, G. (2013). On the importance of initialization and momentum in deep learning. In International Conference on Machine Learning (pp. 1139-1147).
10. Clevert, D. A., Unterthiner, T., and Hochreiter, S. (2015). Fast and accurate deep network learning by exponential linear units (ELUs). arXiv preprint arXiv:1511.07289.
11. Glorot, X. and Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (pp. 249-256).
12. Goodfellow, I. J., Vinyals, O., and Saxe, A. M. (2014). Qualitatively characterizing neural network optimization problems. arXiv preprint arXiv:1412.6544.
13. Hanin, B. and Rolnick, D. (2018). How to start training: The effect of initialization and architecture. In Advances in Neural Information Processing Systems (pp. 571-581).