1 |
Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet classification with deep convolutional neural networks. Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1 (pp. 1097-1105).
|
2 |
Sainath, T., Kingsbury, B., Saon, G., Soltau, H., Mohamed, A., Dahl, G., & Ramabhadran, B. (2015). Deep convolutional neural networks for large-scale speech tasks. Neural Networks, 64, 39-48.
|
3 |
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
|
4 |
Srivastava, R., Greff, K., & Schmidhuber, J. (2015). Training very deep networks. Advances in Neural Information Processing Systems 28 (pp. 2377-2385).
|
5 |
Huang, G., Liu, Z., van der Maaten, L., & Weinberger, K. (2017). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4700-4708).
|
6 |
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A., & Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211-252.
|
7 |
Park, S., Jeong, Y., & Kim, H. (2017). Multiresolution CNN for reverberant speech recognition. Proceedings of the Conference of The Oriental Chapter of International Committee for Coordination and Standardization of Speech Databases and Assessment Techniques.
|
8 |
Robinson, T., Fransen, J., Pye, D., Foote, J., & Renals, S. (1995). WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition. 1995 International Conference on Acoustics, Speech, and Signal Processing (pp. 81-84). Detroit, MI.
|
9 |
Lincoln, M., McCowan, I., Vepa, J., & Maganti, H. (2005). The multi-channel Wall Street Journal audio visual corpus (MC-WSJ-AV): Specification and initial experiments. Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (pp. 357-362). San Juan.
|
10 |
Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovsky, J., Stemmer, G., & Vesely, K. (2011). The Kaldi speech recognition toolkit. Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2011) (p. 4). Hawaii. 11-15 December, 2011.
|
11 |
Yu, D., Yao, K., & Zhang, Y. (2015). The computational network toolkit. IEEE Signal Processing Magazine, 32(6), 123-126.
|
12 |
Qian, Y., Bi, M., Tan, T., & Yu, K. (2016). Very deep convolutional neural networks for noise robust speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(12), 2263-2276.
|