1. Chan, W., Jaitly, N., Le, Q., & Vinyals, O. (2016, March). Listen, attend and spell: A neural network for large vocabulary conversational speech recognition. Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4960-4964). Shanghai, China.
2. Tjandra, A., Sakti, S., & Nakamura, S. (2017, December). Listening while speaking: Speech chain by deep learning. Proceedings of the 2017 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) (pp. 301-308). Okinawa, Japan.
3. Vesely, K., Hannemann, M., & Burget, L. (2013, December). Semi-supervised training of deep neural networks. Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) (pp. 267-272). Olomouc, Czech Republic.
4. Watanabe, S., Hori, T., Karita, S., Hayashi, T., Nishitoba, J., Unno, Y., Soplin, N. E. Y., ... Ochiai, T. (2018). ESPnet: End-to-end speech processing toolkit [Computing research repository]. Retrieved from http://arxiv.org/abs/1804.00015
5. Graves, A., Fernández, S., Gomez, F., & Schmidhuber, J. (2006, June). Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. Proceedings of the 23rd International Conference on Machine Learning (pp. 369-376). Pittsburgh, PA.
6. Graves, A., Mohamed, A. R., & Hinton, G. (2013, May). Speech recognition with deep recurrent neural networks. Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6645-6649). Vancouver, Canada.
7. Gulcehre, C., Firat, O., Xu, K., Cho, K., Barrault, L., Lin, H., Bougares, F., ... Bengio, Y. (2015, June). On using monolingual corpora in neural machine translation [Computing research repository]. Retrieved from https://arxiv.org/pdf/1503.03535.pdf
8. Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A. R., Jaitly, N., Senior, A., ... Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6), 82-97.
9. Karita, S., Watanabe, S., Iwata, T., Ogawa, A., & Delcroix, M. (2018, September). Semi-supervised end-to-end speech recognition. Proceedings of the 19th Annual Conference of the International Speech Communication Association (INTERSPEECH) (pp. 2-6). Hyderabad, India.
10. Miao, Y., Gowayyed, M., & Metze, F. (2015, October). EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding. Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) (pp. 167-174). Scottsdale, AZ.
11. Mikolov, T., Karafiát, M., Burget, L., Černocký, J., & Khudanpur, S. (2010, September). Recurrent neural network based language model. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH) (pp. 1045-1048). Makuhari, Japan.