1 |
A. Acero et al., "Live search for mobile: web services by voice on the cellphone," in Proceedings of Interspeech, Brisbane, Australia, pp. 5256-5259, 2008.
|
2 |
J. Jiang et al., "Automatic online evaluation of intelligent assistants," in Opportunities and Challenges for Next-Generation Applied Intelligence, Berlin, Germany: Springer, pp. 285-290, 2009.
|
3 |
S. Kim and J. Ahn, "Speech Recognition System in Car Noise Environment," The Journal of Digital Contents Society, Vol. 10, No. 1, pp. 121-127, Mar. 2009.
|
4 |
L. Rabiner and B. Juang, Fundamentals of Speech Recognition, 1st ed. Englewood Cliffs, NJ: Prentice Hall, 1993.
|
5 |
D. Su, X. Wu, and L. Xu, "GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, TX, pp. 4890-4893, 2010.
|
6 |
H. Hermansky, D. Ellis, and S. Sharma, "Tandem connectionist feature extraction for conventional HMM systems," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Istanbul, Turkey, pp. 1635-1638, 2000.
|
7 |
T. Mikolov and G. Zweig, "Context dependent recurrent neural network language model," Microsoft Research, Redmond, WA, Technical Report MSR-TR-2012-92, 2012.
|
8 |
A. Graves et al., "Hybrid speech recognition with deep bidirectional LSTM," in Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, Olomouc, Czech Republic, pp. 273-278, 2013.
|
9 |
G. Hinton et al., "Deep Neural Networks for Acoustic Modeling in Speech Recognition," IEEE Signal Processing Magazine, Vol. 29, No. 6, pp. 82-97, Nov. 2012.
|
10 |
L. Deng, G. Hinton, and B. Kingsbury, "New types of deep neural network learning for speech recognition and related applications: An overview," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vancouver, Canada, pp. 8599-8603, May 2013.
|
11 |
A. Graves and N. Jaitly, "Towards end-to-end speech recognition with recurrent neural networks," in Proceedings of the 31st International Conference on Machine Learning, Beijing, China, pp. 1764-1772, 2014.
|
12 |
A. Graves, "Supervised sequence labelling with recurrent neural networks," Ph.D. dissertation, Technische Universität München, München, Germany, 2008.
|
13 |
A. Graves et al., "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks," in Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, pp. 369-376, 2006.
|
14 |
S. Hochreiter and J. Schmidhuber, "Long Short-Term Memory," Neural Computation, Vol. 9, No. 8, pp. 1735-1780, Nov. 1997.
|
15 |
Y. Rao, A. Senior, and H. Sak, "Flat start training of CD-CTC-sMBR LSTM RNN acoustic models," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Shanghai, China, pp. 5405-5409, 2016.
|
16 |
H. Sak, A. Senior, and F. Beaufays, "Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition," arXiv:1402.1128, pp. 1-5, Feb. 2014.
|
17 |
M. Liwicki, A. Graves, H. Bunke, and J. Schmidhuber, "A novel approach to on-line handwriting recognition based on bidirectional long short-term memory networks," in Proceedings of the 9th International Conference on Document Analysis and Recognition, Curitiba, Brazil, pp. 367-371, 2007.
|
18 |
Y. Miao et al., "EESEN: End-to-end speech recognition using deep RNN models and WFST-based decoding," in Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, Scottsdale, AZ, pp. 167-174, 2015.
|