References
- Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 10(1), 19-41. https://doi.org/10.1006/dspr.1999.0361
- Dehak, N., Kenny, P. J., Dehak, R., Dumouchel, P., & Ouellet, P. (2011). Front-end factor analysis for speaker verification. IEEE Transactions on Audio, Speech, and Language Processing, 19(4), 788-798. https://doi.org/10.1109/TASL.2010.2064307
- Prince, S. J., & Elder, J. H. (2007). Probabilistic linear discriminant analysis for inferences about identity. Proceedings of the 11th IEEE International Conference on Computer Vision. October, 2007.
- Garcia-Romero, D., Zhang, X., McCree, A., & Povey, D. (2014). Improving speaker recognition performance in the domain adaptation challenge using deep neural networks. Proceedings of Spoken Language Technology Workshop. December, 2014.
- Lei, Y., Scheffer, N., Ferrer, L., & McLaren, M. (2014). A novel scheme for speaker recognition using a phonetically-aware deep neural network. Proceedings of International Conference on Acoustics, Speech and Signal Processing. May, 2014.
- Hasan, T., Saeidi, R., Hansen, J. H., & van Leeuwen, D. A. (2013). Duration mismatch compensation for i-vector based speaker recognition systems. Proceedings of International Conference on Acoustics, Speech and Signal Processing. May, 2013.
- Kanagasundaram, A., Vogt, R. J., Dean, D. B., & Sridharan, S. (2012). PLDA based speaker recognition on short utterances. Proceedings of Odyssey: The Speaker and Language Recognition Workshop. June, 2012.
- Kanagasundaram, A., Dean, D., Sridharan, S., Gonzalez- Dominguez, J., Gonzalez-Rodriguez, J., & Ramos, D. (2014). Improving short utterance i-vector speaker verification using utterance variance modelling and compensation techniques. Speech Communication, 59, 69-82. https://doi.org/10.1016/j.specom.2014.01.004
- Kenny, P., Stafylakis, T., Ouellet, P., Alam, M. J., & Dumouchel, P. (2013). PLDA for speaker verification with utterances of arbitrary duration. Proceedings of International Conference on Acoustics, Speech and Signal Processing. May, 2013.
- Garcia-Romero, D., McCree, A., Shum, S., Brummer, N., & Vaquero, C. (2014). Unsupervised domain adaptation for i-vector speaker recognition. Proceedings of Odyssey: The Speaker and Language Recognition Workshop. June, 2014.
- Dehak, N., Dehak, R., Kenny, P., Brümmer, N., Ouellet, P., & Dumouchel, P. (2009). Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification. Proceedings of INTERSPEECH. September, 2009.
- Peddinti, V., Povey, D., & Khudanpur, S. (2015). A time delay neural network architecture for efficient modeling of long temporal contexts. Proceedings of INTERSPEECH. 2015.
- Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., & Lang, K. J. (1989). Phoneme recognition using time-delay neural networks. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(3), 328-339. https://doi.org/10.1109/29.21701
- Kenny, P., Ouellet, P., Dehak, N., Gupta, V., & Dumouchel, P. (2008). A study of interspeaker variability in speaker verification. IEEE Transactions on Audio, Speech, and Language Processing, 16(5), 980-988. https://doi.org/10.1109/TASL.2008.925147
- Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 10(1), 19-41. https://doi.org/10.1006/dspr.1999.0361
- Kenny, P., Gupta, V., Stafylakis, T., Ouellet, P., & Alam, J. (2014). Deep neural networks for extracting baum-welch statistics for speaker recognition. Proceedings of Odyssey: The Speaker and Language Recognition Workshop. June, 2014.
- Paul, D. B., & Baker, J. M. (1992). The design for the wall street journal based csr corpus. Proceedings of the workshop on Speech and Natural Language (pp. 357-362).
- Pitz, M., & Ney, H. (2005). Vocal tract normalization equals linear transformation in cepstral space. IEEE Transactions on Speech and Audio Processing, 13(5), 930-944. https://doi.org/10.1109/TSA.2005.848881
- Molau, S., Kanthak, S., & Ney, H. (2000). Efficient vocal tract normalization in automatic speech recognition. Proceedings of the ESSV'00. 2000.
- Jaitly, N., & Hinton, G. E. (2013). Vocal tract length perturbation (VTLP) improves speech recognition. Proceedings of ICML Workshop on Deep Learning for Audio, Speech and Language. June, 2013.
- Povey, D., Ghoshal, A., Boulianne, G., Burget, L., Glembek, O., Goel, N., Hannemann, M., Motlicek, P., Qian, Y., Schwarz, P., Silovsky, J., Stemmer, G., & vesely, K. (2011). The Kaldi speech recognition toolkit. Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding. 2011.
- Cieri, C., Miller, D., & Walker, K. (2004). The fisher corpus: resource for the next generations of speech-to-text. Language Resources and Evaluation Conference, 4, 69-71.
- Poddar, A., Sahidullah, M., & Saha, G. (2015). Performance comparison of speaker recognition systems in presence of duration variability. Proceedings of IEEE India Conference(INDICON). December, 2015.
- Kenny, P., Boulianne, G., Ouellet, P.& Dumouchel, P. (2007). Speaker and session variability in GMM-based speaker verification. IEEE Transactions on Audio, Speech and Language Processing, 15(4), 1448-1460. https://doi.org/10.1109/TASL.2007.894527
- National Institute of Standards and Technology. (2008). The NIS T year 2008 speaker recognition evaluation plan 2008. Retrieved from http://www.itl.nist.gov/iad/mig/tests/sre/2008/sre08_evalplan_release4.pdf on December 11, 2016.
- Snyder, D., Garcia-Romero, D., & Povey, D. (2015). Time delay deep neural network-based universal background models for speaker recognition. Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding. December, 2015.