Multiple Discriminative DNNs for I-Vector Based Open-Set Language Recognition |
Kang, Woo Hyun (Seoul National University Department of Electrical and Computer Engineering and Institute of New Media and Communications)
Cho, Won Ik (Seoul National University Department of Electrical and Computer Engineering and Institute of New Media and Communications)
Kang, Tae Gyoon (Seoul National University Department of Electrical and Computer Engineering and Institute of New Media and Communications)
Kim, Nam Soo (Seoul National University Department of Electrical and Computer Engineering and Institute of New Media and Communications) |
1 | G. Hinton, L. Deng, D. Yu, A. Mohamed, et al., "Deep neural networks for acoustic modeling in speech recognition," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 82-97, Nov. 2012. |
2 | K. H. Lee, S. J. Kang, W. H. Kang, N. S. Kim, and S. J. Yang, "DNN-based feature compensation using environmental parameter," in Proc. KICS ICC 2015, pp. 72-73, Gangwon, Korea, Jan. 2016. |
3 | Y. Lei, N. Scheffer, L. Ferrer, and M. McLaren, "A novel scheme for speaker recognition using a phonetically-aware deep neural network," in Proc. ICASSP 2014, pp. 1714-1718, Florence, Italy, May 2014. |
4 | J. Wang, D. Wang, T. F. Zheng, and F. Bie, "DNN-based discriminative scoring for speaker recognition based on i-vector," CSLT, Tech. Rep. 20150002, Jan. 2015. |
5 | O. Ghahabi and J. Hernando, "Deep belief networks for i-vector based speaker recognition," in Proc. ICASSP 2014, pp. 1700-1704, Florence, Italy, May 2014. |
6 | W. H. Kang, K. H. Lee, T. G. Kang, S. J. Kang, N. S. Kim, and K. J. Shin, "Speaker age regression using i-vectors trained with MFCC and pitch," in Proc. KICS ICC 2015, pp. 967-968, Jeju, Korea, Jun. 2015. |
7 | W. H. Kang, K. H. Lee, T. G. Kang, and N. S. Kim, "NN based speaker age classification using i-vectors," in Proc. KICS ICC 2015, pp. 589-590, Seoul, Korea, Nov. 2015. |
8 | I. Lopez-Moreno, J. Gonzalez-Dominguez, O. Plchot, D. Martinez, et al., "Automatic language identification using deep neural networks," in Proc. ICASSP 2014, pp. 5374-5378, Florence, Italy, May 2014. |
9 | C. Chang and C. Lin, "LIBSVM: a library for support vector machines," ACM TIST, vol. 2, no. 3, pp. 1-39, Apr. 2011. |
10 | W. M. Campbell, E. Singer, P. Torres-Carrasquillo, and D. A. Reynolds, "Language recognition with support vector machines," in Proc. Odyssey 2004, pp. 41-44, Toledo, Spain, May-Jun. 2004. |
11 | The 2015 Language Recognition i-Vector Machine Learning Challenge (2015), Retrieved Dec. 29, 2015, from http://www.nist.gov/itl/iad/mig/upload/lre_ivectorchallenge_rel_v2.pdf |
12 | N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Trans. Audio, Speech, Language Process., vol. 19, no. 4, pp. 788-798, May 2011. |
13 | A. O. Hatch, S. S. Kajarekar, and A. Stolcke, "Within-class covariance normalization for SVM-based speaker recognition," in Proc. Interspeech, pp. 2-5, 2006. |
14 | D. Reynolds, T. Quatieri, and R. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process., vol. 10, pp. 19-41, Jan. 2000. |
15 | R. Salakhutdinov, "Learning deep generative models," Ph.D. dissertation, University of Toronto, 2009. |
16 | N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, et al., "Dropout: a simple way to prevent neural networks from overfitting," JMLR, vol. 15, no. 1, pp. 1929-1958, Jun. 2014. |
17 | S. Furui, Speaker recognition (2008), Retrieved Jul. 12, 2016, from http://www.scholarpedia.org/article/Speaker_recognition