http://dx.doi.org/10.6109/jkiice.2014.18.6.1294

Speaker Recognition Performance Improvement by Voiced/Unvoiced Classification and Heterogeneous Feature Combination  

Kang, Jihoon (Department of Electronics Engineering, Gyeongsang National University)
Jeong, Sangbae (Department of Electronics Engineering, Gyeongsang National University)
Abstract
In this paper, separate probabilistic distribution models for voiced and unvoiced speech are estimated and used to improve speaker recognition performance. In addition to the conventional mel-frequency cepstral coefficients (MFCCs), skewness, kurtosis, and harmonic-to-noise ratio are extracted and used for voiced speech intervals. The two scores, one for voiced and one for unvoiced speech, are linearly fused with the optimal weight found by exhaustive search. The performance of the proposed speaker recognizer is compared with that of a conventional recognizer that uses MFCCs and a single, unified probability distribution function based on the Gaussian mixture model. Experimental results show that the fewer the Gaussian mixture components, the greater the performance improvement achieved by the proposed algorithm.
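The linear score fusion with an exhaustively searched weight can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the search grid, the selection criterion, or the score format, so the sketch assumes per-speaker log-likelihood scores, a grid of 101 weights on [0, 1], and identification accuracy as the criterion. The function names and the toy scores are hypothetical.

```python
def fuse(voiced, unvoiced, w):
    """Linearly fuse per-speaker scores: w * voiced + (1 - w) * unvoiced."""
    return [w * v + (1.0 - w) * u for v, u in zip(voiced, unvoiced)]


def accuracy(trials, w):
    """Fraction of trials whose highest fused score belongs to the true speaker."""
    correct = 0
    for voiced, unvoiced, true_idx in trials:
        fused = fuse(voiced, unvoiced, w)
        predicted = max(range(len(fused)), key=fused.__getitem__)
        if predicted == true_idx:
            correct += 1
    return correct / len(trials)


def search_weight(trials, steps=100):
    """Try w = 0, 1/steps, ..., 1 and keep the first weight with the best accuracy."""
    best_w, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        acc = accuracy(trials, w)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Toy development trials with made-up numbers (assumption, not data from the paper):
# (voiced scores per speaker, unvoiced scores per speaker, index of the true speaker)
trials = [
    ([-10.0, -12.0], [-8.0, -7.0], 0),
    ([-11.0, -9.0], [-6.0, -9.0], 1),
    ([-9.0, -9.6], [-7.0, -5.0], 0),
]

best_w, best_acc = search_weight(trials)
print(f"best weight = {best_w:.2f}, accuracy = {best_acc:.2f}")
```

In a setup like this, the weight would normally be tuned on development data kept separate from the test set, so that the reported accuracy is not optimistic.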
Keywords
speaker recognition; voiced/unvoiced classification; skewness; kurtosis; harmonic-to-noise ratio