http://dx.doi.org/10.5762/KAIS.2020.21.6.679

Performance Improvement of Speaker Recognition by MCE-based Score Combination of Multiple Feature Parameters  

Kang, Ji Hoon (Defense Agency for Technology and Quality)
Kim, Bo Ram (Defense Agency for Technology and Quality)
Kim, Kyu Young (Defense Agency for Technology and Quality)
Lee, Sang Hoon (Defense Agency for Technology and Quality)
Publication Information
Journal of the Korea Academia-Industrial cooperation Society / v.21, no.6, 2020, pp. 679-686
Abstract
This paper proposes an enhanced method for extracting features from vocal source signals, together with a score combination method that uses MCE-based weight estimation over the scores of multiple feature vectors, to improve the performance of speaker recognition systems. The proposed feature vector consists of perceptual linear predictive cepstral coefficients, skewness, and kurtosis extracted from glottal flow signals that are lowpass filtered to eliminate the flat spectrum region, which carries no meaningful information. The proposed feature was used to improve a conventional speaker recognition system that uses Gaussian mixture models with mel-frequency cepstral coefficients and perceptual linear predictive cepstral coefficients extracted from the speech signals. In addition, to increase the reliability of the estimated scores, instead of estimating the weights from the probability distribution of the conventional scores, the scores obtained from the conventional vocal tract features and from the proposed feature are fused by the MCE-based score combination method to find the optimal speaker. The experimental results showed that the proposed feature vectors contain valid information for recognizing the speaker. In addition, when speaker recognition was performed by combining the scores of multiple feature parameters with MCE-based weights, the recognition system outperformed the conventional one, particularly when the number of Gaussian mixtures was small.
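The score combination step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the fusion formula, so a weighted sum of per-stream scores is assumed, the z-normalization of each stream is an added assumption, and the weights are hypothetical fixed values standing in for the MCE-estimated ones. The score values are made up for illustration.

```python
import math


def normalize_scores(scores):
    """Z-normalize one feature stream's per-speaker scores (assumed step)."""
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = math.sqrt(var) or 1.0  # avoid division by zero for constant scores
    return [(s - mean) / std for s in scores]


def fuse_scores(stream_scores, weights):
    """Weighted sum of normalized per-speaker scores across feature streams."""
    n_speakers = len(next(iter(stream_scores.values())))
    fused = [0.0] * n_speakers
    for name, scores in stream_scores.items():
        for i, z in enumerate(normalize_scores(scores)):
            fused[i] += weights[name] * z
    return fused


def identify(stream_scores, weights):
    """Return the index of the speaker with the highest fused score."""
    fused = fuse_scores(stream_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical average GMM log-likelihoods for three enrolled speakers,
# one list per feature stream (illustrative values, not from the paper).
scores = {
    "mfcc": [-41.2, -39.8, -40.1],
    "plp": [-38.5, -37.9, -37.6],
    "glottal": [-22.0, -23.5, -20.9],
}

# Hypothetical weights; in the paper these are estimated by MCE training.
weights = {"mfcc": 0.4, "plp": 0.4, "glottal": 0.2}

best = identify(scores, weights)
print("identified speaker index:", best)
```

With these illustrative numbers the MFCC stream alone favors a different speaker than the fused score does, which is the kind of disagreement that stream weighting is meant to resolve.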
Keywords
Speaker Recognition; GMM; Glottal Flow; MCE; Score Combination;