http://dx.doi.org/10.6109/jkiice.2017.21.6.1149

Speaker Independent Recognition Algorithm based on Parameter Extraction by MFCC applied Wiener Filter Method  

Choi, Jae-Seung (Division of Smart Electrical and Electronic Engineering, Silla University)
Abstract
For a speech recognition system to perform well under background noise, it is very important to select appropriate speech feature parameters. The feature parameter used in this paper is the Mel frequency cepstral coefficient (MFCC), which reflects human auditory characteristics, combined with the Wiener filter method. That is, the proposed approach extracts the feature parameter from the clean speech signal obtained after the background noise has been removed. Speaker recognition is then carried out by feeding the modified MFCC feature parameter into a multi-layer perceptron network. In the experiments, speaker-independent recognition was performed using MFCC feature parameters of the 14th order. For noisy speech with added white noise, the average speaker-independent recognition rate was 94.48%, which indicates that the method is effective. Compared with existing methods, the proposed speaker recognition method achieves better performance by using the modified MFCC feature parameter.
Keywords
Mel frequency cepstral coefficient; Wiener filter; speaker recognition; speaker independent; feature parameter
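
The pipeline summarized in the abstract (noise removal with a Wiener filter, followed by extraction of 14th-order MFCCs) can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not give the filter design, so the sampling rate, frame length, number of mel filters, gain floor, the noise estimate taken from the first few frames, and all function names below are assumptions chosen only for illustration. The multi-layer perceptron stage is omitted.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split x into overlapping Hamming-windowed frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def wiener_gain(power, noise_psd, floor=0.05):
    """Wiener gain SNR / (1 + SNR), with SNR estimated by power subtraction."""
    snr = np.maximum(power / noise_psd - 1.0, 0.0)
    return np.maximum(snr / (1.0 + snr), floor)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):
            fb[i, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            fb[i, k] = (hi - k) / (hi - mid)
    return fb

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def wiener_mfcc(x, sr=8000, n_fft=256, hop=128,
                n_filters=24, n_ceps=14, noise_frames=5):
    """Return one 14-coefficient MFCC vector per frame after Wiener filtering."""
    frames = frame_signal(x, n_fft, hop)
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2
    # Assumption: the first few frames contain noise only.
    noise_psd = power[:noise_frames].mean(axis=0) + 1e-12
    clean_power = wiener_gain(power, noise_psd) ** 2 * power
    mel_energy = clean_power @ mel_filterbank(n_filters, n_fft, sr).T
    return dct_ii(np.log(mel_energy + 1e-12), n_ceps)

# Synthetic example: a tone that starts after 0.1 s, plus white noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
tone = np.where(t >= 0.1, np.sin(2 * np.pi * 440.0 * t), 0.0)
noisy = tone + 0.1 * rng.standard_normal(t.size)
features = wiener_mfcc(noisy)
print(features.shape)  # (61, 14)
```

Each row of `features` is the kind of vector that would be presented to a classifier such as a multi-layer perceptron. The sketch keeps the zeroth cepstral coefficient among the 14; whether the paper does so is not stated in the abstract.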