[KSCI] Korea Science Citation Index Service

The Effect of FIR Filtering and Spectral Tilt on Speech Recognition with MFCC

Lee, Chang-Young (Division of Information Systems Engineering, Dongseo University)

Publication Information

The Journal of the Korea institute of electronic communication sciences / v.5, no.4, 2010, pp. 363-371

Abstract

To improve feature-vector classification and thereby reduce the recognition error rate in speaker-independent speech recognition, we study the effect of applying a spectral tilt to the Fourier magnitude spectrum during MFCC extraction. In parallel, we investigate how FIR filtering of the speech signal affects recognition. The proposed methods are evaluated in two independent ways: by the Fisher discriminant objective function and by speech recognition tests using hidden Markov models with fuzzy vector quantization. In the experiments, an appropriate choice of the tilt factor yields a relative improvement of about 10% in the recognition error rate over the conventional method.
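
The abstract does not give the exact filter, tilt function, or form of the Fisher objective, so the following Python sketch only illustrates the three ingredients under stated assumptions. The first-order FIR filter and its coefficient, the linear frequency weighting used as the "tilt", and the trace-ratio form of the Fisher criterion are all hypothetical choices made for illustration, not the paper's definitions.

```python
import numpy as np

def fir_preemphasis(signal, coeff=0.97):
    """First-order FIR filter y[n] = x[n] - coeff * x[n-1].
    The filter order and coeff are assumptions, not taken from the paper."""
    signal = np.asarray(signal, dtype=float)
    out = signal.copy()
    out[1:] -= coeff * signal[:-1]
    return out

def tilted_magnitude_spectrum(frame, tilt=0.0):
    """Fourier magnitude spectrum weighted by a linear ramp over frequency.
    The weight 1 + tilt * (k / (K - 1)) is a hypothetical tilt function;
    tilt=0 leaves the spectrum unchanged."""
    magnitude = np.abs(np.fft.rfft(np.asarray(frame, dtype=float)))
    ramp = np.linspace(0.0, 1.0, magnitude.size)
    return magnitude * (1.0 + tilt * ramp)

def fisher_objective(features, labels):
    """One common multiclass Fisher criterion, trace(pinv(S_w) @ S_b),
    where S_w and S_b are within- and between-class scatter matrices.
    Larger values indicate better class separation."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    dim = features.shape[1]
    s_w = np.zeros((dim, dim))
    s_b = np.zeros((dim, dim))
    for c in np.unique(labels):
        x = features[labels == c]
        class_mean = x.mean(axis=0)
        centered = x - class_mean
        s_w += centered.T @ centered
        diff = (class_mean - overall_mean)[:, None]
        s_b += len(x) * (diff @ diff.T)
    return float(np.trace(np.linalg.pinv(s_w) @ s_b))
```

In a setup like this, one would compute the tilted spectrum for each frame, pass it through the usual mel filter bank and cosine transform to obtain MFCC vectors, and then compare `fisher_objective` across candidate `tilt` values before running a full recognition test.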

Keywords

Speech Recognition; Spectral Tilt; MFCC; HMM; FVQ

