http://dx.doi.org/10.6109/jkiice.2014.18.10.2367

Speech Enhancement Based on Feature Compensation for Independently Applying to Different Types of Speech Recognition Systems  

Kim, Wooil (School of Computer Science & Engineering, Incheon National University)
Abstract
This paper proposes a speech enhancement method that can be applied independently to different types of speech recognition systems. Feature compensation methods are well known to be effective as front-end algorithms for robust speech recognition in noisy environments. To be effective, however, the feature type and speech model used by a feature compensation method must match those of the speech recognition system. Consequently, such methods cannot be applied successfully to a speech recognition system whose specification is "unknown", such as a commercial speech recognition engine. To address this limitation, the proposed speech enhancement method builds on the PCGMM-based feature compensation method. Experimental results show that the proposed method significantly outperforms conventional front-end algorithms when the speech recognition system is unknown, across various background noise conditions.
Keywords
Speech enhancement; Feature compensation; Speech recognition; Noisy environment; Unknown system