
Cepstral Distance and Log-Energy Based Silence Feature Normalization for Robust Speech Recognition  

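Shen, Guang-Hu (Department of Information and Communication Engineering, Yeungnam University)
Chung, Hyun-Yeol (Department of Information and Communication Engineering, Yeungnam University)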
Abstract
The mismatch between training and test environments is a major cause of performance degradation in noisy speech recognition, and many silence feature normalization methods have been proposed to reduce it. Conventional silence feature normalization classifies speech and silence well at high SNR, but its performance degrades at low SNR because speech/silence classification becomes inaccurate. Cepstral distance, on the other hand, captures the distinct distributions of speech and silence (or noise) well at low SNR. In this paper, we propose a Cepstral distance and Log-energy based Silence Feature Normalization (CLSFN) method, which uses both log-energy and cepstral Euclidean distance to classify speech and silence. Because the proposed method combines the advantage of log-energy, which is less affected by noise at high SNR, with the advantage of cepstral distance, which discriminates speech from silence accurately at low SNR, classification accuracy is expected to improve. Experimental results show that the proposed CLSFN improves recognition performance over the conventional SFN-I/II and CSFN methods across all types of noisy environments.
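
The abstract does not give the CLSFN decision rule, so the following is only a minimal sketch of the general idea of combining the two cues it names. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the use of the first few frames as a noise-only estimate, the threshold values, the rule that a frame counts as speech only when both cues agree, and the replacement of silence log-energy by a fixed floor.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two cepstral vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify_frames(log_energy, cepstra, noise_frames=3,
                    energy_margin=1.0, dist_threshold=1.0):
    """Label each frame True (speech) or False (silence).

    Assumption: the first `noise_frames` frames contain noise only and
    serve as the reference for both cues. The thresholds are arbitrary.
    """
    noise_energy = sum(log_energy[:noise_frames]) / noise_frames
    noise_cepstrum = mean_vector(cepstra[:noise_frames])
    labels = []
    for energy, cepstrum in zip(log_energy, cepstra):
        # Cue 1: log-energy rises clearly above the noise estimate.
        energy_vote = (energy - noise_energy) > energy_margin
        # Cue 2: cepstrum lies far from the mean noise cepstrum.
        distance_vote = euclidean(cepstrum, noise_cepstrum) > dist_threshold
        # Hypothetical combination rule: both cues must agree.
        labels.append(energy_vote and distance_vote)
    return labels


def normalize_silence(log_energy, labels, floor=0.0):
    """Replace the log-energy of silence frames with a fixed floor value
    (an illustrative stand-in for a silence normalization step)."""
    return [e if is_speech else floor
            for e, is_speech in zip(log_energy, labels)]


if __name__ == "__main__":
    energies = [1.0, 1.2, 0.8, 5.0, 6.0, 1.1]
    cepstra = [[0.0, 0.0], [0.1, 0.0], [-0.1, 0.0],
               [2.0, 1.0], [2.5, 1.5], [0.05, 0.0]]
    labels = classify_frames(energies, cepstra)
    print(labels)
    print(normalize_silence(energies, labels))
```

Requiring both cues to agree is only one possible combination. A weighted score or an SNR-dependent switch between the two cues would be equally plausible readings of the abstract.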
Keywords
Speech recognition; Feature enhancement; Silence feature normalization; Cepstral distance