
Cepstrum PDF Normalization Method for Speech Recognition in Noise Environment  

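Suk Yong Ho (MCubeWorks Inc.)
Lee Hwang-Soo (Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology)
Choi Seung Ho (Department of Electronic and Information Engineering, Seoul National University of Technology)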
Abstract
In this paper, we propose a novel cepstrum normalization method that normalizes the probability density function (pdf) of the cepstrum for robust speech recognition in additive noise environments. Whereas conventional methods normalize only the first- and/or second-order statistics, such as the mean and/or variance of the cepstrum, the proposed method fully normalizes the statistics of the cepstrum by making the pdfs of the clean and noisy cepstra identical. For the target pdf, the generalized Gaussian distribution is selected so that various densities can be considered. In the recognition phase, we devise a table lookup method to reduce the computational cost. Speaker-independent isolated-word recognition experiments show that the proposed method outperforms the conventional methods, especially in heavy noise environments.
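The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, only one plausible reading: each value of a single cepstral coefficient is mapped through its empirical rank to the matching quantile of a generalized Gaussian target, and the target quantiles are precomputed in a lookup table. The function names (`gg_pdf`, `build_inverse_cdf_table`, `normalize_pdf`), the table size, the scale `alpha`, the shape `beta`, and the toy distortion are all assumptions made for illustration.

```python
import bisect
import math
import random


def gg_pdf(x, alpha=1.0, beta=2.0):
    """Generalized Gaussian density with scale alpha and shape beta."""
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x) / alpha) ** beta))


def build_inverse_cdf_table(n_entries, alpha=1.0, beta=2.0, span=8.0, grid=4001):
    """Tabulate the target inverse CDF at p = (k + 0.5) / n_entries."""
    xs = [-span + 2.0 * span * i / (grid - 1) for i in range(grid)]
    cdf = [0.0]
    for i in range(1, grid):  # trapezoidal integration of the density
        step = xs[i] - xs[i - 1]
        area = 0.5 * step * (gg_pdf(xs[i - 1], alpha, beta) + gg_pdf(xs[i], alpha, beta))
        cdf.append(cdf[-1] + area)
    total = cdf[-1]
    cdf = [c / total for c in cdf]
    table = []
    for k in range(n_entries):
        p = (k + 0.5) / n_entries
        j = min(max(bisect.bisect_left(cdf, p), 1), grid - 1)
        lo, hi = cdf[j - 1], cdf[j]
        frac = 0.0 if hi == lo else (p - lo) / (hi - lo)
        table.append(xs[j - 1] + frac * (xs[j] - xs[j - 1]))  # linear interpolation
    return table


def normalize_pdf(values, table):
    """Replace each value by the target quantile of its rank (one dimension)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        p = (rank + 0.5) / n  # empirical CDF estimate from the order statistic
        out[idx] = table[min(int(p * len(table)), len(table) - 1)]  # table lookup
    return out


random.seed(0)
clean = [random.gauss(0.0, 1.0) for _ in range(200)]
# Toy monotonic distortion standing in for noise; NOT a real additive-noise model.
noisy = [math.exp(0.5 * c) + 2.0 for c in clean]

table = build_inverse_cdf_table(256, beta=2.0)
clean_norm = normalize_pdf(clean, table)
noisy_norm = normalize_pdf(noisy, table)
print(max(abs(a - b) for a, b in zip(clean_norm, noisy_norm)))
```

Because the toy distortion is strictly increasing, the ranks of the clean and distorted values coincide, so both sequences map to the same normalized values. Real additive noise does not preserve ranks frame by frame, so in practice the match would hold only at the level of the distributions. Changing `beta` selects other target densities, for example `beta=1.0` for a Laplacian shape.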
Keywords
Speech Recognition; Cepstrum; Normalization; PDF