Speech Enhancement Combining a Generalized Gamma Distribution-Based Estimator with a Spectral Gain Floor for Noise-Robust Speech Recognition

Speech Estimators Based on Generalized Gamma Distribution and Spectral Gain Floor Applied to an Automatic Speech Recognition

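  • 김형국 (Department of Radio Engineering, Kwangwoon University) ;
  • 신동 (Department of Radio Engineering, Kwangwoon University) ;
  • 이진호 (Department of Radio Engineering, Kwangwoon University)
  • Published: 2009.06.30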

Abstract

This paper presents a speech enhancement technique based on the generalized Gamma distribution, with the goal of noise-robust speech recognition. The proposed method combines two parts: speech estimation under the generalized Gamma distribution with a spectral gain floor, and noise estimation derived from recursively averaged spectral values controlled by a spectral noise floor (the minimum noise component of the spectrum). Three variants of the estimator are considered, based on the spectral component, the spectral amplitude, and the log spectral amplitude. Each is applied to speech recognition in noisy environments, and its performance is measured by the recognition accuracy of an automatic speech recognition (ASR) system.
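The two building blocks named in the abstract, minimum-controlled recursive noise averaging and a spectral gain floor, can be illustrated with a minimal Python sketch. This is not the paper's estimator: the gain rule below is a simple Wiener-like stand-in rather than a generalized Gamma estimator, the running minimum is taken over the whole signal rather than a sliding window, and the names `estimate_noise_psd`, `floored_gain`, `alpha`, `floor_factor`, and `gain_floor` are hypothetical, chosen only for illustration.

```python
def estimate_noise_psd(power_frames, alpha=0.9, floor_factor=2.0):
    """Recursively average the noise PSD, controlled by a running minimum.

    power_frames: list of frames, each a list of per-bin powers |Y|^2.
    Assumption: a bin is treated as noise-only while its power stays
    within floor_factor times the smallest power seen so far in that bin.
    Otherwise the previous noise estimate is held unchanged.
    """
    noise = list(power_frames[0])     # initialise from the first frame
    minimum = list(power_frames[0])   # running spectral minimum per bin
    history = [list(noise)]
    for frame in power_frames[1:]:
        for k, p in enumerate(frame):
            minimum[k] = min(minimum[k], p)
            if p <= floor_factor * minimum[k]:
                # First-order recursive (exponential) averaging.
                noise[k] = alpha * noise[k] + (1.0 - alpha) * p
        history.append(list(noise))
    return history


def floored_gain(power, noise_psd, gain_floor=0.1):
    """Wiener-like gain per bin, limited from below by gain_floor.

    Assumption: the SNR is estimated crudely as power / noise - 1.
    """
    gains = []
    for p, n in zip(power, noise_psd):
        if n > 0:
            snr = max(p / n - 1.0, 0.0)
            gain = snr / (1.0 + snr)
        else:
            gain = 1.0  # no estimated noise: pass the bin through
        gains.append(max(gain, gain_floor))
    return gains


# Three frames, two bins; bin 1 carries a loud speech-like burst in frame 2.
frames = [[1.0, 1.0], [1.0, 100.0], [1.0, 1.0]]
history = estimate_noise_psd(frames)
gains = floored_gain(frames[1], history[1])
```

In this toy input the burst in the second bin is excluded from the noise average, so the noise estimate stays near 1.0, and the first bin, which contains only noise, is attenuated to the floor instead of being zeroed. Keeping a nonzero floor is a common way to limit musical noise and spectral distortion, which matters when the enhanced signal feeds a recognizer.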
