Noise Estimation and Suppression Methods Based on Normalized Variance in Time-Frequency for Speech Enhancement

Noise Estimation and Suppression Method Based on Variance Normalization in the Time and Frequency Domains for Speech Enhancement

  • 이수정 (Sungkyunkwan University, BK21 Information Technology Project Group)
  • 김순협 (Kwangwoon University, Department of Computer Engineering)
  • Published : 2009.01.25

Abstract

Noise estimation and suppression are crucial components of many speech communication and recognition systems. The algorithm proposed in this paper is based on the normalized variance ratio of the noisy power spectrum in the time-frequency domain. It tracks an adaptive threshold that controls the trade-off between residual noise and speech distortion. Evaluated with the ITU-T P.835 signal distortion (SIG) rating and the segmental signal-to-noise ratio (SNR), the algorithm performs better than the conventional methods.

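Noise estimation and suppression are core technologies in speech communication and recognition. This paper proposes a new noise estimation and suppression method that can be applied to a variety of noise environments. The proposed algorithm is based on the variance of the noisy power spectrum in the time and frequency domains and on the normalized ratio of that variance. To work well across diverse noise environments, the method uses an adaptively tracked threshold, which controls the trade-off between speech distortion and residual noise. The new algorithm was evaluated in various noise environments using ITU-T P.835 (SIG) and segmental SNR, and showed improved results over existing methods.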
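
The abstract names segmental SNR as one of its objective measures but does not give the formula or settings used. As a minimal sketch of how this metric is commonly computed, the Python function below averages per-frame SNR values in dB between a clean reference and a processed signal. The function name `segmental_snr`, the 256-sample frame length, the non-overlapping framing, and the [-10, 35] dB clamping range are all assumptions chosen for illustration, not values taken from the paper.

```python
import math


def segmental_snr(clean, processed, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR (dB) between a clean reference and a processed signal.

    frame_len, lo_db and hi_db are illustrative assumptions; the paper's
    actual settings are not stated in the abstract.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")
    eps = 1e-12  # guards against division by zero and log of zero
    frame_snrs = []
    # Non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        error_energy = sum((x - y) ** 2 for x, y in zip(ref, out))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp each frame so silent or perfectly matched frames do not dominate.
        frame_snrs.append(min(max(snr_db, lo_db), hi_db))
    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)
```

Clamping each frame before averaging is a common convention because frames with almost no speech energy would otherwise produce extreme negative values and distort the mean. A higher result indicates that the processed signal is closer to the clean reference.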
