Speech Enhancement Based on Improved Minima Controlled Recursive Averaging Incorporating GSAP

Speech Enhancement Using Improved Minima Controlled Recursive Averaging Based on Global Speech Absence Probability

  • Received : 2011.08.17
  • Accepted : 2011.11.25
  • Published : 2012.01.25

Abstract

In this paper, we propose a novel method to improve the performance of improved minima controlled recursive averaging (IMCRA). An examination across various noise environments shows that the IMCRA has a fundamental drawback in estimating the noise power at the offset region of continuous speech signals. In particular, it is difficult to obtain robust estimates of the noise power in non-stationary noise environments whose spectral characteristics change rapidly, such as babble noise. To overcome this drawback, we apply the global speech absence probability (GSAP), conditioned on both the a priori SNR and the a posteriori SNR, to the speech detection algorithm of the IMCRA. Using the ITU-T P.862 perceptual evaluation of speech quality (PESQ) and a composite measure as performance criteria, we show that the proposed algorithm yields better results than the conventional IMCRA-based scheme under various noise environments. In particular, for babble noise at 5 dB, the proposed method improved on the IMCRA by 0.026 in PESQ and 0.029 in the composite measure.

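This paper proposes an algorithm that improves the noise power estimation performance of the improved minima controlled recursive averaging (IMCRA) algorithm. The conventional IMCRA has the drawback that its speech detector, which directly affects the noise power estimate, is not robust in non-stationary environments where the spectral characteristics change rapidly or when the SNR is low. To obtain robust speech detection, this work proposes a speech enhancement method that applies the global speech absence probability to the speech detector of the conventional IMCRA. The proposed algorithm is evaluated in terms of speech quality using the perceptual evaluation of speech quality (PESQ) and a composite measure. The experimental results show that the IMCRA-based speech enhancement method with the global speech absence probability gives improved results in various noise environments (car, white, babble). In particular, for babble noise at 5 dB, a non-stationary noise environment, quality improved by 0.026 in PESQ and 0.029 in the composite measure.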
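As an illustration of the quantity named in the abstract, the sketch below computes a GSAP for a single frame from the per-bin a priori and a posteriori SNRs. It assumes the commonly used complex-Gaussian statistical model, in which the per-bin likelihood ratio is exp(γξ/(1+ξ))/(1+ξ) and the bins are treated as independent. The function name, the prior ratio `q`, and the example values are hypothetical, and the abstract does not state how the paper couples this probability to the IMCRA speech detector, so this is not the authors' implementation.

```python
import numpy as np


def global_speech_absence_probability(xi, gamma, q=1.0):
    """Frame-level speech absence probability p(H0 | Y), assumed Gaussian model.

    xi    : a priori SNR per frequency bin (linear scale)
    gamma : a posteriori SNR per frequency bin (linear scale)
    q     : assumed prior ratio P(H1) / P(H0) (hypothetical default)
    """
    xi = np.asarray(xi, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    # Per-bin log likelihood ratio log[p(Y_k | H1) / p(Y_k | H0)].
    log_lr = gamma * xi / (1.0 + xi) - np.log1p(xi)
    # Sum in the log domain so the product over bins cannot overflow.
    log_odds = np.log(q) + np.sum(log_lr)
    # p(H0 | Y) = 1 / (1 + q * prod_k LR_k).
    return float(np.exp(-np.logaddexp(0.0, log_odds)))


# Hypothetical frames with 8 frequency bins each.
noise_like = global_speech_absence_probability(np.zeros(8), np.ones(8))
speech_like = global_speech_absence_probability(np.full(8, 10.0), np.full(8, 20.0))
```

With zero a priori SNR in every bin the likelihood ratio carries no evidence and the result equals the prior, while a frame with high SNR in every bin drives the probability toward zero. Because the decision pools all bins of a frame, it reacts to the frame as a whole rather than to individual bins.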
