Speech Enhancement Based on Modified IMCRA Using Spectral Minima Tracking with Weighted Subband Selection

서브밴드 가중치를 적용한 스펙트럼 최소값 추적을 이용하는 수정된 IMCRA 기반의 음성 향상 기법

  • Park, Yun-Sik (Department of Electronic Engineering, Inha University) ;
  • Park, Gyu-Seok (Department of Electronic Engineering, Inha University) ;
  • Lee, Sang-Min (Department of Electronic Engineering, Inha University)
  • Received : 2011.09.26
  • Accepted : 2012.02.09
  • Published : 2012.05.25

Abstract

In this paper, we propose a novel approach to noise power estimation for speech enhancement in noisy environments. IMCRA (improved minima controlled recursive averaging), which is widely used in speech enhancement, applies a rough VAD (voice activity detection) algorithm to exclude speech components during speech periods before tracking the minimum. This improves noise power estimation by reducing the speech distortion caused by conventional algorithms that rely on the minimum power spectrum derived directly from the noisy speech. However, since the VAD algorithm cannot reliably distinguish speech from noise under non-stationary noise and at low SNRs (signal-to-noise ratios), speech distortion resulting from minimum tracking during speech periods still remains. In the proposed method, the minimum power estimate obtained by IMCRA is modified by SMT (spectral minima tracking) to reduce the speech distortion caused by the bias of the estimated minimum power. In addition, to estimate the minimum power effectively while accounting for the spectral distribution characteristics of speech and noise, the proposed method combines the minimum estimates provided by IMCRA and SMT using a subband-dependent weighting factor. The proposed algorithm is evaluated with subjective and objective quality tests under various noise environments and yields better results than the conventional method.

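This paper proposes a new noise power estimation method for speech enhancement in noisy environments. The IMCRA (improved minima controlled recursive averaging) technique, widely applied in speech enhancement algorithms, improves on conventional methods that estimate noise power from the minimum power spectrum of the noisy speech signal: it uses a simple voice detection algorithm and estimates the minimum from a power spectrum in which speech components have been roughly removed, thereby mitigating the speech distortion that can occur during speech periods. However, under non-stationary noise or low signal-to-noise ratio (SNR) conditions, voice detection performance degrades, and the existing problem of speech distortion during speech periods still occurs. The proposed method therefore applies spectral minima tracking (SMT) to the minimum power spectrum estimated by conventional IMCRA for improved noise power estimation, and combines the IMCRA minimum and the SMT-estimated minimum with weights that depend on the subband. In a comparison with the conventional method through subjective and objective speech quality tests, the proposed algorithm showed improved performance across various background noise environments.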
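
The abstract does not state the SMT recursion or the subband weights, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. The update rule is the commonly quoted form of Doblinger's spectral minima tracker, applied here to a generic per-bin power sequence. The constants `beta` and `gamma`, the function names `smt_update` and `combine_minima`, the band layout, and the linear blend `w * IMCRA + (1 - w) * SMT` are illustrative assumptions, not the authors' published formulation.

```python
from typing import List, Sequence


def smt_update(p_min_prev: Sequence[float], p_prev: Sequence[float],
               p_curr: Sequence[float], beta: float = 0.96,
               gamma: float = 0.998) -> List[float]:
    """One frame of Doblinger-style spectral minima tracking over all bins.

    beta and gamma are assumed, commonly quoted constants.
    """
    out = []
    for m_prev, x_prev, x in zip(p_min_prev, p_prev, p_curr):
        if m_prev < x:
            # Power is above the tracked minimum: let the minimum rise slowly.
            m = gamma * m_prev + (1.0 - gamma) / (1.0 - beta) * (x - beta * x_prev)
        else:
            # Power dropped to or below the minimum: follow it immediately.
            m = x
        out.append(m)
    return out


def combine_minima(imcra_min: Sequence[float], smt_min: Sequence[float],
                   band_edges: Sequence[int],
                   band_weights: Sequence[float]) -> List[float]:
    """Blend the two minimum estimates per bin with a subband-dependent weight.

    band_edges holds the first bin index of every subband after the first
    (hypothetical layout); band_weights needs len(band_edges) + 1 entries.
    """
    combined = []
    for k, (a, b) in enumerate(zip(imcra_min, smt_min)):
        band = sum(1 for edge in band_edges if k >= edge)  # subband index of bin k
        w = band_weights[band]
        combined.append(w * a + (1.0 - w) * b)  # assumed linear combination
    return combined


if __name__ == "__main__":
    smt = smt_update([1.0, 1.0], [2.0, 2.0], [0.5, 2.0])
    print(smt)
    print(combine_minima([4.0] * 4, [2.0] * 4, band_edges=[2], band_weights=[1.0, 0.5]))
```

In this sketch a weight of 1.0 leaves the IMCRA minimum untouched in a subband, while smaller weights shift that subband toward the SMT estimate; how the weights should actually be chosen per subband is left open here.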
