Speech Enhancement based on Minima Controlled Recursive Averaging Technique Incorporating Second-order Conditional Maximum a posteriori Criterion

  • Kum, Jong-Mo (Department of Electronics Engineering, Inha University)
  • Chang, Joon-Hyuk (Department of Electronics Engineering, Inha University)
  • Published : 2009.07.25

Abstract

In this paper, we propose a novel approach that improves the performance of minima controlled recursive averaging (MCRA) noise estimation by incorporating the second-order conditional maximum a posteriori (CMAP) criterion. An investigation of the MCRA scheme shows that it cannot fully account for the inter-frame correlation of voice activity, because the noise power estimate is adjusted by a speech presence probability that depends only on the observation in the current frame. To address this limitation, the proposed approach incorporates the second-order CMAP criterion, in which the noise power estimate is obtained using the speech presence probability conditioned on both the current observation and the speech activity decisions in the previous two frames. Experimental results show that the proposed MCRA technique based on the second-order CMAP criterion outperforms the conventional MCRA method.
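
The idea described in the abstract can be illustrated with a minimal Python sketch for a single frequency bin. It is not the authors' implementation. The recursive averaging step follows the usual MCRA form, in which the speech presence probability pushes the smoothing factor toward 1 so that the noise estimate is frozen while speech is present. Everything else is an assumption made for illustration: the parameter values `ALPHA_D` and `ALPHA_P`, the threshold table `DELTA` indexed by the two previous speech activity decisions, the simplified sliding-window minimum, and all function names.

```python
# Illustrative sketch only. All names, parameter values, and the threshold
# table are assumptions for this example, not values taken from the paper.

ALPHA_D = 0.95  # noise smoothing factor (assumed value)
ALPHA_P = 0.2   # speech presence probability smoothing factor (assumed value)

# Hypothetical thresholds on the ratio (power / local minimum), indexed by the
# speech activity decisions (0 = absent, 1 = present) in frames (l-2, l-1).
# Speech in recent frames lowers the threshold, so speech is declared more easily.
DELTA = {
    (0, 0): 6.0,
    (0, 1): 5.0,
    (1, 0): 5.0,
    (1, 1): 4.0,
}


def update_noise(noise_psd, noisy_power, p_speech):
    """One recursive-averaging step: a higher p_speech slows the update."""
    alpha_tilde = ALPHA_D + (1.0 - ALPHA_D) * p_speech
    return alpha_tilde * noise_psd + (1.0 - alpha_tilde) * noisy_power


def update_presence(p_prev, ratio, prev_decisions):
    """Update the speech presence probability with a history-dependent threshold."""
    decision = 1 if ratio > DELTA[prev_decisions] else 0
    p_new = ALPHA_P * p_prev + (1.0 - ALPHA_P) * decision
    return p_new, decision


def estimate_noise(powers, window=8):
    """Track the noise power over a sequence of per-frame powers for one bin."""
    noise = powers[0]
    p_speech = 0.0
    history = (0, 0)  # decisions in frames (l-2, l-1); start with speech absent
    estimates = []
    for l, power in enumerate(powers):
        # Simplified local minimum over a sliding window of raw powers.
        local_min = min(powers[max(0, l - window + 1): l + 1])
        ratio = power / max(local_min, 1e-12)
        p_speech, decision = update_presence(p_speech, ratio, history)
        noise = update_noise(noise, power, p_speech)
        history = (history[1], decision)
        estimates.append(noise)
    return estimates
```

In this sketch the same observation can lead to different speech presence probabilities depending on the two preceding decisions, which is the behavior the second-order conditioning is meant to capture. For example, `update_presence(0.0, 5.5, (1, 1))` declares speech, whereas `update_presence(0.0, 5.5, (0, 0))` does not.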
