An Improved Speech Absence Probability Estimation Based on Environmental Noise Classification

  • Received : 2011.06.25
  • Accepted : 2011.08.29
  • Published : 2011.10.31

Abstract

In this paper, we propose an improved speech absence probability (SAP) estimation method for speech enhancement that applies environmental noise classification. In the conventional method, the a priori probability of speech absence, which is needed to compute the SAP, is derived by applying a threshold to the a posteriori SNR obtained from the microphone input signal and the estimated noise signal. Unlike that method, which uses a fixed threshold and a fixed smoothing parameter, the proposed algorithm uses a noise classifier based on a Gaussian mixture model (GMM) so that parameters optimized for each noise type can be applied. The proposed speech enhancement technique is evaluated under various noise environments using ITU-T P.862 PESQ (perceptual evaluation of speech quality) and a composite measure. The results show that the proposed algorithm outperforms the conventional speech absence probability estimation method.
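The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no equations or parameter values, so the noise classes, GMM parameters, thresholds, smoothing factors, clamping bounds, and all function names below are hypothetical. The a priori update is assumed to be a recursive average of a hard decision on the a posteriori SNR, and the SAP is computed from the standard complex-Gaussian likelihood ratio.

```python
import math

# Hypothetical per-noise-type parameters: a posteriori SNR threshold and
# smoothing factor. The values are placeholders, not taken from the paper.
NOISE_PARAMS = {
    "white": {"gamma_th": 0.8, "alpha": 0.95},
    "babble": {"gamma_th": 1.2, "alpha": 0.90},
}

# Hypothetical diagonal-covariance GMMs, one per noise class.
# Each component is (weight, means, variances) over a 2-D feature vector.
NOISE_GMMS = {
    "white": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "babble": [(0.5, [3.0, 3.0], [1.0, 1.0]), (0.5, [5.0, 1.0], [1.0, 1.0])],
}


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    logs = []
    for weight, means, variances in gmm:
        ll = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(ll)
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))


def classify_noise(features, gmms=NOISE_GMMS):
    """Return the noise class whose GMM gives the highest log-likelihood."""
    return max(gmms, key=lambda name: gmm_log_likelihood(features, gmms[name]))


def update_prior_absence(q_prev, gamma, gamma_th, alpha):
    """Smooth a hard speech-absence decision (gamma below threshold)."""
    indicator = 1.0 if gamma < gamma_th else 0.0
    q = alpha * q_prev + (1.0 - alpha) * indicator
    return min(max(q, 0.05), 0.95)  # assumed clamp to keep q away from 0 and 1


def speech_absence_probability(gamma, xi, q):
    """SAP from a posteriori SNR gamma, a priori SNR xi, and prior q."""
    exponent = min(gamma * xi / (1.0 + xi), 50.0)  # guard against overflow
    likelihood_ratio = math.exp(exponent) / (1.0 + xi)
    return 1.0 / (1.0 + (1.0 - q) / q * likelihood_ratio)


if __name__ == "__main__":
    noise_type = classify_noise([0.1, -0.2])
    params = NOISE_PARAMS[noise_type]
    q = update_prior_absence(0.5, gamma=0.3, **params)
    print(noise_type, q, speech_absence_probability(gamma=0.3, xi=0.1, q=q))
```

In this sketch the classifier output only selects which threshold and smoothing factor feed the a priori update; a fixed-parameter baseline would correspond to using a single entry of `NOISE_PARAMS` for every noise type.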
