Non-Stationary/Mixed Noise Estimation Algorithm Based on Minimum Statistics and Codebook Driven Short-Term Predictor Parameter Estimation

Non-Stationary/Mixed Background Noise Estimation Method Using Minimum Statistics and a Short-Term Predictor Coefficient Codebook

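  • 이명석 (Department of Information and Communication Engineering, Sejong University)
  • 노명훈 (Department of Information and Communication Engineering, Sejong University)
  • 박성주 (Digital Media Research Center, Korea Electronics Technology Institute)
  • 이석필 (Digital Media Research Center, Korea Electronics Technology Institute)
  • 김무영 (Department of Information and Communication Engineering, Sejong University)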
  • Received : 2010.02.08
  • Accepted : 2010.04.10
  • Published : 2010.04.30

Abstract

In this work, the minimum statistics (MS) algorithm is combined with codebook driven short-term predictor parameter estimation (CDSTP) to design a speech enhancement algorithm that is robust across a variety of background noise environments. The MS algorithm performs well for stationary noise but relatively poorly for non-stationary noise. CDSTP handles non-stationary noise effectively, but not noise types that were absent from the training stage. We therefore propose combining CDSTP and MS. Compared with using MS or CDSTP alone, the proposed method yields a higher perceptual evaluation of speech quality (PESQ) score, and it performs especially well for mixed background noise that contains both stationary and non-stationary components.

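This paper proposes combining the minimum statistics (MS) method with codebook driven short-term predictor parameter estimation (CDSTP) to design a noise reduction algorithm that is robust to background noise. MS is robust to stationary background noise but relatively weak against non-stationary background noise. CDSTP is robust to non-stationary background noise but weak in background noise environments that are not represented in the codebook. We therefore combine CDSTP, which is robust to non-stationary background noise, with MS, which requires no separate codebook training, to obtain an algorithm that is robust to a variety of background noises. The proposed method achieved higher overall perceptual evaluation of speech quality (PESQ) scores than MS or CDSTP alone, and was particularly robust in mixed background noise environments where stationary and non-stationary noise occur together.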
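The minimum statistics idea referred to in the abstract can be illustrated with a minimal sketch. This is not the combined method proposed in the paper, nor the full MS algorithm (it omits optimal time-varying smoothing and bias compensation); it only shows the core step of tracking the minimum of a recursively smoothed periodogram over a sliding window. The function name `minimum_statistics_noise_psd` and the parameters `alpha` and `window` are assumptions chosen for illustration.

```python
import numpy as np


def minimum_statistics_noise_psd(periodogram, alpha=0.85, window=50):
    """Simplified minimum-statistics noise tracker (illustrative only).

    periodogram: array of shape (frames, bins) holding |Y(t, k)|^2.
    alpha:       fixed smoothing constant (assumed value, not from the paper).
    window:      number of past frames searched for the minimum (assumed value).
    Returns an array of the same shape with the per-bin noise PSD estimate.
    """
    periodogram = np.asarray(periodogram, dtype=float)
    n_frames = periodogram.shape[0]

    # First-order recursive smoothing of the noisy periodogram over time.
    smoothed = np.empty_like(periodogram)
    smoothed[0] = periodogram[0]
    for t in range(1, n_frames):
        smoothed[t] = alpha * smoothed[t - 1] + (1.0 - alpha) * periodogram[t]

    # The noise estimate is the minimum of the smoothed PSD in a sliding window,
    # relying on the assumption that speech is absent at some point in the window.
    noise = np.empty_like(periodogram)
    for t in range(n_frames):
        start = max(0, t - window + 1)
        noise[t] = smoothed[start:t + 1].min(axis=0)
    return noise
```

With a flat noise floor interrupted by a short high-energy burst, the estimate stays near the floor because the window still contains low-energy frames. The same property explains the weakness noted in the abstract: if the noise level itself changes quickly, the minimum lags behind for up to `window` frames.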
