Single-Channel Non-Causal Speech Enhancement to Suppress Reverberation and Background Noise

  • Song, Myung-Suk (School of Electrical and Electronic Engineering, Yonsei University)
  • Kang, Hong-Goo (School of Electrical and Electronic Engineering, Yonsei University)
  • Received : 2012.08.14
  • Reviewed : 2012.10.22
  • Published : 2012.11.30

Abstract

This paper proposes a speech enhancement algorithm that improves speech intelligibility by suppressing both reverberation and background noise. The algorithm adopts a non-causal single-channel minimum variance distortionless response (MVDR) filter to exploit the additional information contained in subsequent frames of the noisy-reverberant signal. The noisy-reverberant signal is decomposed into a desired-signal component and an interference component that is uncorrelated with the desired signal. The filter is then derived from the MVDR criterion, which minimizes the residual interference without introducing speech distortion. An estimator of the correlation parameter, which plays an important role in determining the overall performance of the system, is derived mathematically from the general statistical reverberation model. Practical methods for estimating the sub-parameters needed to compute the correlation parameter are also developed. A performance evaluation verifies the effectiveness of the proposed algorithm: it achieves significant improvement in all studied conditions and is particularly advantageous in severely noisy and strongly reverberant environments.
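
As a rough illustration of the MVDR criterion mentioned in the abstract, the sketch below computes the textbook closed-form solution of "minimize h^H Φ h subject to h^H γ = 1", namely h = Φ⁻¹γ / (γ^H Φ⁻¹γ). It is not the paper's derivation: the names `phi_in` (interference covariance across stacked frames), `gamma` (correlation vector between the desired signal and the stacked frames), the frame count `L`, and the random test data are all assumptions made for this example. In a non-causal setting, the stacked vector would include frames that follow the current one.

```python
import numpy as np


def mvdr_weights(phi_in, gamma):
    """Solve min_h h^H phi_in h  subject to  h^H gamma = 1.

    phi_in : (L, L) Hermitian positive-definite interference covariance
             (hypothetical name, assumed known or estimated elsewhere).
    gamma  : (L,) correlation vector of the desired signal
             (hypothetical name, stands in for the correlation parameter).
    """
    # Solve a linear system rather than forming the inverse explicitly.
    phi_inv_gamma = np.linalg.solve(phi_in, gamma)
    # gamma^H phi_in^{-1} gamma is real and positive for Hermitian PD phi_in.
    return phi_inv_gamma / np.vdot(gamma, phi_inv_gamma)


# Synthetic data for illustration only (not from the paper).
rng = np.random.default_rng(0)
L = 4  # number of stacked frames, an assumed value
A = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
phi_in = A @ A.conj().T + L * np.eye(L)  # Hermitian positive definite
gamma = rng.standard_normal(L) + 1j * rng.standard_normal(L)

h = mvdr_weights(phi_in, gamma)

# Distortionless constraint: h^H gamma should equal 1.
constraint = np.vdot(h, gamma)

# Residual interference power of the MVDR filter.
mvdr_power = np.vdot(h, phi_in @ h).real

# Any other filter meeting the constraint, e.g. a matched filter,
# cannot leave less residual interference than the MVDR solution.
h_ref = gamma / np.vdot(gamma, gamma)
ref_power = np.vdot(h_ref, phi_in @ h_ref).real
```

The comparison with `h_ref` reflects the design goal stated in the abstract: among all filters that pass the desired signal undistorted, the MVDR solution leaves the least residual interference.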
