A Single-Channel Speech Dereverberation Method Using Sparse Prior Imposition in Reverberation Filter Estimation

반향 필터 추정에서 성김 특성을 이용한 단일채널 음성반향제거 방법

  • Received : 2013.09.25
  • Accepted : 2013.12.10
  • Published : 2013.12.31

Abstract

Because a reverberation filter is generally much shorter than the corresponding dereverberation filter, single-channel speech dereverberation based on reverberation filter estimation was developed to improve dereverberation performance. Even so, a typical reverberation filter still has too many coefficients to be estimated accurately from a limited number of speech observations. To exploit the sparseness of the reverberation filter coefficients, this paper presents an algorithm that imposes a sparse prior on the reverberation filter estimation process. Simulation results show that imposing the sparse prior further improves the performance of the speech dereverberation method based on reverberation filter estimation.
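The idea of imposing a sparse prior on filter estimation can be illustrated with a deliberately simplified sketch. It is not the algorithm of this paper: the paper's estimation is blind, whereas the sketch assumes the clean source signal is known, and it assumes a Laplacian prior, which turns the MAP estimate into a least-squares fit with an L1 penalty solved here by proximal gradient descent (ISTA). All function names, the filter taps, and the parameter values are hypothetical.

```python
import random


def fir_filter(x, h):
    """Causal FIR filtering: y[n] = sum_k h[k] * x[n - k], truncated to len(x)."""
    return [
        sum(h[k] * x[n - k] for k in range(min(len(h), n + 1)))
        for n in range(len(x))
    ]


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrinks v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(x, y, h, l1_weight):
    """0.5 * ||y - x*h||^2 + l1_weight * ||h||_1 (assumed Laplacian prior)."""
    residual = [yn - pn for yn, pn in zip(y, fir_filter(x, h))]
    return 0.5 * sum(r * r for r in residual) + l1_weight * sum(abs(c) for c in h)


def estimate_filter(x, y, num_taps, l1_weight, iters=300):
    """ISTA (proximal gradient) estimate of an FIR filter with an L1 penalty."""
    # Conservative step size: num_taps * energy(x) upper-bounds the Lipschitz
    # constant of the data-term gradient, so an iteration cannot increase
    # the objective.
    step = 1.0 / (num_taps * sum(v * v for v in x))
    h = [0.0] * num_taps
    for _ in range(iters):
        residual = [yn - pn for yn, pn in zip(y, fir_filter(x, h))]
        # Gradient of the data term with respect to each tap h[k].
        grad = [
            -sum(residual[n] * x[n - k] for n in range(k, len(x)))
            for k in range(num_taps)
        ]
        h = [soft_threshold(c - step * g, step * l1_weight) for c, g in zip(h, grad)]
    return h


if __name__ == "__main__":
    rng = random.Random(0)
    num_taps, num_samples = 20, 60  # few observations relative to filter length
    true_h = [0.0] * num_taps
    true_h[0], true_h[7], true_h[15] = 1.0, 0.5, -0.3  # hypothetical sparse filter
    x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    y = [v + rng.gauss(0.0, 0.1) for v in fir_filter(x, true_h)]
    for weight in (0.0, 1.0):  # 0.0 disables the sparse prior
        est = estimate_filter(x, y, num_taps, weight)
        err = sum((a - b) ** 2 for a, b in zip(est, true_h))
        print(f"l1_weight={weight}: squared error {err:.4f}")
```

Running the script prints the squared estimation error with and without the L1 penalty, so the effect of the assumed prior can be inspected for different filter lengths, observation counts, and penalty weights. The step size is chosen conservatively for simplicity, which keeps the iteration stable at the cost of slower convergence.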
