Improved speech enhancement of multi-channel Wiener filter using adjustment of principal subspace vector

  • Kim, Gibak (School of Electrical Engineering, Soongsil University)
  • Received : 2020.07.31
  • Accepted : 2020.09.09
  • Published : 2020.09.30

Abstract

We present a method to improve the performance of the multi-channel Wiener filter in noisy environments. In a subspace-based multi-channel Wiener filter with a single target source, the target speech component can be estimated effectively in the principal subspace of the speech correlation matrix. The speech correlation matrix is estimated by subtracting the noise correlation matrix from the signal correlation matrix, on the assumption that the cross-correlation between speech and interfering noise is negligible compared with the speech correlation. This assumption does not hold in the presence of strong interfering noise, which can induce significant error in the estimated principal subspace. In this paper, we propose adjusting the principal subspace vector using the speech presence probability and the steering vector of the desired speech source. The multi-channel speech presence probability is derived in the principal subspace and applied to adjust the principal subspace vector. Simulation results show that the proposed method improves the performance of the multi-channel Wiener filter in noisy environments.
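The processing chain described in the abstract can be illustrated for a single frequency bin with a minimal NumPy sketch. The abstract does not give the actual adjustment rule, so `adjust_vector` below is only an assumed stand-in: it blends the estimated principal vector with the normalized steering vector, weighted by a speech presence probability `p`. All function names, the steering vector `d`, the noise level, and the fixed value `p=0.6` are hypothetical choices for illustration, not the paper's formulas or settings.

```python
import numpy as np


def principal_subspace(R_yy, R_nn):
    """Largest eigenpair of the speech correlation estimate R_yy - R_nn."""
    R_ss = R_yy - R_nn
    R_ss = 0.5 * (R_ss + R_ss.conj().T)      # keep the estimate Hermitian
    eigvals, eigvecs = np.linalg.eigh(R_ss)  # eigenvalues in ascending order
    return eigvals[-1], eigvecs[:, -1]


def adjust_vector(u, d, p):
    """Assumed rule (not the paper's formula): SPP-weighted blend of u and d."""
    d_unit = d / np.linalg.norm(d)
    c = np.vdot(d_unit, u)                   # inner product d_unit^H u
    if abs(c) > 0:
        u = u * (np.conj(c) / abs(c))        # remove the arbitrary eigenvector phase
    v = p * u + (1.0 - p) * d_unit           # p = 1 keeps u, p = 0 falls back to d
    return v / np.linalg.norm(v)


def rank1_mwf(lam, v, R_nn, ref=0, mu=1.0):
    """Rank-1 MWF weights w = (R_s + mu R_nn)^{-1} R_s e_ref, R_s = lam v v^H."""
    R_s = max(lam, 0.0) * np.outer(v, v.conj())
    return np.linalg.solve(R_s + mu * R_nn, R_s[:, ref])


def complex_noise(rng, shape, scale):
    """Circular complex Gaussian samples with standard deviation `scale`."""
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


# Toy single-bin example: 4 microphones, one target source, strong white noise.
rng = np.random.default_rng(0)
M, T = 4, 200
d = np.exp(-1j * np.pi * 0.5 * np.arange(M))   # hypothetical steering vector
s = complex_noise(rng, T, 1.0)                  # target source signal
Y = np.outer(d, s) + complex_noise(rng, (M, T), 2.0)   # noisy observations
N = complex_noise(rng, (M, T), 2.0)             # separate noise-only frames

R_yy = Y @ Y.conj().T / T                       # signal correlation matrix
R_nn = N @ N.conj().T / T                       # noise correlation matrix

lam, u = principal_subspace(R_yy, R_nn)
v = adjust_vector(u, d, p=0.6)                  # p would come from an SPP estimator
w = rank1_mwf(lam, v, R_nn)
s_hat = w.conj() @ Y                            # enhanced reference-channel signal
```

Because the noise correlation is estimated from separate frames and the sample cross-correlation between speech and noise is not zero, the principal vector `u` generally deviates from the true direction. In this sketch the blend pulls it back toward the steering vector when `p` is low and leaves it unchanged when `p` is one.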
