Noise Statistics Estimation Using Target-to-Noise Contribution Ratio for Parameterized Multichannel Wiener Filter

  • Hong, Jungpyo (Department of Information and Communication Engineering, Changwon National University)
  • Received: 2022.12.01
  • Accepted: 2022.12.12
  • Published: 2022.12.31

Abstract

The parameterized multichannel Wiener filter (PMWF) is a linear filter whose embedded parameter controls the trade-off between residual noise and signal distortion. Applying the PMWF to noisy inputs requires accurate noise estimation, for which multichannel minima-controlled recursive averaging (MMCRA) is widely used. However, the noise estimation accuracy of the MMCRA decreases when directional interference is present in the array inputs, and the performance of the PMWF degrades as a result. In this paper, we therefore propose a noise power spectral density (PSD) estimation method for the PMWF. The proposed method consists of three consecutive steps: eigenvalue decomposition of the noisy input PSD, estimation of the target component's contribution using directional information, and exponential weighting to improve the estimate of the target contribution. For evaluation, the proposed method was compared with the MMCRA using four objective measures, including signal-to-noise ratio and speech distortion. The experiments confirm that the PMWF with the proposed noise estimation method performs better in environments where directional interference exists.
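To make the role of the embedded parameter and of the noise PSD estimate concrete, the following is a minimal sketch of PMWF weights for a single frequency bin. It assumes the closed form commonly used in the literature, h = Φ_v⁻¹Φ_x u / (β + tr(Φ_v⁻¹Φ_x)) with Φ_x = Φ_y − Φ_v; the abstract does not state the filter equation, so the formula, the function name `pmwf_weights`, and the toy signal model are assumptions for illustration. The sketch does not implement the proposed noise estimator: it only shows where a noise PSD estimate would enter the filter.

```python
import numpy as np

def pmwf_weights(phi_y, phi_v, beta=1.0, ref=0):
    """PMWF weights for one frequency bin (assumed textbook form).

    phi_y: (M, M) noisy-input PSD matrix.
    phi_v: (M, M) noise PSD matrix, e.g. from a noise estimator.
    beta:  trade-off parameter (0 gives MVDR, 1 gives the standard MWF).
    ref:   index of the reference microphone.
    """
    m = phi_y.shape[0]
    # Phi_v^{-1} Phi_y via a linear solve instead of an explicit inverse.
    ratio_y = np.linalg.solve(phi_v, phi_y)
    # Phi_v^{-1} Phi_x, assuming target and noise are uncorrelated.
    ratio_x = ratio_y - np.eye(m)
    lam = np.real(np.trace(ratio_x))
    return ratio_x[:, ref] / (beta + lam)

# Toy example (hypothetical values): one target with steering vector d
# observed in spatially correlated noise.
rng = np.random.default_rng(0)
M = 4
d = rng.standard_normal(M) + 1j * rng.standard_normal(M)
a = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
phi_v = a @ a.conj().T + 0.5 * np.eye(M)       # Hermitian positive definite
phi_y = 2.0 * np.outer(d, d.conj()) + phi_v    # rank-1 target plus noise

h_mvdr = pmwf_weights(phi_y, phi_v, beta=0.0)
h_mwf = pmwf_weights(phi_y, phi_v, beta=1.0)

# Target response h^H d for each setting of beta.
resp_mvdr = np.vdot(h_mvdr, d)
resp_mwf = np.vdot(h_mwf, d)
```

Under these assumptions, β = 0 leaves the target at the reference microphone undistorted (`resp_mvdr` equals `d[0]`), while β = 1 attenuates it in exchange for less residual noise. Both results depend directly on `phi_v`, which is why errors in the noise PSD estimate carry over to the filter output.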

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2022R1G1A1008798).
