Multi-channel Speech Enhancement Using Blind Source Separation and Cross-channel Wiener Filtering

Jang, Gil-Jin; Choi, Chang-Kyu; Lee, Yong-Beom; Kim, Jeong-Su; Kim, Sang-Ryong

The Journal of the Acoustical Society of Korea

Volume 23, Issue 2E / Pages 56-67 / 2004 / 1225-4428 (pISSN)

The Acoustical Society of Korea (한국음향학회)

Jang, Gil-Jin (Human Computer Interaction Laboratory, Samsung Advanced Institute of Technology) ;
Choi, Chang-Kyu (Human Computer Interaction Laboratory, Samsung Advanced Institute of Technology) ;
Lee, Yong-Beom (Human Computer Interaction Laboratory, Samsung Advanced Institute of Technology) ;
Kim, Jeong-Su (Human Computer Interaction Laboratory, Samsung Advanced Institute of Technology) ;
Kim, Sang-Ryong (Human Computer Interaction Laboratory, Samsung Advanced Institute of Technology)

Published : 2004.06.01

Abstract

Despite abundant research on blind source separation (BSS) in many types of simulated environments, its performance is still not satisfactory for application to real environments. The major obstacles appear to be the finite filter length of the assumed mixing model and nonlinear sensor noise. This paper presents a two-step speech enhancement method with multiple microphone inputs. The first step applies a frequency-domain BSS algorithm to produce multiple outputs without any prior knowledge of the mixed source signals. The second step further removes the remaining cross-channel interference by a spectral cancellation approach using a probabilistic source absence/presence detection technique. The desired primary source is detected in every frame of the signal, and the secondary source is estimated in the power spectral domain using the other BSS output as a reference interfering source. The estimated secondary source is then subtracted to reduce the cross-channel interference. Our experimental results show good separation enhancement performance on real recordings of speech and music signals compared to conventional BSS methods.
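
The second step can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the leakage-gain estimate, the function names, and the parameters `over_subtraction` and `floor` are all assumptions made for illustration. The sketch takes per-frame power spectra of two BSS outputs and a per-frame probability that the primary source is present (the detector itself is not implemented here). It estimates how much of the reference output leaks into the primary output from frames where the primary source is likely absent, then subtracts that estimate in the power spectral domain with a spectral floor.

```python
from typing import List, Sequence

Spectrogram = Sequence[Sequence[float]]  # frames x frequency bins, power values


def estimate_leakage_gain(primary: Spectrogram, reference: Spectrogram,
                          presence: Sequence[float],
                          eps: float = 1e-12) -> List[float]:
    """Per-bin ratio of primary power to reference power, with each frame
    weighted by the probability that the primary source is absent.
    This weighting scheme is an assumption, not taken from the paper."""
    n_bins = len(primary[0])
    gains = []
    for k in range(n_bins):
        num = sum((1.0 - p) * x[k] for p, x in zip(presence, primary))
        den = sum((1.0 - p) * r[k] for p, r in zip(presence, reference))
        gains.append(num / (den + eps))
    return gains


def cross_channel_subtract(primary: Spectrogram, reference: Spectrogram,
                           presence: Sequence[float],
                           over_subtraction: float = 1.0,
                           floor: float = 0.01) -> List[List[float]]:
    """Subtract the estimated interference power from the primary output.
    Each bin is kept at or above floor * (original primary power)."""
    gains = estimate_leakage_gain(primary, reference, presence)
    enhanced = []
    for x, r in zip(primary, reference):
        frame = []
        for k, (xk, rk) in enumerate(zip(x, r)):
            cleaned = xk - over_subtraction * gains[k] * rk
            frame.append(max(cleaned, floor * xk))
        enhanced.append(frame)
    return enhanced


# Toy data: the primary output contains half of the reference power as leakage.
reference = [[4.0, 2.0], [2.0, 4.0]]
primary = [[2.0, 1.0],    # frame 0: primary source absent, leakage only
           [11.0, 8.0]]   # frame 1: target power [10, 6] plus leakage [1, 2]
presence = [0.0, 1.0]     # assumed detector output, one value per frame

enhanced = cross_channel_subtract(primary, reference, presence)
print(enhanced[1])  # approximately [10.0, 6.0], the target power in frame 1
```

In this toy case the leakage-only frame is reduced to the spectral floor, and the frame containing the target recovers the target power. A practical system would obtain the power spectra from a short-time Fourier transform of the BSS outputs and would update the gain estimate over time.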

