Speech Denoising via Low-Rank and Sparse Matrix Decomposition

  • Huang, Jianjun (Institute of Command Automation, People's Liberation Army University of Science and Technology)
  • Zhang, Xiongwei (Institute of Command Automation, People's Liberation Army University of Science and Technology)
  • Zhang, Yafei (Institute of Command Automation, People's Liberation Army University of Science and Technology)
  • Zou, Xia (Institute of Command Automation, People's Liberation Army University of Science and Technology)
  • Zeng, Li (Institute of Command Automation, People's Liberation Army University of Science and Technology)
  • Received : 2013.01.21
  • Reviewed : 2013.05.14
  • Published : 2014.02.01

Abstract

In this letter, we propose an unsupervised framework for speech noise reduction based on recent developments in low-rank and sparse matrix decomposition. The proposed framework separates the speech signal directly from noisy speech by decomposing the noisy speech spectrogram into three submatrices: the noise structure matrix, the clean speech structure matrix, and the residual noise matrix. Evaluations on the Noisex-92 dataset show that, at an input SNR of -5 dB, the proposed method achieves a signal-to-distortion ratio approximately 2.48 dB and 3.23 dB higher than the robust principal component analysis method and the non-negative matrix factorization method, respectively.
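
The three-way split described in the abstract can be illustrated with a generic alternating scheme: a truncated SVD yields a low-rank part, hard thresholding yields a sparse part, and whatever is left is the residual. The sketch below is not the authors' algorithm, which the abstract does not specify. It assumes that the noise structure is the low-rank component and the speech structure is the sparse component, and the function name `low_rank_sparse_decompose`, the parameters `rank`, `card`, and `n_iter`, and the synthetic "spectrogram" are all hypothetical choices made for illustration.

```python
import numpy as np

def low_rank_sparse_decompose(X, rank=1, card=None, n_iter=50):
    """Split X into L (rank <= rank), S (<= card nonzeros), and residual G.

    Illustrative alternating projection; not the paper's exact method.
    """
    X = np.asarray(X, dtype=float)
    if card is None:
        card = X.size // 10  # assumed default: 10% of entries may be nonzero
    S = np.zeros_like(X)
    L = np.zeros_like(X)
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` approximation of X - S.
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep the `card` largest-magnitude entries of X - L.
        R = X - L
        S = np.zeros_like(X)
        if card > 0:
            idx = np.argpartition(np.abs(R), -card, axis=None)[-card:]
            S.flat[idx] = R.flat[idx]
    G = X - L - S  # residual: whatever neither component explains
    return L, S, G

# Synthetic magnitude "spectrogram" (frequency bins x frames), assumed data:
# a rank-1 stationary noise pattern, a few strong sparse "speech" entries,
# and a small dense perturbation.
rng = np.random.default_rng(0)
n_bins, n_frames, card = 64, 100, 300
noise = np.outer(rng.random(n_bins) + 0.5, rng.random(n_frames) + 0.5)
speech = np.zeros((n_bins, n_frames))
active = rng.choice(n_bins * n_frames, size=card, replace=False)
speech.flat[active] = 5.0 + rng.random(card)
X = noise + speech + 0.01 * rng.standard_normal((n_bins, n_frames))

L, S, G = low_rank_sparse_decompose(X, rank=1, card=card)
print("rank(L):", np.linalg.matrix_rank(L))
print("nonzeros in S:", np.count_nonzero(S))
print("residual energy fraction:", np.sum(G**2) / np.sum(X**2))
```

In a real denoising pipeline `X` would be the magnitude of a short-time Fourier transform of the noisy signal, and the enhanced speech would be resynthesized from `S` together with the noisy phase; both steps are omitted here.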
