
Voice Activity Detection Based on Non-negative Matrix Factorization  

Kang, Sang-Ick (DSP Laboratory, Department of Electronic Engineering, Inha University)
Chang, Joon-Hyuk (Department of Electronic Engineering, Inha University)
Abstract
In this paper, we apply a likelihood ratio test (LRT) to voice activity detection (VAD) based on non-negative matrix factorization (NMF) in order to find the optimal threshold. In our approach, the NMF-based VAD is expressed as the Euclidean distance between the noise basis vector and the input basis vector, both of which are extracted through NMF. The optimal threshold for each noise environment depends on the distribution of the NMF results in the noise region, which is estimated by a statistical model-based VAD. According to the experimental results, the proposed approach is found to be effective for statistical model-based VAD using the LRT.
Keywords
Voice Activity Detection; Non-negative Matrix Factorization
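The distance measure described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the rank-1 factorization, the unit-norm scaling of the basis, the synthetic spectra, the helper names `nmf_rank1_basis` and `basis_distance`, and in particular the threshold rule (mean plus three standard deviations of the distances observed on noise-only blocks), which is a simple stand-in for the LRT-based threshold selection and not the authors' rule. The NMF itself uses the multiplicative updates for the Euclidean cost given by Lee and Seung.

```python
import numpy as np


def nmf_rank1_basis(V, n_iter=200, eps=1e-12, seed=0):
    """Rank-1 NMF of a non-negative matrix V (frequency bins x frames).

    Uses the Lee-Seung multiplicative updates for the Euclidean cost and
    returns the single basis vector, scaled to unit L2 norm (an assumption
    made here so that bases from different blocks are comparable).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, 1)) + eps   # basis vector (column)
    H = rng.random((1, n_frames)) + eps  # per-frame activations (row)
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
    return (W / (np.linalg.norm(W) + eps)).ravel()


def basis_distance(noise_basis, block):
    """Euclidean distance between the noise basis and the basis of `block`."""
    return float(np.linalg.norm(nmf_rank1_basis(block) - noise_basis))


# --- Synthetic demonstration (hypothetical data, not from the paper) ---
rng = np.random.default_rng(1)
n_freq, n_frames = 32, 20
noise_shape = np.ones(n_freq)  # flat noise magnitude spectrum
# A spectral peak standing in for speech energy.
peak = 10.0 * np.exp(-0.5 * ((np.arange(n_freq) - 8) / 2.0) ** 2)


def make_block(shape):
    """Magnitude-spectrum block: fixed spectral shape with random jitter."""
    return shape[:, None] * rng.uniform(0.8, 1.2, size=(n_freq, n_frames))


# Noise basis learned from a block assumed to contain noise only.
noise_basis = nmf_rank1_basis(make_block(noise_shape))

# Distances observed on further noise-only blocks set the threshold.
# Stand-in rule (mean + 3 * std), NOT the LRT-based rule of the paper.
calib = np.array(
    [basis_distance(noise_basis, make_block(noise_shape)) for _ in range(10)]
)
threshold = calib.mean() + 3.0 * calib.std()

d_noise = basis_distance(noise_basis, make_block(noise_shape))
d_speech = basis_distance(noise_basis, make_block(noise_shape + peak))

print(f"threshold={threshold:.4f}  noise={d_noise:.4f}  speech={d_speech:.4f}")
print("speech block flagged as voice:", d_speech > threshold)
```

In this sketch the threshold adapts to the noise environment only through the spread of `calib`; a faithful implementation would presumably obtain the noise-only blocks from a statistical model-based VAD, as the abstract indicates, rather than assume them.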