http://dx.doi.org/10.7776/ASK.2011.30.6.324

A Statistical Model-Based Voice Activity Detection Employing the Conditional MAP Criterion with Spectral Deviation  

Kim, Sang-Kyun (Department of Electronic Engineering, Hanyang University)
Chang, Joon-Hyuk (Department of Electronic Engineering, Hanyang University)
Abstract
In this paper, we propose a novel approach to improving the performance of statistical model-based voice activity detection (VAD), based on the conditional maximum a posteriori (CMAP) criterion with spectral deviation. In our approach, the VAD decision rule compares the geometric mean of the likelihood ratios (LRs) against a threshold that is adapted according to the speech presence probability conditioned on both the speech activity decision and the spectral deviation in the previous frame. Experimental results show that the proposed approach yields better results than the CMAP-based VAD using the LR test.
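The decision rule described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the Gaussian-model likelihood ratio commonly used in statistical model-based VAD, with a posteriori SNR `gamma` and a priori SNR `xi` per frequency bin. The function names, the threshold table, and the deviation cutoff are hypothetical, and the abstract does not state how the spectral deviation is computed or which threshold values are used.

```python
import math


def log_lr_geometric_mean(gamma, xi):
    """Log of the geometric mean of per-bin likelihood ratios.

    Assumes the Gaussian-model LR for bin k:
        LR_k = exp(gamma_k * xi_k / (1 + xi_k)) / (1 + xi_k)
    The log of the geometric mean is the average of log LR_k.
    """
    terms = [g * x / (1.0 + x) - math.log1p(x) for g, x in zip(gamma, xi)]
    return sum(terms) / len(terms)


def vad_decision(gamma, xi, prev_speech, prev_deviation, thresholds, deviation_cut):
    """Return True (speech) or False (non-speech) for the current frame.

    The threshold is chosen from the previous frame's state:
    its speech/non-speech decision and whether its spectral deviation
    exceeded `deviation_cut` (a hypothetical cutoff).
    `thresholds` maps (prev_speech, high_deviation) -> threshold on the
    log geometric-mean LR; the values are placeholders, not from the paper.
    """
    high_deviation = prev_deviation > deviation_cut
    eta = thresholds[(bool(prev_speech), bool(high_deviation))]
    return log_lr_geometric_mean(gamma, xi) > eta


if __name__ == "__main__":
    # Placeholder thresholds: a lower threshold after a speech frame makes
    # it easier to stay in the speech state.
    thresholds = {
        (True, True): 0.2,
        (True, False): 0.4,
        (False, True): 0.8,
        (False, False): 1.0,
    }
    gamma = [4.0, 4.0]  # a posteriori SNR per bin (illustrative values)
    xi = [1.0, 1.0]     # a priori SNR per bin (illustrative values)
    print(vad_decision(gamma, xi, prev_speech=True, prev_deviation=0.5,
                       thresholds=thresholds, deviation_cut=1.0))
```

Keeping a separate threshold for each previous-frame state is what makes the rule conditional: the same likelihood-ratio statistic can lead to different decisions depending on what was observed one frame earlier.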
Keywords
Voice activity detection; Deviation; Conditional maximum a posteriori; Statistical model