http://dx.doi.org/10.7776/ASK.2007.26.8.397

Voice Activity Detection Based on SVM Classifier Using Likelihood Ratio Feature Vector  

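Jo, Q-Haing (School of Electronic and Electrical Engineering, Inha University)
Kang, Sang-Ki (Telecommunication R&D Center, Telecommunication Network Business, Samsung Electronics)
Chang, Joon-Hyuk (School of Electronic and Electrical Engineering, Inha University)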
Abstract
In this paper, we apply a support vector machine (SVM), which incorporates an optimized nonlinear decision rule over different sets of feature vectors, to improve the performance of statistical model-based voice activity detection (VAD). The conventional method sets up statistical models for the speech-absence and speech-presence hypotheses and performs VAD by comparing the geometric mean of the likelihood ratios (LRs) of the individual frequency bands extracted from the input signal with a given threshold. We propose a novel SVM-based VAD technique that treats the LRs computed in each frequency bin as the elements of a feature vector, so as to minimize the classification error probability, in place of the conventional decision rule based on the geometric mean. In experiments, the SVM-based VAD using the proposed feature outperformed previously reported VADs in various noise environments.
Keywords
Voice activity detection; Support vector machine; Statistical model; Likelihood ratio
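
The contrast drawn in the abstract between the two decision rules can be illustrated with a minimal sketch. It is not the authors' implementation, and several points are assumptions not stated on this page: the per-bin LR takes the Gaussian-model form commonly used in statistical model-based VAD (with `gamma` the a posteriori SNR and `xi` the a priori SNR), log LRs are used as features for numerical convenience, the SVM uses an RBF kernel, and the support vectors, coefficients, bias, and kernel width are toy values chosen only for illustration.

```python
import math

def log_likelihood_ratios(gamma, xi):
    """Per-bin log LR, assuming the Gaussian statistical model:
    log LR_k = gamma_k * xi_k / (1 + xi_k) - log(1 + xi_k)."""
    return [g * x / (1.0 + x) - math.log(1.0 + x) for g, x in zip(gamma, xi)]

def geometric_mean_vad(log_lrs, threshold):
    """Conventional rule: the log of the geometric mean of the LRs equals
    the arithmetic mean of the log LRs; declare speech above the threshold."""
    return sum(log_lrs) / len(log_lrs) > threshold

def rbf_kernel(u, v, width):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel)."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-dist_sq / (2.0 * width ** 2))

def svm_vad(log_lrs, support_vectors, coeffs, bias, width):
    """SVM-style rule on the whole LR vector:
    f(x) = sum_i coeff_i * K(sv_i, x) + bias, speech if f(x) > 0.
    Each coeff_i stands for alpha_i * y_i, which training would supply."""
    score = sum(c * rbf_kernel(sv, log_lrs, width)
                for sv, c in zip(support_vectors, coeffs))
    return score + bias > 0.0

# Hypothetical toy model: one "speech" and one "noise" support vector.
TOY_SUPPORT_VECTORS = [[6.0] * 4, [0.0] * 4]
TOY_COEFFS = [1.0, -1.0]

# Usage: a frame with high SNR in all four bins.
features = log_likelihood_ratios(gamma=[10.0] * 4, xi=[4.0] * 4)
speech_by_mean = geometric_mean_vad(features, threshold=0.5)
speech_by_svm = svm_vad(features, TOY_SUPPORT_VECTORS, TOY_COEFFS,
                        bias=0.0, width=2.0)
```

The geometric-mean rule collapses the LR vector to a single scalar before thresholding, whereas the SVM rule keeps every bin as a separate feature and can therefore learn a nonlinear boundary in that space. In practice the support vectors and coefficients would be learned from labeled speech and noise frames.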