
Voice Activity Detection Using Global Speech Absence Probability Based on Teager Energy in Noisy Environments  

Park, Yun-Sik (Department of Electronic Engineering, Inha University)
Lee, Sang-Min (Department of Electronic Engineering, Inha University)
Abstract
In this paper, we propose a novel voice activity detection (VAD) algorithm that distinguishes speech from nonspeech in various noisy environments. The global speech absence probability (GSAP), derived from the likelihood ratio (LR) of a statistical model, is widely used as the feature parameter for VAD. However, the conventional GSAP feature is not sufficient to distinguish speech from noise at low signal-to-noise ratios (SNRs). The proposed VAD algorithm uses a GSAP based on Teager energy (TE) as the feature parameter to improve the detection of speech segments in noisy environments. The proposed algorithm is evaluated with objective tests under various noise conditions and yields better results than the conventional methods.
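The two ingredients named in the abstract can be illustrated with a minimal sketch. It assumes the standard discrete Teager energy operator and the usual complex-Gaussian likelihood ratio used in statistical-model-based VAD. The abstract does not state how the paper combines TE with the GSAP, so the two are shown separately here. The function names, the prior `p_h1`, and the inputs `gamma` (a posteriori SNR) and `xi` (a priori SNR) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1]*x[n+1] (interior samples only)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]

def gsap(gamma, xi, p_h1=0.5):
    """Global speech absence probability P(H0 | Y) for one frame.

    gamma, xi: per-bin a posteriori and a priori SNRs (assumed given).
    p_h1: assumed prior probability of speech presence.
    Per-bin LR under the Gaussian model: exp(gamma*xi/(1+xi)) / (1+xi).
    """
    gamma = np.asarray(gamma, dtype=float)
    xi = np.asarray(xi, dtype=float)
    # Sum of per-bin log-likelihood ratios (bins assumed independent).
    log_lr = np.sum(gamma * xi / (1.0 + xi) - np.log1p(xi))
    log_odds = np.log(p_h1 / (1.0 - p_h1)) + log_lr
    # P(H0|Y) = 1 / (1 + exp(log_odds)), evaluated in a numerically stable way.
    return float(np.exp(-np.logaddexp(0.0, log_odds)))
```

A frame would then be declared speech when the GSAP falls below a chosen threshold. For a pure sinusoid of amplitude A and digital frequency w, `teager_energy` returns the constant A^2 sin^2(w), which is why TE responds to both amplitude and frequency content.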
Keywords
Acoustic echo suppression; Noise reduction; Speech absence probability