http://dx.doi.org/10.5916/jkosme.2008.32.5.768

Voice Activity Detection Based on Signal Energy and Entropy-difference in Noisy Environments  

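Ha, Dong-Gyung (Hanyang University, Division of Computer, Control, and Electronic Communications Engineering)
Cho, Seok-Je (Hanyang University)
Jin, Gang-Gyoo (Hanyang University, Division of Computer, Control, and Electronic Communications Engineering)
Shin, Ok-Keun (Hanyang University, Division of Computer, Control, and Electronic Communications Engineering)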
Abstract
In many areas of speech signal processing, such as automatic speech recognition and packet-based voice communication, voice activity detection (VAD) plays an important role in the performance of the overall system. In this paper, we present a new feature parameter for VAD: the product of the signal energy and the difference between two types of entropy. To this end, we first define a Mel filter-bank based entropy and calculate its difference from the conventional entropy in the frequency domain. The difference is then multiplied by the spectral energy of the signal to yield the final feature parameter, which we call PEED (product of energy and entropy difference). Through experiments, we verified that the proposed VAD parameter is more efficient than the conventional spectral-entropy-based parameter across various SNRs and noisy environments.
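The abstract does not give the formulas behind PEED, so the following is only a minimal sketch of how such a per-frame feature could be computed, under several assumptions that are not taken from the paper: Shannon entropy of the normalized power spectrum as the "conventional" entropy, triangular filters on the common 2595 log10(1 + f/700) Mel mapping, an absolute (unnormalized) difference between the two entropies, and a direct DFT in place of an FFT. All function names and default parameters (`peed`, `sample_rate`, `n_bands`) are hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """One-sided power spectrum via a direct DFT (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def entropy(values):
    """Shannon entropy (nats) of non-negative values normalized to sum to 1."""
    total = sum(values)
    if total <= 0.0:
        return 0.0  # assumption: define the entropy of a silent frame as 0
    probs = [v / total for v in values]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_band_energies(power, sample_rate, n_bands):
    """Energies of triangular filters spaced evenly on the Mel scale."""
    n_bins = len(power)
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    # n_bands + 2 edge frequencies in Hz, evenly spaced in Mel
    edges = [mel_to_hz(top * i / (n_bands + 1)) for i in range(n_bands + 2)]
    energies = []
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        band = 0.0
        for k, p in enumerate(power):
            f = k * nyquist / (n_bins - 1)  # centre frequency of bin k
            if lo < f <= mid:
                band += p * (f - lo) / (mid - lo)
            elif mid < f < hi:
                band += p * (hi - f) / (hi - mid)
        energies.append(band)
    return energies


def peed(frame, sample_rate=8000, n_bands=8):
    """Assumed PEED: spectral energy times |Mel entropy - spectral entropy|."""
    power = power_spectrum(frame)
    energy = sum(power)
    h_spec = entropy(power)
    h_mel = entropy(mel_band_energies(power, sample_rate, n_bands))
    return energy * abs(h_mel - h_spec)
```

In a detector built on this sketch, `peed(frame)` would be evaluated on successive short frames and compared against a threshold; how the paper sets that threshold, and whether it normalizes or signs the entropy difference, is not stated in the abstract.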
Keywords
Entropy; Entropy difference; Signal energy; Voice activity detection; VAD; Noise reduction