[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7776/ASK.2015.34.4.310

A Parametric Voice Activity Detection Based on the SPD-TE for Nonstationary Noises

Koo, Boneung (Department of Electronic Engineering, Kyonggi University)

Publication Information

The Journal of the Acoustical Society of Korea / v.34, no.4, 2015, pp. 310-315

Abstract

A single-channel voice activity detection (VAD) algorithm for nonstationary noise environments is proposed. The threshold values on the feature parameter used for the VAD decision are updated adaptively from estimates of the mean and standard deviation of the feature over past non-speech frames. The feature parameter, SPD-TE (Spectral Power Difference-Teager Energy), is obtained by applying the Teager energy operator to the WPD (Wavelet Packet Decomposition) coefficients; it was previously reported to be robust to noise as a VAD feature. Experiments using the TIMIT speech and NOISEX-92 noise databases show that the decision accuracy of the proposed algorithm is comparable to that of several typical VAD algorithms, including standards, for SNR values ranging from 10 dB down to -10 dB.
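The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The discrete Teager energy used here is the standard operator psi[n] = x[n]^2 - x[n-1]*x[n+1]. The threshold rule (mean plus `k` standard deviations), the values of `k` and `history`, and the stand-in feature `frame_feature` (mean absolute Teager energy of the raw frame, with no wavelet packet decomposition) are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import deque
from statistics import mean, pstdev


def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def frame_feature(frame):
    """Hypothetical stand-in feature: mean absolute Teager energy of one frame.

    The paper's SPD-TE is computed from WPD coefficients; that step is omitted here.
    """
    te = teager_energy(frame)
    return sum(abs(v) for v in te) / len(te)


class AdaptiveThresholdVAD:
    """Marks a frame as speech when its feature exceeds mean + k*std
    of the features of recent non-speech frames."""

    def __init__(self, init_noise_features, k=3.0, history=50):
        # k and history are illustrative assumptions, not values from the paper.
        self.k = k
        self.noise = deque(init_noise_features, maxlen=history)

    def threshold(self):
        return mean(self.noise) + self.k * pstdev(self.noise)

    def decide(self, feature):
        is_speech = feature > self.threshold()
        if not is_speech:
            # Only non-speech frames update the noise statistics,
            # so the threshold tracks a slowly changing noise floor.
            self.noise.append(feature)
        return is_speech
```

In use, the detector would be seeded with features from a few frames assumed to contain noise only, then called once per frame with `vad.decide(frame_feature(frame))`. Because the statistics are refreshed only on non-speech decisions, the threshold can follow a drifting noise level without being pulled upward by speech.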

Keywords

Voice activity detection; Speech pause detection; Nonstationary noise; Noise-robustness; Single channel;

Citations & Related Records

Times Cited By KSCI: 1

Reference

1	P. C. Loizou, Speech Enhancement (CRC Press, Boca Raton, 2007), pp. 309-400.
2	J. Sohn, N. S. Kim, and W. Sung, "A statistical model-based voice activity detection," IEEE Signal Process. Lett. 6, 1-3 (1999).
3	ITU, A silence compression scheme for G.729 optimized for terminals conforming to recommendation V.70, ITU-T Recommendation G.729-Annex B (1996).
4	ETSI EN 301 708 V7.1.1 (1999-12), Digital cellular telecommunications system (Phase 2+); VAD for AMR speech traffic channels; General Description (GSM 06.94 version 7.1.1 Release 1998), 13-14 (1999).
5	ETSI ES 202 050, Ver. 1.1.5 (2007-01), Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced front-end feature extraction algorithm; Compression algorithms, Annex A.3 Stage 2-VAD Logic, 42-43 (2007).
6	J. Ramirez, J. C. Segura, C. Benitez, A. Torre, and A. Rubio, "Efficient voice activity detection algorithms using long-term speech information," Speech Commun. 42, 271-287 (2004).
7	A. Davis, S. Nordholm, and R. Togneri, "Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold," IEEE Trans. Audio, Speech and Lang. Processing 14, 412-424 (2006).
8	G. Evangelopoulos and P. Maragos, "Multiband modulation energy tracking for noisy speech detection," IEEE Trans. Audio, Speech and Lang. Processing 14, 2024-2038 (2006).
9	T. V. Pham and T. T. Chien, "Reliable voice activity detection algorithm under adverse environments," in Proc. IEEE Int. Conf. Commun. Electronics, 218-223 (2008).
10	P. K. Ghosh, A. Tsiartas, and S. Narayanan, "Robust voice activity detection using long-term signal variability," IEEE Trans. Audio, Speech and Lang. Processing 19, 600-613 (2011).
11	E. Chuangsuwanich and J. Glass, "Robust voice activity detector for real world application using harmonicity and modulation frequency," in Proc. Interspeech, 2645-2648 (2011).
12	B. Koo, "A single channel voice activity detection for noisy environments using wavelet packet decomposition and Teager energy" (in Korean), J. Acoust. Soc. Kr. 33, 139-145 (2014).
13	J. Garofolo, "TIMIT acoustic-phonetic continuous speech corpus," LDC93S1, Linguistic Data Consortium, Philadelphia, 1993.
14	A. Varga and H. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Commun. 12, 247-251 (1993).

Korean title: 비정체성 잡음을 위한 SPD-TE 기반 계수형 음성 활동 탐지