http://dx.doi.org/10.9717/kmms.2014.17.9.1064

Voice Activity Detection based on Adaptive Band-Partitioning using the Likelihood Ratio  

Kim, Sang-Kyun (Division of Electronic Engineering, Inha Univ.)
Shim, Hyeon-Min (Division of Electronic Engineering, Inha Univ.)
Lee, Sangmin (Division of Electronic Engineering, Inha Univ.)
Abstract
In this paper, we propose a novel approach to improving the performance of voice activity detection (VAD) based on adaptive band-partitioning with the likelihood ratio (LR). The previous adaptive band-partitioning method uses weights derived from the variance of the spectrum. In our VAD algorithm, the weights are derived from the LR and are then combined with the entropy. The proposed algorithm detects voice activity by comparing the weighted entropy with an adaptive threshold. Experimental results show that the proposed algorithm outperforms conventional VAD algorithms, with the largest improvement in non-stationary noise environments.
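The abstract gives no formulas, so the following Python sketch only illustrates the general idea of weighting a band-partitioned spectral entropy by a likelihood ratio and comparing it with an adaptive threshold. It is not the authors' implementation. Everything specific in it is an assumption: the Gaussian statistical-model log LR with a maximum-likelihood a priori SNR estimate, equal-width bands, band-mean log LR as the weights, a noise-only start of `n_init` frames, and a threshold set to a multiple of a smoothed noise level. The function names (`band_log_lr`, `weighted_entropy`, `detect_voice`) and parameters (`margin`, `alpha`) are hypothetical.

```python
import numpy as np

def band_log_lr(power, noise_power, n_bands):
    """Mean log likelihood ratio per band (assumed Gaussian model)."""
    gamma = power / noise_power                  # a posteriori SNR per bin
    xi = np.maximum(gamma - 1.0, 0.0)            # ML a priori SNR estimate (assumption)
    log_lr = gamma * xi / (1.0 + xi) - np.log1p(xi)
    # Equal-width band partition (assumption; the paper's partition is adaptive).
    return np.array([band.mean() for band in np.array_split(log_lr, n_bands)])

def weighted_entropy(power, weights):
    """Band spectral entropy with each band's term scaled by its weight."""
    n_bands = len(weights)
    energy = np.array([band.sum() for band in np.array_split(power, n_bands)])
    p = energy / (energy.sum() + 1e-12)          # band energy as a probability mass
    return float(np.sum(weights * (-p * np.log(p + 1e-12))))

def detect_voice(frames_power, n_bands=8, n_init=10, margin=3.0, alpha=0.95):
    """Return one boolean speech decision per frame of a power spectrogram."""
    frames_power = np.asarray(frames_power, dtype=float)
    # Assume the first n_init frames contain noise only.
    noise_power = frames_power[:n_init].mean(axis=0) + 1e-12
    stats = [weighted_entropy(f, band_log_lr(f, noise_power, n_bands))
             for f in frames_power]
    level = float(np.mean(stats[:n_init]))       # initial noise level of the statistic
    decisions = np.zeros(len(stats), dtype=bool)
    for t in range(n_init, len(stats)):
        decisions[t] = stats[t] > margin * max(level, 1e-6)
        if not decisions[t]:
            # Adapt the threshold only on frames judged to be noise.
            level = alpha * level + (1.0 - alpha) * stats[t]
    return decisions
```

In this sketch the log LR is close to zero in bands dominated by noise, so those bands contribute little to the statistic, while bands carrying speech energy receive large weights. `detect_voice` expects a 2-D array of per-frame power spectra (frames by frequency bins), for example the squared magnitude of a short-time Fourier transform.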
Keywords
Voice Activity Detection; Adaptive Band-Partitioning; Likelihood Ratio
Citations & Related Records
Times Cited By KSCI: 1
1 Y.D. Cho, K. Al-Naimi, and A. Kondoz, "Improved Voice Activity Detection based on a Smoothed Statistical Likelihood Ratio," Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, pp. 7-11, 2001.
2 J.H. Song and S.M. Lee, "Voice Activity Detection based on Generalized Normal-Laplace Distribution Incorporating Conditional MAP," IEICE Transactions on Information and Systems, Vol. E96-D, No. 12, pp. 2888-2891, 2013.
3 Y.S. Park and S. Lee, "Voice Activity Detection using Global Speech Absence Probability based on Teager Energy for Speech Enhancement," IEICE Transactions on Information and Systems, Vol. E95-D, No. 10, pp. 2568-2571, 2012.
4 ITU-T Rec. G.729, Annex B, A Silence Compression Scheme for G.729 Optimized for Terminals Conforming to ITU-T V.70, 1996.
5 J. Sohn, N.S. Kim, and W. Sung, "A Statistical Model-based Voice Activity Detection," IEEE Signal Processing Letters, Vol. 6, No. 1, pp. 1-3, 1999.
6 Y. Ephraim and D. Malah, "Speech Enhancement using a Minimum Mean-square Error Short-time Spectral Amplitude Estimator," IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-32, No. 6, pp. 1109-1121, 1984.
7 B.F. Wu and K.C. Wang, "Robust Endpoint Detection Algorithm based on the Adaptive Band-Partitioning Spectral Entropy in Adverse Environments," IEEE Transactions on Speech and Audio Processing, Vol. 13, No. 5, pp. 762-775, 2005.
8 G.H. Lee, Y.J. Lee, J.H. Cho, and M.N. Kim, "Voice Activity Detection Algorithm using Fuzzy Membership Shifted C-means Clustering in Low SNR Environment," Journal of Korea Multimedia Society, Vol. 17, No. 3, pp. 312-323, 2014.
9 Y. Gao, E. Shlomot, A. Benyassine, J. Thyssen, Huan-yu Su, and C. Murgia, "The SMV Algorithm Selected by TIA and 3GPP2 for CDMA Applications," Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, pp. 709-712, 2001.
10 3GPP2 Spec., Source-controlled Variable-rate Multimode Wideband Speech Codec (VMR-WB), Service Options 62 and 63 for Spread Spectrum Systems, 3GPP2-C.S0052-A, v.1.0, 2005.