http://dx.doi.org/10.13067/JKIECS.2015.10.8.901

An Efficient Voice Activity Detection Method using Bi-Level HMM  

Jang, Guang-Woo (School of Robotics, Kwangwoon University)
Jeong, Mun-Ho (School of Robotics, Kwangwoon University)
Publication Information
The Journal of the Korea institute of electronic communication sciences / v.10, no.8, 2015, pp. 901-906
Abstract
We present a method for voice activity detection (VAD) using a bi-level HMM. Conventional methods require additional post-processing or rule-based delayed frames. To address this problem, we apply to VAD a bi-level HMM, which inserts an additional state layer into a typical HMM, and we use the posterior ratio of the voice states to detect voice periods. Using Mel-frequency cepstral coefficients (MFCCs) as observation vectors, we performed experiments on voice data at different SNRs and achieved satisfactory results compared with well-known methods.
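The idea of deciding voice activity from a posterior ratio can be illustrated with a minimal sketch. This is not the bi-level HMM of the paper: it assumes an ordinary two-state HMM (non-voice and voice), a scalar Gaussian emission per state in place of MFCC vectors, and hypothetical transition, mean, variance, and threshold values chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian (a stand-in for an MFCC emission model)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posterior_ratio_vad(obs, trans, init, means, variances, threshold=1.0):
    """Forward-filter a 2-state HMM (0 = non-voice, 1 = voice).

    A frame is flagged as voice when the filtered posterior ratio
    P(voice | o_1..t) / P(non-voice | o_1..t) exceeds `threshold`.
    All parameters are illustrative assumptions, not values from the paper.
    """
    decisions, ratios = [], []
    belief = list(init)
    for t, x in enumerate(obs):
        if t > 0:
            # Predict: push the previous filtered belief through the transitions.
            belief = [sum(belief[i] * trans[i][j] for i in range(2)) for j in range(2)]
        # Update: weight by the emission likelihood, then normalise.
        unnorm = [belief[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(2)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        ratio = belief[1] / max(belief[0], 1e-300)  # guard against division by zero
        ratios.append(ratio)
        decisions.append(ratio > threshold)
    return decisions, ratios


# Toy 1-D feature track: low values for silence, values near 3 for voice.
obs = [0.1, -0.2, 0.0, 2.9, 3.2, 3.1, 0.2, -0.1]
trans = [[0.9, 0.1], [0.1, 0.9]]  # hypothetical "sticky" transition matrix
decisions, ratios = posterior_ratio_vad(
    obs, trans, init=[0.5, 0.5], means=[0.0, 3.0], variances=[1.0, 1.0]
)
print(decisions)
```

In this sketch the transition probabilities carry evidence across frames, so the decision for a frame depends on its neighbours as well as on its own likelihood. The paper goes further by adding a state layer to the HMM, which this example does not attempt to reproduce.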
Keywords
Voice Activity Detection; Bi-Level HMM; Posterior Ratio; MFCC