
Statistical Voice Activity Detector Based on Signal Subspace Model

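Ryu, Kwang-Chun (Department of Electronics and Computer Engineering, Chonnam National University)
Kim, Dong-Kook (Department of Electronics and Computer Engineering, Chonnam National University)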
Abstract
Voice activity detectors (VADs) are important in wireless communication and speech signal processing. In conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived in the discrete Fourier transform (DFT) domain. Speech or noise is then decided by comparing the value of this expression with a threshold. This paper presents a new statistical VAD method based on a signal subspace approach. Probabilistic principal component analysis (PPCA) is employed to obtain a signal subspace model that incorporates a probabilistic model of the noisy signal into the signal subspace method. The proposed approach provides a novel decision rule based on the LRT in the signal subspace domain. Experimental results show that the proposed signal-subspace-model-based VAD method outperforms methods based on the widely used Gaussian distribution in the DFT domain.
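As a rough illustration of the conventional baseline mentioned in the abstract (not the proposed subspace method), the sketch below computes a DFT-domain log likelihood ratio under complex Gaussian models for speech and noise and compares its average over frequency bins with a threshold. The function names, the assumption that per-bin noise and speech variances are already known, and the default threshold of 0 are illustrative choices, not taken from the paper.

```python
import math

def gaussian_log_lrt(power_spectrum, noise_var, speech_var):
    """Mean log likelihood ratio over DFT bins (hypothetical helper).

    Assumes each DFT coefficient is zero-mean complex Gaussian with
    variance noise_var[k] under noise-only and
    noise_var[k] + speech_var[k] under speech-plus-noise.
    """
    total = 0.0
    for p, lam_n, lam_s in zip(power_spectrum, noise_var, speech_var):
        xi = lam_s / lam_n     # a priori SNR (assumed known here)
        gamma = p / lam_n      # a posteriori SNR from the observed power
        # Per-bin log likelihood ratio for the two Gaussian hypotheses.
        total += gamma * xi / (1.0 + xi) - math.log1p(xi)
    return total / len(power_spectrum)

def is_speech(power_spectrum, noise_var, speech_var, threshold=0.0):
    """Declare speech when the mean log LRT exceeds the threshold."""
    return gaussian_log_lrt(power_spectrum, noise_var, speech_var) > threshold

# Toy frames with 4 bins: one at the noise level, one well above it.
noise_frame = [1.0, 1.0, 1.0, 1.0]
loud_frame = [10.0, 10.0, 10.0, 10.0]
unit = [1.0, 1.0, 1.0, 1.0]
print(is_speech(noise_frame, unit, unit))
print(is_speech(loud_frame, unit, unit))
```

In practice the variances would have to be estimated from the signal (for example, noise statistics from non-speech frames), and the threshold tuned to trade false alarms against missed speech; both are outside the scope of this sketch.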
Keywords
Voice Activity Detection; Gaussian Distribution; Likelihood Ratio Test