Statistical Model-Based Voice Activity Detection Based on Second-Order Conditional MAP with Soft Decision

Chang, Joon-Hyuk

doi:10.4218/etrij.12.0111.0344

ETRI Journal

Volume 34 Issue 2 / Pages 184-189 / 2012 / 1225-6463 (pISSN) / 2233-7326 (eISSN)

Electronics and Telecommunications Research Institute (한국전자통신연구원)


Chang, Joon-Hyuk (Department of Electronic Engineering, Hanyang University)

Received : 2011.06.01
Accepted : 2011.10.19
Published : 2012.04.04

https://doi.org/10.4218/etrij.12.0111.0344

Abstract

In this paper, we propose a novel approach to statistical model-based voice activity detection (VAD) that incorporates a second-order conditional maximum a posteriori (CMAP) criterion. As a technical improvement over the first-order CMAP criterion in [1], we consider both the current observation and the voice activity decisions in the previous two frames in order to exploit the interframe correlation of voice activity more fully. Whereas the previous approach [1] conditions on the decision in the previous single frame (first order), the second-order CMAP conditions on the decisions in the previous two frames, which yields four thresholds and therefore an additional degree of freedom. In addition, a soft-decision scheme is incorporated, which makes the thresholds time-varying and further improves performance. Experimental results show that the proposed algorithm outperforms the conventional CMAP-based VAD technique under various experimental conditions.
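The idea of selecting one of four thresholds from the previous two decisions can be illustrated with a minimal Python sketch. It is not the paper's implementation: the per-frame likelihood-ratio score, the threshold values in `THRESHOLDS`, the function names, and in particular the probability-weighted blend used for the soft-decision threshold are all assumptions made here for illustration, since the abstract does not give the formulas.

```python
from typing import Dict, List, Sequence, Tuple

State = Tuple[int, int]  # (decision at frame n-2, decision at frame n-1)

# Hypothetical thresholds, one per combination of the previous two decisions.
# The values are illustrative only and are not taken from the paper.
THRESHOLDS: Dict[State, float] = {
    (0, 0): 2.0,
    (0, 1): 1.0,
    (1, 0): 1.2,
    (1, 1): 0.5,
}


def second_order_cmap_vad(
    likelihood_ratios: Sequence[float],
    thresholds: Dict[State, float] = THRESHOLDS,
    initial_state: State = (0, 0),
) -> List[int]:
    """Hard-decision sketch: compare each frame's score with the threshold
    selected by the decisions made for the previous two frames."""
    prev2, prev1 = initial_state
    decisions: List[int] = []
    for score in likelihood_ratios:
        decision = 1 if score > thresholds[(prev2, prev1)] else 0
        decisions.append(decision)
        # Slide the two-frame decision history forward by one frame.
        prev2, prev1 = prev1, decision
    return decisions


def soft_threshold(
    p_prev2: float,
    p_prev1: float,
    thresholds: Dict[State, float] = THRESHOLDS,
) -> float:
    """Assumed soft-decision variant: blend the four thresholds using the
    speech probabilities of the previous two frames, so the effective
    threshold varies from frame to frame."""
    weights: Dict[State, float] = {
        (0, 0): (1.0 - p_prev2) * (1.0 - p_prev1),
        (0, 1): (1.0 - p_prev2) * p_prev1,
        (1, 0): p_prev2 * (1.0 - p_prev1),
        (1, 1): p_prev2 * p_prev1,
    }
    return sum(weights[state] * thresholds[state] for state in thresholds)


if __name__ == "__main__":
    scores = [0.1, 2.5, 1.1, 0.8, 0.3, 0.3]
    print(second_order_cmap_vad(scores))
    print(soft_threshold(0.5, 0.5))
```

With hard previous decisions (probabilities of exactly 0 or 1), `soft_threshold` reduces to a lookup in the four-entry table, so in this sketch the soft variant is a smooth generalization of the hard one.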

References

[1] J.W. Shin et al., "Voice Activity Detection Based on Conditional MAP Criterion," IEEE Signal Process. Lett., vol. 15, Feb. 2008, pp. 257-260. https://doi.org/10.1109/LSP.2008.917027
[2] L.R. Rabiner and M.R. Sambur, "Voiced-Unvoiced-Silence Detection Using the Itakura LPC Distance Measure," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process., May 1977, pp. 323-326.
[3] J.A. Haigh and J.S. Mason, "Robust Voice Activity Detection Using Cepstral Features," Proc. IEEE TENCON, vol. 3, Oct. 1993, pp. 321-324.
[4] K. Srinivasan and A. Gersho, "Voice Activity Detection for Cellular Networks," Proc. IEEE Works. Speech Coding Telecommun., Oct. 1993, pp. 85-86.
[5] Y. Ephraim and D. Malah, "Speech Enhancement Using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator," IEEE Trans. Acoustics, Speech, Signal Process., vol. ASSP-32, no. 6, Dec. 1984, pp. 1109-1121.
[6] Y.D. Cho, K. Al-Naimi, and A. Kondoz, "Improved Voice Activity Detection Based on a Smoothed Statistical Likelihood Ratio," Proc. IEEE Int. Conf. Acoustics, Speech, Signal Process., vol. 2, May 2001, pp. 737-740.
[7] J. Sohn, N.S. Kim, and W. Sung, "A Statistical Model-Based Voice Activity Detection," IEEE Signal Process. Lett., vol. 6, no. 1, Jan. 1999, pp. 1-3.
[8] J.-H. Chang, N.S. Kim, and S.K. Mitra, "Voice Activity Detection Based on Multiple Statistical Models," IEEE Trans. Signal Process., vol. 54, no. 6, June 2006, pp. 1965-1976. https://doi.org/10.1109/TSP.2006.874403
[9] J. Ramirez et al., "Statistical Voice Activity Detection Using a Multiple Observation Likelihood Ratio Test," IEEE Signal Process. Lett., vol. 12, no. 10, Oct. 2005, pp. 689-692. https://doi.org/10.1109/LSP.2005.855551
[10] J.-H. Chang, J.W. Shin, and N.S. Kim, "Likelihood Ratio Test with Complex Laplacian Model for Voice Activity Detection," Proc. Eurospeech, Aug. 2003, pp. 1065-1068.
[11] J.-H. Chang et al., "Global Soft Decision Employing Support Vector Machine for Speech Enhancement," IEEE Signal Process. Lett., vol. 16, no. 1, Jan. 2009, pp. 57-60. https://doi.org/10.1109/LSP.2008.2008574
[12] P.C. Loizou, Speech Enhancement: Theory and Practice, CRC Press, 2007.
[13] ITU-T, "A Silence Compression Scheme for G.729 Optimised for Terminals Conforming to Recommendation V.70," ITU-T Rec. G.729, Annex B, 1996.

Cited by

iVisher: Real-Time Detection of Caller ID Spoofing vol.36, pp.5, 2014, https://doi.org/10.4218/etrij.14.0113.0798
