Voice Activity Detection Technology for Voice Communication on Smartphones

  • Published : 2012.03.30

Abstract

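This article reviews voice activity detection (VAD) techniques for the variable-rate speech codecs used in voice communication on smartphones. The techniques covered derive a decision rule that determines whether speech is present by applying a likelihood ratio test (LRT) based on a statistical model. Subsequent work has proposed new methods to improve the reliability of statistical model-based VAD, and this article also introduces statistical model-based VAD methods that remain under active study.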
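The LRT-based decision can be illustrated with a minimal sketch. It assumes the complex Gaussian model commonly used in this line of work, in which the log likelihood ratio for frequency bin k is gamma_k * xi_k / (1 + xi_k) - log(1 + xi_k), with gamma_k the a posteriori SNR and xi_k the a priori SNR. The function names, the maximum-likelihood estimate xi_k = max(gamma_k - 1, 0), and the threshold value are illustrative assumptions, not details taken from this article.

```python
import math


def frame_log_lrt(noisy_power, noise_power):
    """Mean per-bin log likelihood ratio for one frame.

    noisy_power: power spectrum of the observed frame, one value per bin.
    noise_power: estimated noise power per bin (must be positive).
    """
    if not noisy_power or len(noisy_power) != len(noise_power):
        raise ValueError("spectra must be non-empty and of equal length")

    total = 0.0
    for x, n in zip(noisy_power, noise_power):
        gamma = x / n                # a posteriori SNR
        xi = max(gamma - 1.0, 0.0)   # ML a priori SNR estimate (assumption)
        # Log likelihood ratio of "speech present" vs. "noise only"
        # under the Gaussian model.
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total / len(noisy_power)


def is_speech(noisy_power, noise_power, threshold=0.5):
    """Declare speech when the averaged log LR exceeds the threshold.

    The threshold is a hypothetical tuning value, not one from the article.
    """
    return frame_log_lrt(noisy_power, noise_power) > threshold
```

When the observed spectrum equals the noise estimate, every bin contributes zero and the frame is classified as noise. A practical detector would typically smooth the a priori SNR over time and add a hangover scheme, both of which this sketch omits.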
