Detection and Recognition Method for Emergency and Non-emergency Speech by Gaussian Mixture Model

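  • 조영임 (Department of Computer Science, The University of Suwon)
  • 이대종 (School of Electrical, Electronic and Computer Engineering, Chungbuk National University)
  • Received: 2011.03.02
  • Reviewed: 2011.04.04
  • Published: 2011.04.25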

Abstract

Emergencies can occur at any moment, and monitoring for them continuously using only CCTV video is costly in both personnel and expense. To detect emergencies in a CCTV environment, this paper proposes a method based on Gaussian mixture models (GMMs) for detecting and recognizing emergency and non-emergency words. A global GMM first determines whether the input speech is an emergency word or a general (non-emergency) word. If the input is judged to be an emergency word, a local GMM then recognizes which emergency word was spoken. The validity of the proposed method was verified by applying it to emergency and general words recorded under various environmental conditions.
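The two-stage decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the features, the mixture sizes, or how the global and local models are structured. The sketch assumes one diagonal-covariance GMM per class, a maximum-likelihood decision at each stage, and synthetic 2-D points standing in for per-frame acoustic features. The names `DiagGMM`, `classify`, `synth`, and the example words "help" and "fire" are hypothetical.

```python
import math
import random


class DiagGMM:
    """Diagonal-covariance Gaussian mixture trained with a few EM iterations."""

    def __init__(self, n_components=2, n_iter=25, var_floor=1e-3, seed=0):
        self.k, self.n_iter, self.var_floor = n_components, n_iter, var_floor
        self.rng = random.Random(seed)

    def _log_joint(self, x):
        # log w_j + log N(x | mu_j, diag(var_j)) for each component j
        out = []
        for w, mu, var in zip(self.weights, self.means, self.vars):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll -= 0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            out.append(ll)
        return out

    @staticmethod
    def _logsumexp(vals):
        top = max(vals)
        return top + math.log(sum(math.exp(v - top) for v in vals))

    def fit(self, frames):
        n, dim = len(frames), len(frames[0])
        # Initialise means from randomly chosen training frames.
        self.means = [list(f) for f in self.rng.sample(frames, self.k)]
        self.vars = [[1.0] * dim for _ in range(self.k)]
        self.weights = [1.0 / self.k] * self.k
        for _ in range(self.n_iter):
            # E-step: posterior responsibility of each component for each frame
            resp = []
            for x in frames:
                lj = self._log_joint(x)
                norm = self._logsumexp(lj)
                resp.append([math.exp(v - norm) for v in lj])
            # M-step: re-estimate weights, means, and (floored) variances
            for j in range(self.k):
                nj = sum(r[j] for r in resp) + 1e-12
                mu = [sum(r[j] * x[d] for r, x in zip(resp, frames)) / nj
                      for d in range(dim)]
                self.vars[j] = [
                    max(self.var_floor,
                        sum(r[j] * (x[d] - mu[d]) ** 2
                            for r, x in zip(resp, frames)) / nj)
                    for d in range(dim)]
                self.means[j] = mu
                self.weights[j] = nj / n
        return self

    def score(self, frames):
        # Average per-frame log-likelihood of an utterance
        return sum(self._logsumexp(self._log_joint(x)) for x in frames) / len(frames)


def classify(frames, global_models, local_models):
    """Stage 1: emergency vs. non-emergency. Stage 2: which emergency word."""
    label = max(global_models, key=lambda name: global_models[name].score(frames))
    if label != "emergency":
        return label, None
    word = max(local_models, key=lambda w: local_models[w].score(frames))
    return label, word


data_rng = random.Random(42)


def synth(center, n=60):
    """Hypothetical stand-in for per-frame acoustic features of one utterance."""
    return [tuple(c + data_rng.gauss(0, 0.5) for c in center) for _ in range(n)]


# Assumed training data: two emergency words and a pool of general speech.
train = {"help": synth((3, 3)), "fire": synth((3, -3))}
general = synth((-3, 0))

global_models = {
    "emergency": DiagGMM(2).fit(train["help"] + train["fire"]),
    "non-emergency": DiagGMM(2).fit(general),
}
local_models = {word: DiagGMM(1).fit(frames) for word, frames in train.items()}

print(classify(synth((3, -3), 30), global_models, local_models))
print(classify(synth((-3, 0), 30), global_models, local_models))
```

In this sketch the local models are evaluated only when the global stage reports an emergency, which mirrors the cascade in the abstract and avoids scoring every word model on general speech.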
