[KSCI] Korea Science Citation Index Service

Robust Endpoint Detection for Bimodal System in Noisy Environments

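오현화 (School of Electronic and Electrical Engineering, Kyungpook National University)
권홍석 (School of Electronic and Electrical Engineering, Kyungpook National University)
손종목 (School of Electronic and Electrical Engineering, Kyungpook National University)
진성일 (School of Electronic and Electrical Engineering, Kyungpook National University)
배건성 (School of Electronic and Electrical Engineering, Kyungpook National University)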

Publication Information

Journal of the Institute of Electronics Engineers of Korea CI / v.40, no.5, 2003, pp. 289-297

Abstract

The performance of a bimodal system depends on the accuracy of endpoint detection from the input signal as well as on the performance of the speech recognition or lipreading system. In this paper, we propose an endpoint detection method that detects endpoints from the audio and video signals separately and uses the signal-to-noise ratio (SNR) estimated from the input audio signal to select the endpoints that are more reliable under acoustic noise. In other words, the endpoints are taken from the audio signal when the SNR is high and from the video signal when the SNR is low. Experimental results show that the bimodal system using the proposed endpoint detector achieves satisfactory recognition rates, especially when the acoustic environment is quite noisy.
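
The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the SNR is estimated or where the high/low boundary lies, so the estimator below (which treats the leading samples as noise only), the 10 dB threshold, and the names estimate_snr_db and select_endpoints are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

Endpoints = Tuple[int, int]  # (start_frame, end_frame)


def estimate_snr_db(samples: Sequence[float], noise_len: int) -> float:
    """Rough SNR estimate in dB.

    Assumption (not from the paper): the first `noise_len` samples
    contain background noise only, and the rest contain speech plus noise.
    """
    if not 0 < noise_len < len(samples):
        raise ValueError("noise_len must be between 1 and len(samples) - 1")
    noise_power = sum(x * x for x in samples[:noise_len]) / noise_len
    rest = samples[noise_len:]
    total_power = sum(x * x for x in rest) / len(rest)
    # Subtract the noise floor; clamp to avoid log of zero or a negative value.
    signal_power = max(total_power - noise_power, 1e-12)
    return 10.0 * math.log10(signal_power / max(noise_power, 1e-12))


def select_endpoints(
    audio_endpoints: Endpoints,
    video_endpoints: Endpoints,
    snr_db: float,
    threshold_db: float = 10.0,  # hypothetical threshold, not from the paper
) -> Endpoints:
    """Use audio-based endpoints at high SNR, video-based ones at low SNR."""
    return audio_endpoints if snr_db >= threshold_db else video_endpoints


if __name__ == "__main__":
    # Synthetic audio: quiet lead-in followed by a much louder segment.
    clean = [0.1, -0.1] * 50 + [1.0, -1.0] * 50
    snr = estimate_snr_db(clean, noise_len=100)
    print(select_endpoints((10, 50), (12, 55), snr))
```

In this sketch both detectors are assumed to have already produced their endpoints. Only the final choice depends on the SNR, which matches the abstract's description of detecting endpoints from each modality and then selecting between them.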

Keywords

endpoint detection; bimodal system; lipreading; audio/video database

