A Study on Combining Bimodal Sensors for Robust Speech Recognition


The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 20, Issue 6 / Pages 51-56 / 2001 / pISSN 1225-4428 / eISSN 2287-3775

The Acoustical Society of Korea (한국음향학회)

강인한 음성인식을 위한 이중모드 센서의 결합방식에 관한 연구

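이철우 (Department of Electronic Engineering, Hongik University)
계영철 (Department of Electronic Engineering, Hongik University)
고인선 (Department of Electronic Engineering, Hongik University)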

Published : 2001.08.01


Abstract

Recent research has focused on jointly using lip motion and speech to achieve reliable speech recognition in noisy environments. To this end, this paper proposes a method that combines a visual speech recognizer with a conventional (acoustic) speech recognizer by weighting the output of each. In particular, we propose a method for determining the weights automatically according to the amount of noise in the input speech. The correlation between adjacent speech samples and the residual error of the LPC analysis are used to determine the weights. Simulation results show that a recognizer combined in this way achieves a recognition rate of about 83% even in severely noisy environments.
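
The weighting idea in the abstract can be illustrated with a small sketch. The abstract names the two noise indicators (correlation between adjacent samples, LPC residual error) but does not give the formula that turns them into weights, so the mapping in `audio_weight` below is purely an assumption for illustration, as are all function names and the default LPC order. The intuition is that clean speech is strongly correlated from sample to sample and well predicted by LPC, whereas noise lowers the correlation and raises the residual.

```python
def autocorr(x, lag):
    """Biased (unnormalized) autocorrelation of sequence x at the given lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def lag1_correlation(x):
    """Normalized correlation between adjacent samples, in [-1, 1]."""
    r0 = autocorr(x, 0)
    return autocorr(x, 1) / r0 if r0 > 0 else 0.0


def lpc_residual_ratio(x, order=8):
    """Normalized LPC prediction error E_p / r[0], via Levinson-Durbin.

    Values near 0 mean the frame is highly predictable; values near 1
    mean it looks like white noise.
    """
    r = [autocorr(x, k) for k in range(order + 1)]
    if r[0] <= 0:
        return 1.0
    a = [0.0] * (order + 1)   # predictor coefficients a[1..order]
    err = r[0]                # prediction error energy E_0
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err         # reflection coefficient k_i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k    # E_i = (1 - k_i^2) * E_{i-1}
        if err <= 1e-12 * r[0]:
            break             # frame is (numerically) perfectly predictable
    return max(err, 0.0) / r[0]


def audio_weight(x, order=8):
    """Hypothetical mapping from the two noise indicators to a weight in [0, 1].

    ASSUMPTION: the paper's actual mapping is not stated in the abstract;
    this simply averages the two indicators.
    """
    rho = max(lag1_correlation(x), 0.0)
    predictability = 1.0 - lpc_residual_ratio(x, order)
    return min(max(0.5 * (rho + predictability), 0.0), 1.0)


def combine(audio_scores, visual_scores, lam):
    """Pick the word with the best weighted sum of per-word scores.

    audio_scores / visual_scores: dicts mapping word -> score (for example
    log-likelihoods); lam is the weight given to the acoustic recognizer.
    """
    fused = {w: lam * audio_scores[w] + (1.0 - lam) * visual_scores[w]
             for w in audio_scores}
    return max(fused, key=fused.get)
```

In use, `audio_weight` would be computed on the incoming speech frame and passed as `lam` to `combine`, so that the visual recognizer dominates as the acoustic signal degrades. A linear weighted sum of scores is only one possible fusion rule; the source does not say which rule the authors used.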
