Effective Feature Vector for Isolated-Word Recognizer using Vocal Cord Signal  

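Jung, Young-Giu (Smart Interface Research Team, Electronics and Telecommunications Research Institute)
Han, Mun-Sung (Smart Interface Research Team, Electronics and Telecommunications Research Institute)
Lee, Sang-Jo (Department of Computer Engineering, Kyungpook National University)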
Abstract
In this paper, we develop a speech recognition system that uses a throat microphone. This kind of microphone minimizes the impact of environmental noise. However, because throat microphone signals lack high frequencies and partially lose formant frequencies, previous systems built on such devices have shown lower recognition rates than systems that use standard microphone signals. As a result, researchers have used throat microphone signals mainly as a supplementary data source supporting standard microphone signals. We present a high-performance ASR system developed using only a throat microphone, taking advantage of Korean Phonological Feature Theory and a detailed throat signal analysis. By analyzing the spectrum and the FFT of the throat microphone signal, we find that the conventional MFCC feature vector, which uses a critical pass filter, does not characterize throat microphone signals well. We also describe the conditions that a feature extraction algorithm must meet to suit throat microphone signal analysis: (1) a sensitive band-pass filter and (2) a feature vector suitable for voice/non-voice classification. We show experimentally that the ZCPA algorithm, designed to meet these conditions, improves the recognizer's performance by approximately 16%. We also find that an additional noise-canceling algorithm such as RASTA yields a further 2% improvement.
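As an illustration of the ZCPA (zero-crossings with peak amplitudes) idea named in the abstract, the following is a minimal Python sketch for a single sub-band. It is not the authors' implementation. The function name `zcpa_histogram`, the bin edges, and the `log(1 + peak)` compression are assumptions made for this example, and the band-pass filter bank is omitted: the input is assumed to be one frame that has already been band-pass filtered.

```python
import math
from bisect import bisect_right


def zcpa_histogram(band_signal, sample_rate, bin_edges_hz):
    """Accumulate a ZCPA-style frequency histogram for one band-passed frame.

    Hypothetical helper: for each pair of successive upward zero crossings,
    the inverse of the interval gives a frequency estimate, and the
    log-compressed peak amplitude in between is added to that frequency's bin.
    """
    hist = [0.0] * (len(bin_edges_hz) - 1)
    # Indices i where the signal goes from negative to non-negative.
    crossings = [
        i for i in range(1, len(band_signal))
        if band_signal[i - 1] < 0.0 <= band_signal[i]
    ]
    for start, end in zip(crossings, crossings[1:]):
        freq = sample_rate / (end - start)   # inverse of the interval, in Hz
        peak = max(band_signal[start:end])   # peak between the two crossings
        bin_index = bisect_right(bin_edges_hz, freq) - 1
        if 0 <= bin_index < len(hist):
            # Assumed compression; other monotonic functions could be used.
            hist[bin_index] += math.log(1.0 + peak)
    return hist
```

In a full feature extractor, a histogram like this would be computed for every sub-band of the filter bank and summed across sub-bands to form the feature vector for the frame.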
Keywords
Throat microphone; Throat microphone signal analysis; Isolated-word recognition system; Korean Phonological Feature Theory; ZCPA; MFCC;