
Implementation of HMM Based Speech Recognizer with Medium Vocabulary Size Using TMS320C6201 DSP  

Jung, Sung-Yun (Telecom Examination Div., The Korean Intellectual Property Office)
Son, Jong-Mok (Application Technology Research Department, National Security Research Institute)
Bae, Keun-Sung (School of Electronic and Electrical Engineering, Kyungpook National University)
Abstract
This paper addresses the real-time implementation of a medium-vocabulary speech recognition system intended for use in mobile phones. We first developed a PC-based variable-vocabulary word recognizer, keeping the program memory and the total size of the acoustic models as small as possible. To reduce the memory required by the acoustic models, linear discriminant analysis was applied in feature selection and phonetic tied mixtures were used in training the HMMs. In addition, state-based Gaussian selection and real-time cepstral normalization were used to reduce the computational load and to make recognition more robust. We then verified real-time operation of the recognizer on a TMS320C6201 EVM board. The implemented system uses about 610 kbytes of memory in total, including both program memory and data memory. The recognition rate was 95.86% on the ETRI 445DB, and 96.4%, 97.92%, and 87.04% on three name databases collected over mobile phones.
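The abstract does not say how the real-time cepstral normalization is computed. As a rough illustration only, the sketch below shows one common way to normalize cepstra frame by frame: subtract a recursively updated mean, so that no look-ahead over the whole utterance is needed. The class name `RunningCepstralNormalizer`, the smoothing factor `alpha`, and the choice to seed the mean with the first frame are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence


class RunningCepstralNormalizer:
    """Frame-by-frame cepstral mean normalization (illustrative sketch).

    The mean is tracked with a first-order recursion, so each frame can be
    normalized as soon as it arrives. This is a generic technique and is
    NOT claimed to be the exact method used in the paper.
    """

    def __init__(self, dim: int, alpha: float = 0.995) -> None:
        if dim <= 0:
            raise ValueError("dim must be positive")
        if not 0.0 <= alpha < 1.0:
            raise ValueError("alpha must be in [0, 1)")
        self.alpha = alpha          # hypothetical smoothing factor
        self.mean = [0.0] * dim     # running estimate of the cepstral mean
        self.initialized = False

    def normalize(self, frame: Sequence[float]) -> List[float]:
        """Update the running mean with `frame` and return frame - mean."""
        if len(frame) != len(self.mean):
            raise ValueError("frame dimension mismatch")
        if not self.initialized:
            # Assumption: seed the mean with the first frame.
            self.mean = list(frame)
            self.initialized = True
        else:
            a = self.alpha
            self.mean = [a * m + (1.0 - a) * c
                         for m, c in zip(self.mean, frame)]
        return [c - m for c, m in zip(frame, self.mean)]
```

With a value of `alpha` close to 1 the mean adapts slowly and mainly removes a stationary channel offset; smaller values track changes faster at the cost of also removing more of the speech signal's own variation.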
Keywords
HMM; Speech Recognizer; TMS320C6201; Cepstral Normalization
References
1 Markus Lieb, Reinhold Haeb-Umbach, 'LDA derived Cepstral Trajectory Filters in Adverse Environmental Conditions,' International Conference on Acoustics, Speech, and Signal Processing, 2000
2 B.H. Juang, L.R. Rabiner, 'The Segmental K-Means Algorithm for Estimating Parameters of Hidden Markov Models,' IEEE Trans. on Acoustics, Speech, and Signal Processing, 38 (9) 1639-1641, 1990
3 Fu-Hua Liu, Richard M. Stern, Xuedong Huang, Alejandro Acero, 'Efficient Cepstral Normalization for Robust Speech Recognition,' Proc. of the Sixth ARPA Workshop on Human Language Technology, 1993
4 Y. Zhao, 'A Speaker-Independent Continuous Speech Recognition System Using Continuous Mixture Gaussian Density HMM of Phoneme-Sized Units,' IEEE Trans. on Speech and Audio Processing, 1 (3) 345-361, 1993
5 Akinobu Lee, Tatsuya Kawahara, Kiyohiro Shikano, 'A New Phonetic Tied-Mixture Model for Efficient Decoding,' International Conference on Acoustics, Speech, and Signal Processing, 3 (2) 1269-1271, 2000
6 Alejandro Acero, Acoustical and Environmental Robustness in Automatic Speech Recognition, (Ph.D. thesis, Carnegie Mellon University, 1990)
7 Mark J.F. Gales, Katherine M. Knill, 'State-Based Gaussian Selection in Large Vocabulary Continuous Speech Recognition Using HMMs,' IEEE Trans. on Speech and Audio Processing, 7 (2) 152-161, 1999
8 R. Haeb-Umbach, H. Ney, 'Linear Discriminant Analysis for Improved Large Vocabulary Continuous Speech Recognition,' International Conference on Acoustics, Speech, and Signal Processing, 1, 13-16, 1992
9 Texas Instruments, TMS320C6000 Programmer's Guide, 2000