A Study on Design and Implementation of Speech Recognition System Using ART2 Algorithm

  • Kim, Joeng Hoon (Department of Electronic Communication Eng, Korea Maritime University) ;
  • Kim, Dong Han (Department of Electronic Communication Eng, Korea Maritime University) ;
  • Jang, Won Il (Department of Electronic Communication Eng, Korea Maritime University) ;
  • Lee, Sang Bae (Department of Electronic Communication Eng, Korea Maritime University)
  • Published: 2004.09.01

Abstract

In this research, we chose speech recognition as the sole means of controlling an electric wheelchair system and used DTW (Dynamic Time Warping), a speaker-dependent method with a relatively high recognition rate among speech recognition techniques. For real-time operation, however, the system must use little memory and process quickly. We therefore introduced VQ (Vector Quantization), which is widely used as a compression algorithm in speaker-independent recognition, to secure fast recognition with a small memory footprint. However, the recognition rate decreased after VQ was introduced. To recover it, we applied the ART2 (Adaptive Resonance Theory 2) algorithm as a post-processing step and obtained an improvement of about 5% in the recognition rate. Using ART2 requires an error range: it is applied when, among the distances computed by DTW, the second distance minus the first distance is 20 or more. With ART2 applied in this way, we obtained both fast processing and a high recognition rate. Moreover, since the system is mounted on a moving object, it has to be implemented as an embedded system. We therefore selected the TMS320C32 chip, which can perform a large number of calculations relatively quickly. Because the stored data is speech, we used 128 KB of RAM and 64 KB of ROM to hold the large amount of data. For speech input, we used a 16-bit stereo audio codec, whose high resolution yields relatively accurate data.
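The distance test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the "first" and "second" distances are the smallest and second-smallest DTW distances over the stored templates, uses a textbook DTW with Euclidean frame distance, and omits the VQ and ART2 stages entirely. The names `dtw_distance`, `top_two_margin`, and `templates` are hypothetical; only the threshold of 20 comes from the abstract.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two sequences of feature vectors (lists of floats)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumption)
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def top_two_margin(utterance, templates, threshold=20.0):
    """Return (best_label, margin, margin_reached) for one utterance.

    margin is the second-smallest DTW distance minus the smallest one;
    margin_reached reports whether it is at least `threshold`
    (20 in the abstract). Requires at least two templates.
    """
    scored = sorted((dtw_distance(utterance, t), label)
                    for label, t in templates.items())
    (d1, best), (d2, _) = scored[0], scored[1]
    margin = d2 - d1
    return best, margin, margin >= threshold


# Toy one-dimensional "features"; real input would be per-frame speech features.
templates = {
    "go": [[0.0], [1.0], [2.0]],
    "stop": [[50.0], [50.0], [50.0]],
}
utterance = [[0.0], [1.0], [1.0], [2.0]]
best, margin, reached = top_two_margin(utterance, templates)
print(best, margin, reached)
```

In this toy case the utterance warps onto the "go" template exactly, so the margin to the next template is large and the 20-or-more condition holds. What the post-processing stage does with that flag is not specified in the abstract and is left out here.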
