http://dx.doi.org/10.9718/JBER.2006.27.3.101

HMM-Based Automatic Speech Recognition using EMG Signal

Lee Ki-Seung (Department of Electronic Engineering, Konkuk University)

Publication Information

Journal of Biomedical Engineering Research / v.27, no.3, 2006, pp. 101-109

Abstract

It is known that there is a strong relationship between the human voice and the movements of the articulatory facial muscles. In this paper, we exploit this relationship to implement an automatic speech recognition scheme that uses only surface electromyogram (EMG) signals. The EMG signals were acquired from three articulatory facial muscles. As a preliminary study, 10 Korean digits were used as the recognition vocabulary. Various feature parameters, including filter bank outputs, linear predictive coefficients, and cepstrum coefficients, were evaluated to find the appropriate parameters for EMG-based speech recognition. The sequence of EMG signals for each word was modelled within a hidden Markov model (HMM) framework. A continuous word recognition approach was investigated in this work; hence, the model for each word was obtained by concatenating subword models, and embedded re-estimation techniques were employed in the training stage. The findings indicate that such a system may be able to recognize speech with an accuracy of up to 90% when mel-filter bank outputs are used as the feature parameters.
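
The concatenation of subword models into a word model can be illustrated with a minimal Python sketch. It is not the implementation used in the paper: the left-to-right topology, the number of states, the self-loop probabilities, and the helper names `left_to_right` and `concatenate` are all assumptions chosen for illustration, and the emission distributions and the embedded re-estimation step are omitted.

```python
# Illustrative sketch only: topology, state counts and probabilities are
# assumed values, not taken from the paper.

def left_to_right(n_states, stay=0.6):
    """Return (A, exit_prob) for a simple left-to-right subword HMM.

    A[i][j] is the within-model transition probability. exit_prob[i] is the
    probability of leaving the model from state i (non-zero only for the
    last state in this topology).
    """
    move = 1.0 - stay
    A = [[0.0] * n_states for _ in range(n_states)]
    exit_prob = [0.0] * n_states
    for i in range(n_states):
        A[i][i] = stay                 # self-loop
        if i + 1 < n_states:
            A[i][i + 1] = move         # advance to the next state
        else:
            exit_prob[i] = move        # leave the subword model
    return A, exit_prob


def concatenate(models):
    """Chain subword HMMs into one word-level HMM.

    The exit probability of each subword is redirected to the first state
    of the next subword; the last subword keeps its exit probability.
    """
    total = sum(len(A) for A, _ in models)
    word_A = [[0.0] * total for _ in range(total)]
    word_exit = [0.0] * total
    offset = 0
    for k, (A, exit_prob) in enumerate(models):
        n = len(A)
        for i in range(n):
            for j in range(n):
                word_A[offset + i][offset + j] = A[i][j]
            if k + 1 < len(models):
                # Entry state of the next subword sits at column offset + n.
                word_A[offset + i][offset + n] = exit_prob[i]
            else:
                word_exit[offset + i] = exit_prob[i]
        offset += n
    return word_A, word_exit


# Hypothetical word built from two 3-state subword models.
subwords = [left_to_right(3), left_to_right(3, stay=0.5)]
word_A, word_exit = concatenate(subwords)
```

In this sketch every row of `word_A`, together with the matching entry of `word_exit`, still sums to one, so the concatenated model remains a valid HMM whose parameters could then be re-estimated jointly across all training utterances.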

Keywords

surface EMG signals; automatic speech recognition; hidden Markov model
