[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.4218/etrij.10.0209.0242

Maximum Likelihood Training and Adaptation of Embedded Speech Recognizers for Mobile Environments

Cho, Young-Kyu (Speech Information Processing Laboratory, Department of Computer and Communication Engineering, Korea University)
Yook, Dong-Suk (Speech Information Processing Laboratory, Department of Computer and Communication Engineering, Korea University)

Publication Information

ETRI Journal, vol. 32, no. 1, 2010, pp. 160-162

Abstract

For the acoustic models of embedded speech recognition systems, hidden Markov models (HMMs) are usually quantized and the original full space distributions are represented by combinations of a few quantized distribution prototypes. We propose a maximum likelihood objective function to train the quantized distribution prototypes. The experimental results show that the new training algorithm and the link structure adaptation scheme for the quantized HMMs reduce the word recognition error rate by 20.0%.
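The idea of representing a full-space distribution by a combination of a few quantized prototypes can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussians whose feature vector is split into subspaces (streams), with each Gaussian storing only one prototype index per stream (its link structure). The stream layout, codebook values, and function names below are hypothetical and chosen for illustration; this shows only how a quantized Gaussian is scored, not the training or adaptation algorithm proposed in the paper.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def quantized_log_likelihood(x, streams, codebooks, links):
    """Score one quantized Gaussian as a sum of linked prototype scores.

    streams   : list of (start, end) index pairs splitting x into subspaces
    codebooks : codebooks[k] is a list of (mean, var) prototypes for stream k
    links     : links[k] is the prototype index this Gaussian uses in stream k
    """
    total = 0.0
    for k, (start, end) in enumerate(streams):
        mean, var = codebooks[k][links[k]]
        total += log_gauss_diag(x[start:end], mean, var)
    return total

# Hypothetical toy setup: a 4-dimensional feature split into two 2-dimensional streams,
# each stream with a codebook of two prototypes.
streams = [(0, 2), (2, 4)]
codebooks = [
    [([0.0, 0.0], [1.0, 1.0]), ([1.0, 1.0], [1.0, 1.0])],
    [([0.0, 0.0], [1.0, 1.0]), ([2.0, 2.0], [0.5, 0.5])],
]
links = [1, 0]  # this Gaussian uses prototype 1 in stream 0 and prototype 0 in stream 1
x = [1.0, 1.0, 0.0, 0.0]
score = quantized_log_likelihood(x, streams, codebooks, links)
```

Because the covariances are diagonal, the sum over streams equals the log density of the full-space Gaussian assembled from the linked prototypes. Many Gaussians can therefore share the same small set of prototype scores, and changing a Gaussian's behavior amounts to changing the entries of `links`.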

Keywords

Embedded speech recognition; maximum likelihood distribution clustering (MLDC); quantized HMM

Citations & Related Records

Times cited by Web of Science: 1
Times cited by Scopus: 1

References

1	J.J. Odell, The Use of Context in Large Vocabulary Speech Recognition, PhD Thesis, Cambridge University, 1995.
2	K. Wong and B. Mak, "MAP Adaptation with Subspace Regression Classes and Tying," IEEE Proc. Int. Conf. Acoust., Speech, Signal Process., vol. 3, 2000, pp. 1551-1554.
3	K. Wong and B. Mak, "Rapid Speaker Adaptation Using MLLR and Subspace Regression Classes," Proc. European Conf. Speech Commun. Technol., vol. 2, 2001, pp. 1253-1256.
4	M. Zhang and J. Xu, "An Investigation into Subspace Rapid Speaker Adaptation," IEEE Proc. Int. Symp. Chinese Spoken Language Process., 2004, pp. 273-276.
5	D. Kim and D. Yook, "Linear Spectral Transformation for Robust Speech Recognition Using Maximum Mutual Information," IEEE Signal Process. Lett., vol. 14, 2007, pp. 496-499.
6	Y. Cho and D. Yook, "Rapid Adaptation Using Linear Spectral Transformation for Embedded Speech Recognizers," IET Electron. Lett., vol. 44, no. 17, 2008, pp. 1040-1042.
7	C.J. Leggetter and P.C. Woodland, "Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models," Computer Speech and Language, vol. 9, 1995, pp. 171-185.
8	L.E. Baum, "An Inequality and Associated Maximization Technique in Statistical Estimation of Probabilistic Functions of Markov Processes," Inequalities, vol. 3, 1972, pp. 1-8.
9	E. Bocchieri and B. Mak, "Subspace Distribution Clustering Hidden Markov Model," IEEE Trans. Speech Audio Process., vol. 9, 2001, pp. 264-276.