
English Phoneme Recognition using Segmental-Feature HMM  

Yun, Young-Sun (Division of Information and Communication / Multimedia Engineering, Hannam University)
Abstract
In this paper, we propose a new acoustic model for characterizing segmental features, together with an algorithm based on the general framework of hidden Markov models (HMMs), in order to compensate for the weaknesses of the HMM assumptions. Because a single-frame feature cannot represent the temporal dynamics of speech signals effectively, the segmental features are represented as a trajectory of the observed vector sequence, modeled by a polynomial regression function. To apply the segmental features to pattern classification, we adopt the segmental HMM (SHMM), which is known as an effective method for representing the trend of speech signals. The SHMM separates the observation probability of a given state into extra- and intra-segmental variations, which capture long-term and short-term variability, respectively. To take segmental characteristics into account in the acoustic model, we present the segmental-feature HMM (SFHMM), obtained by modifying the SHMM. The SFHMM represents the external and internal variations as the observation probability of the trajectory in a given state and the trajectory estimation error for the given segment, respectively. We conducted several experiments on the TIMIT database to establish the effectiveness of the proposed method and the characteristics of the segmental features. From the experimental results, we conclude that the proposed method is valuable for its flexible and informative feature representation and its performance improvement, even though its number of parameters is greater than that of the conventional HMM.
Keywords
Speech Recognition; Segmental Feature; Segmental HMM; Segmental-feature HMM
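
The trajectory representation mentioned in the abstract can be illustrated with a minimal sketch: each feature dimension of a segment is fitted by a polynomial in normalized time, and the mean squared residual serves as a trajectory estimation error. This is not the paper's implementation. The function names (`solve`, `fit_trajectory`, `trajectory_error`), the normalization of time to [0, 1], the ordinary least-squares fit, and the toy segment are all assumptions made for illustration.

```python
def solve(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan
    elimination with partial pivoting."""
    k = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def fit_trajectory(segment, order=2):
    """Least-squares polynomial fit of every feature dimension.

    segment: list of frames, each frame a list of feature values.
    Returns coeffs[r][d], the coefficient of t**r for dimension d.
    Time is normalized to [0, 1] (an illustrative assumption).
    """
    n_frames, n_dims = len(segment), len(segment[0])
    if n_frames < 2 or n_frames <= order:
        raise ValueError("segment too short for this polynomial order")
    times = [n / (n_frames - 1) for n in range(n_frames)]
    design = [[t ** r for r in range(order + 1)] for t in times]
    k = order + 1
    # Normal equations: (Z^T Z) B = Z^T C
    gram = [[sum(row[i] * row[j] for row in design) for j in range(k)]
            for i in range(k)]
    rhs = [[sum(design[n][i] * segment[n][d] for n in range(n_frames))
            for d in range(n_dims)] for i in range(k)]
    return solve(gram, rhs)


def trajectory_error(segment, coeffs):
    """Mean squared residual per frame between the segment and its fit."""
    n_frames = len(segment)
    total = 0.0
    for n, frame in enumerate(segment):
        t = n / (n_frames - 1)
        for d, value in enumerate(frame):
            fitted = sum(coeffs[r][d] * t ** r for r in range(len(coeffs)))
            total += (value - fitted) ** 2
    return total / n_frames


# Toy 5-frame, 2-dimensional segment (hypothetical data, not from TIMIT).
segment = [[1 + 2 * (n / 4), (n / 4) ** 2] for n in range(5)]
coeffs = fit_trajectory(segment, order=2)
error = trajectory_error(segment, coeffs)
```

An order-0 fit reduces to the per-dimension mean of the segment, which is one way to see what higher-order trajectories add over a static, frame-independent description.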