
Performance Improvement of Continuous Digits Speech Recognition using the Transformed Successive State Splitting and Demi-syllable pair  

Kim Dong-Ok (한국정보통신기능대학)
Park No-Jin (서정대학)
Abstract
This paper describes optimizations of the language model and the acoustic model that improve speech recognition of Korean unit digits. For the language model, recognition errors are reduced by analyzing the grammatical features of Korean unit digits and building the FSN nodes from disyllables. The acoustic model uses demi-syllable pairs to reduce the recognition errors caused by inaccurate segmentation of phones and syllables, which arises from monosyllables, short pronunciations, and articulation. We use the k-means clustering algorithm with transformed successive state splitting at the feature level to model the features of the recognition units efficiently. In the experiments, the proposed language model raised the recognition rate by $10.5\%$, the demi-syllable pair acoustic model raised it by $12.5\%$, and transformed successive state splitting improved it by $1.5\%$.
Keywords
Demi-syllable pair; Transformed successive state splitting
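
The abstract names k-means clustering as the tool used to group feature vectors, but gives no detail on how it is coupled with the transformed successive state splitting. The following is only a minimal sketch of plain k-means on toy 2-D vectors, not the authors' procedure. The function names, the toy `frames` data, and the choice to seed the centroids with the first `k` points are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def kmeans(points, k, iterations=20):
    """Plain k-means. Centroids start at the first k points (an
    illustrative simplification; random initialization is more common)."""
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids, assignments


# Hypothetical 2-D "feature vectors" forming two well-separated groups.
frames = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, labels = kmeans(frames, k=2)
print(centroids, labels)
```

In a recognizer the vectors would be acoustic features of much higher dimension, and the number of clusters would be tied to the model topology. Neither is specified in the abstract.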