[KSCI] Korea Science Citation Index Service

Performance Improvement of Continuous Digits Speech Recognition using the Transformed Successive State Splitting and Demi-syllable pair

Kim Dong-Ok (Korea Information and Communication Polytechnic College)
Park No-Jin (Seojeong College)

Publication Information

Journal of the Korea Institute of Information and Communication Engineering, v.9, no.8, 2005, pp. 1625-1631

Abstract

This paper describes optimizations of the language model and the acoustic model that improve speech recognition of Korean unit digits. To reduce recognition errors in the language model, the grammatical features of Korean unit digits are analyzed and the model is built as a finite-state network (FSN) whose nodes are disyllables. The acoustic model uses demi-syllable pairs to reduce the recognition errors caused by inaccurate segmentation of phones and syllables, which arises because the digits are monosyllabic, briefly pronounced, and affected by articulation. To model the features of the recognition units efficiently, we use the k-means clustering algorithm together with transformed successive state splitting at the feature level. In the experiments, the proposed language model raised the recognition rate by $10.5\%$, the acoustic model with demi-syllable pairs raised it by $12.5\%$, and transformed successive state splitting improved it by $1.5\%$.
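The abstract names k-means clustering at the feature level but does not say how it is combined with transformed successive state splitting. As a rough illustration of the clustering step only, the following is a minimal sketch of generic k-means on toy two-dimensional vectors. The function name `kmeans`, the made-up `frames` data, and the choice of squared Euclidean distance are assumptions for illustration, not the authors' implementation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Generic k-means on equal-length feature vectors (lists of floats)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments

# Hypothetical toy "feature vectors" in two well-separated groups (not real speech data).
frames = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centroids, labels = kmeans(frames, k=2)
```

In a recognizer the inputs would be acoustic feature vectors for each recognition unit rather than toy points; how the resulting clusters feed the state-splitting procedure is not specified in the abstract.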

Keywords

Demi-syllable pair; Transformed successive state splitting

