[KSCI] Korea Science Citation Index Service

A Study on the Optimization of State Tying Acoustic Models using Mixture Gaussian Clustering

Ann, Tae-Ock (Division of Computer, Howon Univ.)

Publication Information

Journal of the Institute of Electronics Engineers of Korea SP, v.42, no.6, 2005, pp. 167-176

Abstract

This paper describes how a decision-tree-based state tying model, one of the acoustic models used for speech recognition, can be optimized by reducing the number of mixture Gaussians in its output probability distributions. State tying uses a finite set of questions, which can encode phonological knowledge, together with a likelihood-based decision criterion. The recognition rate can be improved by increasing the number of mixture Gaussians in the output probability distribution. In this paper, we start from the number of mixture Gaussians that gives the highest recognition rate and reduce it by clustering the Gaussians. The Bhattacharyya and Euclidean distances are used as the distance measures for clustering. The pair of Gaussians with the smallest distance is merged into a new Gaussian, whose mean and variance are derived from the parameters of the two Gaussians it replaces. Experiments were performed on the STOCKNAME (1,680) database. The results show that the proposed method with the Bhattacharyya distance maintains a recognition rate of $97.2\%$ while reducing the ratio of the number of mixture Gaussians by $1.0\%$, and that the method with the Euclidean distance maintains a recognition rate of $96.9\%$ while reducing the ratio of the number of mixture Gaussians by $1.0\%$. Both methods can therefore be used to optimize the state tying model.
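
The clustering step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions because the abstract does not specify them: the Gaussians are taken to have diagonal covariances, the Euclidean distance is computed between mean vectors only, and the merged Gaussian is obtained by weight-proportional moment matching. The names `bhattacharyya`, `euclidean`, `merge`, and `reduce_mixture` are hypothetical.

```python
import math
from itertools import combinations


def bhattacharyya(g1, g2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    dist = 0.0
    for m1, v1, m2, v2 in zip(g1["mean"], g1["var"], g2["mean"], g2["var"]):
        v = 0.5 * (v1 + v2)  # averaged variance in this dimension
        dist += 0.125 * (m1 - m2) ** 2 / v
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))
    return dist


def euclidean(g1, g2):
    """Euclidean distance between mean vectors (assumption: variances ignored)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1["mean"], g2["mean"])))


def merge(g1, g2):
    """Merge two components by moment matching (assumed merge rule)."""
    w = g1["weight"] + g2["weight"]
    a, b = g1["weight"] / w, g2["weight"] / w
    mean = [a * m1 + b * m2 for m1, m2 in zip(g1["mean"], g2["mean"])]
    # Second moment of the two-component mixture minus the squared merged mean.
    var = [
        a * (v1 + m1 ** 2) + b * (v2 + m2 ** 2) - m ** 2
        for m1, v1, m2, v2, m in zip(g1["mean"], g1["var"], g2["mean"], g2["var"], mean)
    ]
    return {"weight": w, "mean": mean, "var": var}


def reduce_mixture(components, target, distance=bhattacharyya):
    """Repeatedly merge the closest pair until `target` components remain."""
    comps = list(components)
    while len(comps) > max(target, 1):
        i, j = min(
            combinations(range(len(comps)), 2),
            key=lambda p: distance(comps[p[0]], comps[p[1]]),
        )
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps


# Toy one-dimensional mixture: the first two components are close together.
mixture = [
    {"weight": 0.4, "mean": [0.0], "var": [1.0]},
    {"weight": 0.4, "mean": [0.2], "var": [1.0]},
    {"weight": 0.2, "mean": [5.0], "var": [1.0]},
]
reduced = reduce_mixture(mixture, target=2)
```

Passing `distance=euclidean` to `reduce_mixture` switches the measure, mirroring the two variants compared in the abstract. Moment matching keeps the total weight, mean, and variance of the merged pair unchanged, which is one plausible reading of "derived from the parameters of the Gaussians it replaces".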

Keywords

Speech Recognition; Signal Processing; Acoustic Model; State Tying; Clustering

