Improved Decision Tree-Based State Tying In Continuous Speech Recognition System

The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 18, Issue 6 / Pages 49-56 / 1999 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

The Acoustical Society of Korea (한국음향학회)

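Xintian Wu (Computer Science and Engineering, Oregon Graduate Institute of Science and Technology);
Chaojun Liu (Computer Science and Engineering, Oregon Graduate Institute of Science and Technology);
김동화 (Department of Information and Communication Engineering, Miryang National University);
김형순 (Department of Electronics Engineering, Pusan National University);
김영호 (Department of Computer Science, Pusan National University)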

Published : 1999.08.01

Abstract

In many HMM-based continuous speech recognition systems, decision tree-based state tying is used not only to improve the robustness and accuracy of context-dependent acoustic modeling but also to synthesize models that were unseen in training. To construct the phonetic decision tree, the standard method performs one-level pruning using only single-Gaussian triphone models. This paper proposes two novel approaches, the two-level decision tree and the multi-mixture decision tree, to obtain better performance through more accurate acoustic modeling. The two-level decision tree performs two-level pruning for state tying and mixture weight tying. With the second level, the tied states can have different mixture weights depending on the similarity of their phonetic contexts. In the second approach, the phonetic decision tree continues to be updated along the training sequence of mixture splitting and re-estimation, so multi-mixture Gaussian models as well as single-Gaussian models are used to construct the multi-mixture decision tree. Continuous speech recognition experiments using these approaches on BN-96 and WSJ5k data showed a reduction in word error rate compared with the standard decision tree-based system, given a similar number of tied states.
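As a rough illustration of the standard single-Gaussian baseline mentioned above (not the two-level or multi-mixture methods this paper proposes), the sketch below scores a phonetic question by the log-likelihood gain of splitting a pool of triphone states, each summarized by an occupancy count and a diagonal Gaussian. The function names, the `(left, right)` context encoding, and the toy statistics are all assumptions made for the example; a real system would also apply occupancy and gain thresholds when pruning.

```python
import math

def cluster_log_likelihood(stats):
    """Approximate log likelihood of modeling a pool of states with one
    shared diagonal Gaussian.

    stats: list of (occupancy, mean_vector, variance_vector), one entry per
    single-Gaussian state. All values here are hypothetical toy data.
    """
    total = sum(n for n, _, _ in stats)
    dim = len(stats[0][1])
    log_det = 0.0
    for d in range(dim):
        # Pooled mean and variance from per-state sufficient statistics.
        mean = sum(n * m[d] for n, m, _ in stats) / total
        second = sum(n * (v[d] + m[d] ** 2) for n, m, v in stats) / total
        log_det += math.log(second - mean ** 2)
    return -0.5 * total * (dim * (1.0 + math.log(2.0 * math.pi)) + log_det)

def split_gain(stats_by_context, question):
    """Log-likelihood gain from splitting the pool with a yes/no question."""
    yes = [s for ctx, s in stats_by_context.items() if question(ctx)]
    no = [s for ctx, s in stats_by_context.items() if not question(ctx)]
    if not yes or not no:
        return 0.0  # the question does not actually split the pool
    parent = list(stats_by_context.values())
    return (cluster_log_likelihood(yes) + cluster_log_likelihood(no)
            - cluster_log_likelihood(parent))

# Toy pool: one state of phone "a" seen in four left contexts (1-D features).
stats = {
    ("m", "a"): (10, [0.0], [1.0]),
    ("n", "a"): (10, [0.2], [1.0]),
    ("s", "a"): (10, [3.0], [1.0]),
    ("z", "a"): (10, [3.2], [1.0]),
}
gain_nasal = split_gain(stats, lambda ctx: ctx[0] in {"m", "n"})  # "left is nasal?"
gain_m = split_gain(stats, lambda ctx: ctx[0] == "m")             # "left is m?"
best = "left-nasal" if gain_nasal > gain_m else "left-m"
print(best, gain_nasal, gain_m)
```

In this toy pool the nasal question separates the two groups of similar means, so it yields the larger gain and would be chosen at this node. Because only counts, means, and variances are needed, the tree can be grown without revisiting the training data, which is why the baseline restricts itself to single-Gaussian models.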
