[KSCI] Korea Science Citation Index Service

A Study on Phoneme Likely Units to Improve the Performance of Context-dependent Acoustic Models in Speech Recognition

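Young-Chun Lim [임영춘] (자모바 Co., Ltd.)
Se-Jin Oh [오세진] (KVN Project Division, Korea Astronomy and Space Science Institute)
Kwang-Dong Kim [김광동] (KVN Project Division, Korea Astronomy and Space Science Institute)
Duk-Gyoo Roh [노덕규] (KVN Project Division, Korea Astronomy and Space Science Institute)
Min-Gyu Song [송민규] (KVN Project Division, Korea Astronomy and Space Science Institute)
Hyun-Yeol Chung [정현열] (School of Electronic and Information Engineering, Yeungnam University)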

Publication Information

The Journal of the Acoustical Society of Korea / v.22, no.5, 2003, pp. 388-402

Abstract

In this paper, we carried out word, 4-continuous-digit, continuous speech, and task-independent word recognition experiments to verify the effectiveness of re-defined phoneme-likely units (PLUs) for phonetic-decision-tree-based HM-Net (Hidden Markov Network) context-dependent (CD) acoustic modeling of Korean. In the 48-PLU set, the phonemes /ㅂ/, /ㄷ/, /ㄱ/ are separated into initial sound, medial vowel, and final consonant, and the consonants /ㄹ/, /ㅈ/, /ㅎ/ are likewise separated into initial sound and final consonant, according to their position in the syllable, word, and sentence. We therefore re-define a set of 39 PLUs by unifying each of these separated initial-sound, medial-vowel, and final-consonant units of the 48 PLUs into a single phoneme, so that CD acoustic models can be constructed effectively. In word recognition experiments with context-independent (CI) acoustic models, the 48 PLUs gave on average 7.06% higher recognition accuracy than the 39 PLUs. In speaker-independent word recognition experiments with CD acoustic models, however, the 39 PLUs gave on average 0.61% higher recognition accuracy than the 48 PLUs. In 4-continuous-digit recognition experiments, which involve liaison phenomena, the 39 PLUs also gave on average 6.55% higher recognition accuracy. In continuous speech recognition experiments, the 39 PLUs likewise gave on average 15.08% higher recognition accuracy than the 48 PLUs. Finally, in task-independent word recognition experiments, both the 48 and the 39 PLUs showed lower recognition accuracy because of unknown contextual factors, but the 39 PLUs still gave on average 1.17% higher recognition accuracy than the 48 PLUs. These experiments verify the effectiveness of the re-defined 39 PLUs, compared with the 48 PLUs, for constructing CD acoustic models.
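The reduction from 48 to 39 units described in the abstract can be checked with a small sketch. The unit labels below (for example `ㅂ_initial`) are hypothetical, since the abstract does not give the actual names used in the paper, and the helper functions are illustrative only. The sketch assumes that each of /ㅂ/, /ㄷ/, /ㄱ/ has three position-dependent units and each of /ㄹ/, /ㅈ/, /ㅎ/ has two, as the abstract states.

```python
# Position-dependent units of the 48-PLU set that the abstract says are merged.
# Labels are hypothetical; only the grouping comes from the abstract.
POSITION_VARIANTS = {
    "ㅂ": ["ㅂ_initial", "ㅂ_medial", "ㅂ_final"],
    "ㄷ": ["ㄷ_initial", "ㄷ_medial", "ㄷ_final"],
    "ㄱ": ["ㄱ_initial", "ㄱ_medial", "ㄱ_final"],
    "ㄹ": ["ㄹ_initial", "ㄹ_final"],
    "ㅈ": ["ㅈ_initial", "ㅈ_final"],
    "ㅎ": ["ㅎ_initial", "ㅎ_final"],
}


def build_merge_map(variants):
    """Map every position-dependent label to its unified phoneme."""
    return {label: base for base, labels in variants.items() for label in labels}


def merge_units(labels, merge_map):
    """Relabel a 48-PLU transcription with the unified units; others pass through."""
    return [merge_map.get(label, label) for label in labels]


MERGE_MAP = build_merge_map(POSITION_VARIANTS)

# 15 position-dependent units collapse into 6 phonemes, removing 9 units.
units_removed = len(MERGE_MAP) - len(POSITION_VARIANTS)
remaining_units = 48 - units_removed

if __name__ == "__main__":
    print(units_removed, remaining_units)
    print(merge_units(["ㄱ_initial", "ㅏ", "ㄱ_final"], MERGE_MAP))
```

Under these assumptions the merge removes nine units, which is consistent with the 48 and 39 unit counts given in the abstract.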

Keywords

48, 39 phoneme likely units; HM-Net (Hidden Markov Network); PDT-SSS algorithm; Context-dependent acoustic models

Citations & Related Records

Times Cited By KSCI: 5

Reference

1	Allophone clustering for continuous speech recognition / [ K.Lee;S.Hayamizu;H.Hon;C.Huang;J.Swartz;R.Weide ] / Proc. of ICASSP '90
2	A study on speech recognition using an HM-Net topology design algorithm based on decision tree state clustering / [ 오세진;황철준;김범국;정호열;정현열 ] / The Journal of the Acoustical Society of Korea
3	/ [ L.Rabiner;B.H.Juang ] / Fundamentals of Speech Recognition
4	A successive state splitting algorithm for efficient allophone modeling / [ J.Takami;S.Sagayama ] / Proc. of ICASSP '92
5	/ [ 中川聖一 ] / Speech Recognition by Probabilistic Models (確率モデルによる音聲認識)
6	New state clustering of hidden Markov network with Korean phonological rules for speech recognition / [ S.J.Oh;C.J.Hwang;B.K.Kim;H.Y.Chung;A.Ito ] / IEEE 4th Workshop on Multimedia Signal Processing
7	HMM topology design using maximum likelihood successive state splitting / [ M.Ostendorf;H.Singer ] / Computer Speech and Language
8	Korean word recognition using diphone-unit hidden Markov models / [ 박현상;은종관;박용규;권오욱 ] / The Journal of the Acoustical Society of Korea
9	Development and evaluation of an address input system with speech recognition / [ 김득수;황철준;정현열 ] / The Journal of the Acoustical Society of Korea
10	A new HMnet construction algorithm requiring no contextual factors / [ M.Suzuki;S.Makino;A.Ito;H.Aso;H.Shimodaira ] / IEICE Trans. Info. & Syst.
11	A study on Korean syllables as recognition units / [ 김유진;김회린;정재호 ] / The Journal of the Acoustical Society of Korea
12	A study on using phoneme recognition rates to define the basic phoneme set / [ 김호경;구명완 ] / Proc. of the 15th Workshop on Speech Communication and Signal Processing
13	/ [ 이호영 ] / Korean Phonetics (국어음성학)
14	/ [ 배주채 ] / Korean Phonology (국어음운론)
15	/ [ S.Young;D.Kershaw;J.Odell;D.Ollason;V.Valtchev;P.Woodland ] / The HTK Book
16	Improvement and performance analysis of acoustic models for a variable-vocabulary speech recognizer / [ 이승훈;김회린 ] / The Journal of the Acoustical Society of Korea
