A Study on Phoneme Likely Units to Improve the Performance of Context-dependent Acoustic Models in Speech Recognition

The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 22, Issue 5 / Pages 388-402 / 2003 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

The Acoustical Society of Korea (한국음향학회)

음성인식에서 문맥의존 음향모델의 성능향상을 위한 유사음소단위에 관한 연구

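임영춘 (Jamoba Co., Ltd. (주식회사 자모바)) ;
오세진 (Korea Astronomy and Space Science Institute, KVN Project Division) ;
김광동 (Korea Astronomy and Space Science Institute, KVN Project Division) ;
노덕규 (Korea Astronomy and Space Science Institute, KVN Project Division) ;
송민규 (Korea Astronomy and Space Science Institute, KVN Project Division) ;
정현열 (Yeungnam University, School of Electronics and Information Engineering)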

Published : 2003.07.01

Abstract

In this paper, we carried out word, 4-continuous-digit, continuous speech, and task-independent word recognition experiments to verify the effectiveness of re-defined phoneme-likely units (PLUs) for phonetic decision tree based HM-Net (Hidden Markov Network) context-dependent (CD) acoustic modeling in Korean. In the 48 PLUs, the phonemes /ㅂ/, /ㄷ/, /ㄱ/ are separated into initial sound, medial vowel, and final consonant units, and the consonants /ㄹ/, /ㅈ/, /ㅎ/ are separated into initial sound and final consonant units, according to their position in the syllable, word, and sentence. We therefore re-define 39 PLUs by unifying, for each of these phonemes, the separate initial sound, medial vowel, and final consonant units of the 48 PLUs into a single unit, so that CD acoustic models can be constructed effectively. In word recognition experiments with context-independent (CI) acoustic models, the 48 PLUs gave on average 7.06% higher recognition accuracy than the 39 PLUs. In speaker-independent word recognition experiments with CD acoustic models, however, the 39 PLUs gave on average 0.61% higher recognition accuracy than the 48 PLUs. In 4-continuous-digit recognition experiments involving liaison phenomena, the 39 PLUs also gave on average 6.55% higher recognition accuracy, and in continuous speech recognition experiments they gave on average 15.08% higher recognition accuracy than the 48 PLUs. Finally, in task-independent word recognition experiments, which involve unknown contextual factors, both the 48 and the 39 PLUs showed lower recognition accuracy, but the 39 PLUs still gave on average 1.17% higher recognition accuracy than the 48 PLUs. These experiments verify the effectiveness of the re-defined 39 PLUs over the 48 PLUs for constructing CD acoustic models.
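The unit counts in the abstract can be checked with a small sketch. It assumes that each of /ㅂ/, /ㄷ/, /ㄱ/ has three position-specific units in the 48-PLU set, that each of /ㄹ/, /ㅈ/, /ㅎ/ has two, and that all other units are unchanged. The label format (`ㅂ_initial` and so on) and the function names are hypothetical, since the abstract does not give the paper's actual unit labels.

```python
# Illustrative sketch only: the position tags and label format are assumptions,
# not the unit names used in the paper.
THREE_WAY = ["ㅂ", "ㄷ", "ㄱ"]  # split into initial / medial / final units
TWO_WAY = ["ㄹ", "ㅈ", "ㅎ"]    # split into initial / final units


def build_merge_map():
    """Map each position-specific unit to its unified phoneme label."""
    merge = {}
    for phoneme in THREE_WAY:
        for position in ("initial", "medial", "final"):
            merge[f"{phoneme}_{position}"] = phoneme
    for phoneme in TWO_WAY:
        for position in ("initial", "final"):
            merge[f"{phoneme}_{position}"] = phoneme
    return merge


def relabel(units, merge):
    """Rewrite a 48-PLU label sequence using the unified labels."""
    return [merge.get(unit, unit) for unit in units]


merge_map = build_merge_map()
split_units = len(merge_map)                  # position-specific units removed
merged_units = len(set(merge_map.values()))   # unified units added
remaining = 48 - split_units + merged_units   # size of the re-defined set
print(split_units, merged_units, remaining)
```

Under these assumptions, 15 position-specific units collapse into 6 unified ones, which is consistent with the reduction from 48 to 39 PLUs stated in the abstract.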

