Stochastic Pronunciation Lexicon Modeling for Large Vocabulary Continuous Speech Recognition

Yun, Seong-Jin; Choi, Hwan-Jin; Oh, Yung-Hwan

The Journal of the Acoustical Society of Korea

Volume 16, Issue 2 / Pages 49-57 / 1997 / 1225-4428 (pISSN) / 2287-3775 (eISSN)

The Acoustical Society of Korea

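Korean title: 확률 발음사전을 이용한 대어휘 연속음성인식 (Large Vocabulary Continuous Speech Recognition Using a Stochastic Pronunciation Lexicon)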


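Yun, Seong-Jin (Department of Computer Science, KAIST);
Choi, Hwan-Jin (Department of Computer Science, KAIST);
Oh, Yung-Hwan (Department of Computer Science, KAIST)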

Published: 1997.04.01


Abstract

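This paper proposes a stochastic pronunciation lexicon model for large vocabulary continuous speech recognition. Like an HMM, the stochastic pronunciation lexicon consists of a Markov chain of phone-unit states, and each phone state is represented by a probability distribution over phones. The lexicon is generated automatically from speech data and phone models through phone-level segmentation and recognition. The proposed lexicon effectively represented both within-word and between-word variation, and by reflecting the characteristics of the recognition models and the recognizer it further improved the performance of the overall recognition system. In 3000-word continuous speech recognition experiments, using the stochastic pronunciation lexicon reduced the word error rate by 23.6% and the sentence error rate by about 10% compared with a recognition system that uses standard pronunciation transcriptions.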
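As a rough illustration of the automatic generation step described above, the sketch below estimates a per-state phone distribution for one word by counting which phone a recognizer produced at each baseform position. The pre-aligned sample format, the function name `estimate_state_distributions`, the additive smoothing constant, and the toy data are all assumptions made for illustration; the abstract does not specify the authors' actual estimation procedure.

```python
from collections import Counter


def estimate_state_distributions(aligned_samples, phone_set, smoothing=0.01):
    """Estimate P(phone | state) for one word from aligned recognizer output.

    aligned_samples: list of samples, each a list holding one recognized phone
    per baseform state (a hypothetical, already-aligned format).
    phone_set: every phone that may appear in a distribution.
    smoothing: additive constant so unseen phones keep a small probability
    (an assumption, not taken from the paper).
    """
    if not aligned_samples:
        raise ValueError("need at least one aligned sample")
    n_states = len(aligned_samples[0])

    # Count how often each phone was recognized at each state position.
    counts = [Counter() for _ in range(n_states)]
    for sample in aligned_samples:
        if len(sample) != n_states:
            raise ValueError("all samples must align to the same states")
        for state, phone in enumerate(sample):
            counts[state][phone] += 1

    # Turn the counts into smoothed, normalized distributions.
    distributions = []
    for state_counts in counts:
        total = sum(state_counts.values()) + smoothing * len(phone_set)
        distributions.append(
            {p: (state_counts[p] + smoothing) / total for p in phone_set}
        )
    return distributions


# Toy data: four utterances of one three-state word, with phone-level variation.
phones = ["k", "g", "u", "ng", "n"]
samples = [
    ["k", "u", "ng"],
    ["g", "u", "ng"],
    ["k", "u", "n"],
    ["k", "u", "ng"],
]
dists = estimate_state_distributions(samples, phones)
```

In this toy case the first state assigns most of its mass to "k" but keeps a sizable probability for "g", which is how a single lexicon entry can cover more than one surface pronunciation.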

In this paper, we propose a stochastic pronunciation lexicon model for large vocabulary continuous speech recognition systems. The stochastic lexicon can be regarded as an HMM: a stochastic finite state automaton consisting of a Markov chain of subword states, in which each subword state in the baseform has a probability distribution over subword units. In this method, an acoustic representation of a word is derived automatically from sample sentence utterances and subword unit models. In addition, the stochastic lexicon is further optimized to the subword models and the recognizer. In experiments on 3000-word continuous speech recognition, the proposed method reduces the word error rate by 23.6% and the sentence error rate by 10% compared to methods based on standard phonetic representations of words.
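To make the "Markov chain of subword states" concrete, the following minimal sketch scores a recognized phone sequence against one word entry using the forward algorithm over a left-to-right chain. The topology (self-loop plus advance to the next state), the fixed `p_stay` value, the probability floor, and the toy distributions are assumptions chosen for illustration and are not taken from the paper.

```python
def phone_sequence_likelihood(states, observed, p_stay=0.2, floor=1e-6):
    """Forward-algorithm likelihood of a phone sequence under one word model.

    states: list of dicts, one per subword state, mapping phone -> probability.
    observed: list of recognized phones.
    p_stay: self-loop probability; 1 - p_stay advances to the next state
    (a hypothetical topology, assumed here for simplicity).
    The path must start in the first state and end in the last state.
    """
    n = len(states)
    if n == 0 or not observed:
        return 0.0

    # Initialization: only the first state can emit the first phone.
    alpha = [0.0] * n
    alpha[0] = states[0].get(observed[0], floor)

    # Recursion: each state is reached by staying or by advancing one state.
    for phone in observed[1:]:
        new_alpha = [0.0] * n
        for j in range(n):
            stay = alpha[j] * p_stay
            move = alpha[j - 1] * (1.0 - p_stay) if j > 0 else 0.0
            new_alpha[j] = (stay + move) * states[j].get(phone, floor)
        alpha = new_alpha

    # Termination: the sequence must finish in the final state.
    return alpha[-1]


# Toy three-state word entry with pronunciation variation in states 0 and 2.
word_model = [
    {"k": 0.75, "g": 0.25},
    {"u": 1.0},
    {"ng": 0.75, "n": 0.25},
]
canonical = phone_sequence_likelihood(word_model, ["k", "u", "ng"])
variant = phone_sequence_likelihood(word_model, ["g", "u", "n"])
too_short = phone_sequence_likelihood(word_model, ["k", "u"])
```

Under these assumptions the variant pronunciation receives a lower but non-zero score than the canonical one, whereas a lexicon with a single fixed transcription would have no entry for it at all.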
