Large Vocabulary Continuous Speech Recognition Based on Language Model Network


한국음향학회지 (The Journal of the Acoustical Society of Korea)

Vol. 21, No. 6 / pp. 543-551 / 2002 / pISSN 1225-4428 / eISSN 2287-3775

한국음향학회 (The Acoustical Society of Korea)

언어 모델 네트워크에 기반한 대어휘 연속 음성 인식


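안동훈 (Speech and Language Processing Laboratory, Department of Computer Science, Sogang University) ;
정민화 (Speech and Language Processing Laboratory, Department of Computer Science, Sogang University)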

Published: 2002.08.01


Abstract

This paper presents a search method that performs real-time continuous speech recognition on a large vocabulary of about 20,000 words. The basic search is a single pass of the Viterbi decoding algorithm implemented by token passing. A language model network is introduced so that various language models can be represented as one consistent search space, and the search space is rebuilt dynamically from the tokens that survive the pruning step. To simplify post-processing, the Viterbi algorithm is modified to output a word graph and an N-best sentence list. The decoder was tested on a 20,000-word database and evaluated in terms of recognition accuracy and real-time factor (RTF).
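The one-pass, token-passing Viterbi search with pruning described in the abstract can be illustrated with a minimal sketch. This is not the authors' decoder: it runs on a toy two-state HMM rather than a language model network, it keeps only the single best path (no word graph or N-best list), and every name in it (`Token`, `prune`, `token_passing`, the `beam` parameter, and the toy model itself) is a hypothetical choice made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    score: float  # accumulated log-probability of the best path into a state
    path: tuple   # state history carried along by the token


def prune(tokens, beam):
    """Beam pruning: keep tokens whose score is within `beam` of the best one."""
    if not tokens:
        return tokens
    best = max(tok.score for tok in tokens.values())
    return {s: tok for s, tok in tokens.items() if tok.score >= best - beam}


def token_passing(obs, log_init, log_trans, log_emit, beam=10.0):
    """One-pass Viterbi decoding by token passing (hypothetical helper)."""
    if not obs:
        raise ValueError("empty observation sequence")
    # Place one token in every state that can emit the first observation.
    tokens = {}
    for state, lp in log_init.items():
        score = lp + log_emit[state].get(obs[0], -math.inf)
        if score > -math.inf:
            tokens[state] = Token(score, (state,))
    tokens = prune(tokens, beam)
    for o in obs[1:]:
        new_tokens = {}
        # Only surviving tokens are propagated, so the active search
        # space at each frame is defined by what pruning left alive.
        for state, tok in tokens.items():
            for nxt, lp in log_trans.get(state, {}).items():
                score = tok.score + lp + log_emit[nxt].get(o, -math.inf)
                if score == -math.inf:
                    continue
                held = new_tokens.get(nxt)
                # Viterbi recombination: each state keeps its best token.
                if held is None or score > held.score:
                    new_tokens[nxt] = Token(score, tok.path + (nxt,))
        tokens = prune(new_tokens, beam)
    if not tokens:
        raise ValueError("no token survived decoding")
    return max(tokens.values(), key=lambda tok: tok.score)


def logs(probs):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in probs.items()}


# Toy model (assumed purely for illustration, unrelated to the paper's data).
LOG_INIT = logs({"Healthy": 0.6, "Fever": 0.4})
LOG_TRANS = {
    "Healthy": logs({"Healthy": 0.7, "Fever": 0.3}),
    "Fever": logs({"Healthy": 0.4, "Fever": 0.6}),
}
LOG_EMIT = {
    "Healthy": logs({"normal": 0.5, "cold": 0.4, "dizzy": 0.1}),
    "Fever": logs({"normal": 0.1, "cold": 0.3, "dizzy": 0.6}),
}
OBS = ("normal", "cold", "dizzy")

best = token_passing(OBS, LOG_INIT, LOG_TRANS, LOG_EMIT, beam=10.0)
print(best.path, math.exp(best.score))
```

Narrowing `beam` discards more tokens per frame, which reduces work at the risk of pruning the path that would eventually have won; a real decoder would tune this trade-off against accuracy and RTF.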


