Fast computation of Observation Probability for Speaker-Independent Real-Time Speech Recognition

Park Dong-Chul; Ahn Ju-Won

The Journal of Korean Institute of Communications and Information Sciences (한국통신학회논문지)

Volume 30, Issue 9C / Pages 907-912 / 2005 / 1226-4717 (pISSN) / 2287-3880 (eISSN)

The Korean Institute of Communications and Information Sciences (한국통신학회)

실시간 화자독립 음성인식을 위한 고속 확률계산

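Park Dong-Chul (Intelligent Computing Research Lab, Department of Information Engineering, Myongji University);
Ahn Ju-Won (Intelligent Computing Research Lab, Department of Information Engineering, Myongji University)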

Published : 2005.09.01

Abstract

An efficient method for calculating observation probabilities in a CDHMM (Continuous Density Hidden Markov Model) is proposed in this paper. The proposed algorithm, called FCOP (Fast Computation of Observation Probability), approximates the observation probabilities in a CDHMM by eliminating insignificant PDFs (Probability Density Functions), which reduces the computational load. When applied to a speech recognition system, the FCOP algorithm reduces instruction cycles by $20\%$ to $30\%$ and increases recognition speed by about $30\%$ while minimizing the loss in recognition rate. When implemented on a practical cellular phone, the FCOP algorithm increases recognition speed by about $30\%$ at the cost of a $0.2\%$ loss in recognition rate.

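This paper proposes a new algorithm for improving the recognition speed of speech recognition systems implemented in hardware. The proposed Fast Computation of Observation Probability (FCOP) algorithm approximates the observation probability expression: it minimizes the amount of computation by effectively eliminating some of the probability distribution functions assigned to each state in a CDHMM. In experiments on speech recognition in a real hardware environment, the algorithm reduced instruction cycles by $20\%$ to $32\%$ and improved recognition speed by about $30\%$ compared with the conventional method, while keeping the loss in recognition rate to a minimum. When the proposed algorithm was deployed on an actual cellular phone with limited resources and both recognition speed and recognition rate were measured, recognition speed increased by more than $30\%$ while the loss in recognition rate stayed at or below $0.2\%$.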
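The abstract does not state how FCOP decides which PDFs are insignificant, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes a state whose observation probability is a mixture of diagonal-covariance Gaussians and uses a simple log-domain beam as the pruning rule. The function names and the `beam` parameter are hypothetical. The sketch still evaluates every component before pruning, so it illustrates the size of the approximation error rather than the cycle savings reported above.

```python
import math


def component_log_scores(x, weights, means, variances):
    """Return log(weight * N(x; mean, diag(var))) for each mixture component."""
    scores = []
    for w, mu, var in zip(weights, means, variances):
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        dist = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mu, var))
        scores.append(math.log(w) + log_norm - 0.5 * dist)
    return scores


def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def exact_log_observation_prob(x, weights, means, variances):
    """Exact log observation probability: sum over all mixture components."""
    return log_sum_exp(component_log_scores(x, weights, means, variances))


def pruned_log_observation_prob(x, weights, means, variances, beam=5.0):
    """Approximate log observation probability.

    Assumed pruning rule (not taken from the paper): a component is treated
    as insignificant when its log score is more than `beam` below the best
    component's log score. Returns the approximation and the number of
    components kept.
    """
    scores = component_log_scores(x, weights, means, variances)
    best = max(scores)
    kept = [s for s in scores if s >= best - beam]
    return log_sum_exp(kept), len(kept)


if __name__ == "__main__":
    # Toy 2-dimensional, 3-component mixture (made-up numbers).
    x = [0.1, -0.2]
    weights = [0.5, 0.3, 0.2]
    means = [[0.0, 0.0], [3.0, 3.0], [-4.0, 4.0]]
    variances = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

    exact = exact_log_observation_prob(x, weights, means, variances)
    approx, kept = pruned_log_observation_prob(x, weights, means, variances)
    print("exact:", exact, "approx:", approx, "components kept:", kept)
```

Because pruning only drops non-negative terms from the sum, the approximation can never exceed the exact value. A practical implementation would need a cheaper test that avoids evaluating the discarded components in the first place.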
