• Title/Summary/Keyword: 반연속 HMM

Search Result 15, Processing Time 0.018 seconds

Noise Processing for Speech Recognition in the Telephone Line (음성 인식을 위한 전화망에서의 잡음처리)

  • 전원석;신원호;양태영;김원구;윤대희
    • The Journal of the Acoustical Society of Korea
    • /
    • v.17 no.1
    • /
    • pp.4-8
    • /
    • 1998
  • 본 논문에서는 다양한 전화선 채널을 통하여 수집된 음성 데이터에 포함된 잡음 및 채널 왜곡을 제거하여 음성인식 시스템의 성능을 향상시키는 방법에 관하여 연구하였다. 전 화선을 통과한 음성에 포함된 채널 잡음 및 왜곡을 제거하는 방법으로는 음성신호를 보상하 는 방법으로 CMS(Cepstral Mean Subtraction), SBR(Signal Bias Removal)과 SM(Stochastic Matching)의 성능을 비교 평가하였다. 잡음제거 방식의 성능을 평가를 위하 여 음소 단위의 반연속 HMM을 이용한 화자독립 단독음 인식을 수행하였다. 인식 실험 결 과, 멜 켑스트럼을 사용한 경우에 CMS가 가장 우수한 성능을 내었고 다음으로 SM과 SBR 순으로 나타났다. 또한 특징벡터를 주변 잡음에 강인하게 하는 가중함수(RPS, BPL)를 사용 한 켑스트럼 계수와 잡음제거 방식을 함께 사용한 경우에 인식 성능이 더욱 향상되었다.

  • PDF

On Codebook Design to Improve Speaker Adaptation (음성 인식 시스템의 화자 적응 성능 향상을 위한 코드북 설계)

  • Yang, Tae-Young;Shin, Won-Ho;Kim, Weon-Goo;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.2
    • /
    • pp.5-11
    • /
    • 1996
  • The purpose of this paper is to propose a method improving the performance of a semi-continuous hidden Markov model(SCHMM) speaker adaptation system which uses Bayesian Parameter reestimation approach. The performance of Bayesian speaker adaptation could be degraded in case that the features of a new speaker are severely different from those of a reference codebook. The excessive codewords of the reference codebook still remain after adaptation proess. which cause confusion in recognition process. To solve such problems, the proposed method uses formant information which is extracted from the cepstral coefficients of the reference codebook and adaptation data. The reference codebook is adapted to represent the formant distribution of a new speaker and it is used for Bayesian speaker adaptation as an initial codebook. The proposed method provides accurate correspondence between reference codebook and adaptation data. It was observed that the excessive codewords were not selected during recognition process. The experimental results showed that the proposed method improved the recognition performance.

  • PDF

Implementation of a Speech Recognition System for a Car Navigation System (차량 항법용 음성인식 시스템의 구현)

  • Lee, Tae-Han;Yang, Tae-Young;Park, Sang-Taick;Lee, Chung-Yong;Youn, Dae-Hee;Cha, Il-Hwan
    • Journal of the Korean Institute of Telematics and Electronics S
    • /
    • v.36S no.9
    • /
    • pp.103-112
    • /
    • 1999
  • In this paper, a speaker-independent isolated world recognition system for a car navigation system is implemented using a general digital signal processor. This paper presents a method combining SNR normalization with RAS as a noise processing method. The semi-continuous hidden markov model is adopted and TMS320C31 is used in implementing the real-time system. Recognition word set is composed of 69 command words for a car navigation system. Experimental results showed that the recognition performance has a maximum of 93.62% in case of a combination of SNR normalization and spectral subtraction, and the performance improvement rate of the system is 3.69%, Presented noise processing method showed good speech recognition performance in 5dB SNR in car environment.

  • PDF

The Study on the Speaker Adaptation Using Speaker Characteristics of Phoneme (음소에 따른 화자특성을 이용한 화자적응방법에 관한 연구)

  • 채나영;황영수
    • Proceedings of the Korea Institute of Convergence Signal Processing
    • /
    • 2003.06a
    • /
    • pp.6-9
    • /
    • 2003
  • In this paper, we studied on the difference of speaker adaptation according to the phoneme classification for Korean Speech recognition. In order to study of speech adaptation according to the weight of difference of phoneme as recognition unit, we used SCHMM as recognition system. And Speaker adaptation method used in this paper was MAPE(Maximum A Posteriori Probability Estimation), Linear Spectral Estimation. In order to evaluate the performance of these methods, we used 10 Korean isolated numbers as the experimental data. It is possible for the first and the second methods to be carried out unsupervised learning and used in on-line system. And the first method was shown performance improvement over the second method, and hybrid adaptation showed the better recognition results than those which performed each method. And the result of Speaker adaptation using the variable weight according to the phoneme had better than the result using fixed weight.

  • PDF

Speech Recognition Using Noise Robust Features and Spectral Subtraction (잡음에 강한 특징 벡터 및 스펙트럼 차감법을 이용한 음성 인식)

  • Shin, Won-Ho;Yang, Tae-Young;Kim, Weon-Goo;Youn, Dae-Hee;Seo, Young-Joo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.5
    • /
    • pp.38-43
    • /
    • 1996
  • This paper compares the recognition performances of feature vectors known to be robust to the environmental noise. And, the speech subtraction technique is combined with the noise robust feature to get more performance enhancement. The experiments using SMC(Short time Modified Coherence) analysis, root cepstral analysis, LDA(Linear Discriminant Analysis), PLP(Perceptual Linear Prediction), RASTA(RelAtive SpecTrAl) processing are carried out. An isolated word recognition system is composed using semi-continuous HMM. Noisy environment experiments usign two types of noises:exhibition hall, computer room are carried out at 0, 10, 20dB SNRs. The experimental result shows that SMC and root based mel cepstrum(root_mel cepstrum) show 9.86% and 12.68% recognition enhancement at 10dB in compare to the LPCC(Linear Prediction Cepstral Coefficient). And when combined with spectral subtraction, mel cepstrum and root_mel cepstrum show 16.7% and 8.4% enhanced recognition rate of 94.91% and 94.28% at 10dB.

  • PDF