• Title/Summary/Keyword: MLLR 적응화

Search Result 4, Processing Time 0.018 seconds

A Study on Regression Class Generation of MLLR Adaptation Using State Level Sharing (상태레벨 공유를 이용한 MLLR 적응화의 회귀클래스 생성에 관한 연구)

  • 오세진;성우창;김광동;노덕규;송민규;정현열
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.8
    • /
    • pp.727-739
    • /
    • 2003
  • In this paper, we propose a generation method of regression classes for adaptation in the HM-Net (Hidden Markov Network) system. The MLLR (Maximum Likelihood Linear Regression) adaptation approach is applied to the HM-Net speech recognition system for expressing the characteristics of speaker effectively and the use of HM-Net in various tasks. For the state level sharing, the context domain state splitting of PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, which has the contextual and time domain clustering, is adopted. In each state of contextual domain, the desired phoneme classes are determined by splitting the context information (classes) including target speaker's speech data. The number of adaptation parameters, such as means and variances, is autonomously controlled by contextual domain state splitting of PDT-SSS, depending on the context information and the amount of adaptation utterances from a new speaker. The experiments are performed to verify the effectiveness of the proposed method on the KLE (The center for Korean Language Engineering) 452 data and YNU (Yeungnam Dniv) 200 data. The experimental results show that the accuracies of phone, word, and sentence recognition system increased by 34∼37%, 9%, and 20%, respectively, Compared with performance according to the length of adaptation utterances, the performance are also significantly improved even in short adaptation utterances. Therefore, we can argue that the proposed regression class method is well applied to HM-Net speech recognition system employing MLLR speaker adaptation.

A Study on Performance Evaluation of HM-Net Adaptation System Using the State Level Sharing (상태레벨 공유를 이용한 HM-Net 적응화 시스템의 성능평가에 관한 연구)

  • 오세진;김광동;노덕규;황철준;김범국;김광수;성우창;정현열
    • Proceedings of the IEEK Conference
    • /
    • 2003.11a
    • /
    • pp.397-400
    • /
    • 2003
  • 본 연구에서는 KM-Net(Hidden Markov Network)을 다양한 태스크에의 적용과 화자의 특성을 효과적으로 나타내기 위해 HM-Net 음성인식 시스템에 MLLR(Maximum Likelihood Linear Regression) 적응방법을 도입하였으며, HM-Net 학습 알고리즘을 개량하여 회귀클래스 생성방법을 제안한다. 제안방법은 PDT-SSS(Phonetic Decision Tree-based Successive State Splitting) 알고리즘의 문맥방향 상태분할에 의한 상태레벨 공유를 이용한 방법으로 새로운 화자로부터 문맥정보와 적응화 데이터의 발성 양에 의존하여 결정된 많은 적응 파라미터들을(평균, 분산) 자유롭게 제어할 수 있게 된다. 제안방법의 유효성을 확인하기 위해 국어공학센터(KLE) 452 음성 데이터와 항공편 예약관련 연속음성을 대상으로 인식실험을 수행한 결과, 전체적으로 음소인식의 경우 평균 34-37%, 단어인식의 경우 평균 9%, 연속음성인식의 경우 평균 7-8%의 인식성능 향상을 각각 보였다. 또한 적응화 데이터의 양에 따른 인식성능 비교에서, 제안방법을 적용한 인식 시스템이 적응 데이터의 양이 적은 경우에도 향상된 인식률을 보였으며. 잡음을 부가한 음성에 대한 적응화 실험에서도 향상된 인식성능을 보여 MLLR 적응방법의 특성을 만족하였다. 따라서 MLLR 적응방법을 도입한 HM-Net 음성인식 시스템에 제안한 회귀클래스 생성방법이 유효함을 확인한 수 있었다.

  • PDF

Factored MLLR Adaptation for HMM-Based Speech Synthesis in Naval-IT Fusion Technology (인자화된 최대 공산선형회귀 적응기법을 적용한 해양IT융합기술을 위한 HMM기반 음성합성 시스템)

  • Sung, June Sig;Hong, Doo Hwa;Jeong, Min A;Lee, Yeonwoo;Lee, Seong Ro;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38C no.2
    • /
    • pp.213-218
    • /
    • 2013
  • One of the most popular approaches to parameter adaptation in hidden Markov model (HMM) based systems is the maximum likelihood linear regression (MLLR) technique. In our previous study, we proposed factored MLLR (FMLLR) where each MLLR parameter is defined as a function of a control vector. We presented a method to train the FMLLR parameters based on a general framework of the expectation-maximization (EM) algorithm. Using the proposed algorithm, supplementary information which cannot be included in the models is effectively reflected in the adaptation process. In this paper, we apply the FMLLR algorithm to a pitch sequence as well as spectrum parameters. In a series of experiments on artificial generation of expressive speech, we evaluate the performance of the FMLLR technique and also compare with other approaches to parameter adaptation in HMM-based speech synthesis.

Adaptation and Clustering Method for Speaker Identification with Small Training Data (화자적응과 군집화를 이용한 화자식별 시스템의 성능 및 속도 향상)

  • Kim Se-Hyun;Oh Yung-Hwan
    • MALSORI
    • /
    • no.58
    • /
    • pp.83-99
    • /
    • 2006
  • One key factor that hinders the widespread deployment of speaker identification technologies is the requirement of long enrollment utterances to guarantee low error rate during identification. To gain user acceptance of speaker identification technologies, adaptation algorithms that can enroll speakers with short utterances are highly essential. To this end, this paper applies MLLR speaker adaptation for speaker enrollment and compares its performance against other speaker modeling techniques: GMMs and HMM. Also, to speed up the computational procedure of identification, we apply speaker clustering method which uses principal component analysis (PCA) and weighted Euclidean distance as distance measurement. Experimental results show that MLLR adapted modeling method is most effective for short enrollment utterances and that the GMMs performs better when long utterances are available.

  • PDF