Search | Korea Science

A Study on Methods of Speaker Adaptation for Speech Recognition (음성인식을 위한 화자적응화 기법에 관한 연구)

이종연
- Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.309.2-314 / 1998
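This study investigates speaker adaptation techniques for speech recognition. First, we study an interpolation adaptation method that can extend the adaptation effect to syllable categories not included in the adaptation data. Models are adapted with MAPE (maximum a posteriori estimation), which allows adaptation using only a reference model and a small amount of speech data, and the resulting change in the mean vectors of the adapted models is interpolated onto the models that are not covered by the adaptation utterances. Second, after building syllable-unit models, we automate the extraction of syllable units from the target speaker's data using concatenated training and the Viterbi algorithm, and then adapt with MAPE. Experiments were carried out for each method.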
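The interpolation idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the mean-vector shifts are interpolated, so the inverse-distance weighting, the prior weight `tau`, and the function names below are all hypothetical.

```python
import math

def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP estimate of a Gaussian mean: the prior mean counts as tau pseudo-observations."""
    n = len(frames)
    dim = len(prior_mean)
    sums = [sum(f[d] for f in frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]

def interpolate_shift(unseen_mean, seen_means, seen_shifts):
    """Shift for a model without adaptation data, as a weighted average of observed shifts.

    Assumption: weights are inverse Euclidean distances between prior means.
    """
    weights = [1.0 / (math.dist(unseen_mean, m) + 1e-6) for m in seen_means]
    total = sum(weights)
    dim = len(unseen_mean)
    return [sum(w * s[d] for w, s in zip(weights, seen_shifts)) / total
            for d in range(dim)]

# Hypothetical two-dimensional example: "ga" and "na" have adaptation data, "da" has none.
priors = {"ga": [0.0, 0.0], "na": [4.0, 0.0], "da": [2.0, 0.0]}
data = {"ga": [[1.0, 1.0], [1.0, 1.0]], "na": [[5.0, 1.0], [5.0, 1.0]]}

adapted = {k: map_adapt_mean(priors[k], data[k], tau=2.0) for k in data}
shifts = {k: [a - p for a, p in zip(adapted[k], priors[k])] for k in data}
seen = list(data)
da_shift = interpolate_shift(priors["da"],
                             [priors[k] for k in seen],
                             [shifts[k] for k in seen])
adapted["da"] = [p + s for p, s in zip(priors["da"], da_shift)]
```

Models with adaptation data move toward their own observations, while the unseen model inherits a shift from its neighbours in the mean-vector space.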

N-gram Adaptation using Information Retrieval and Dynamic Interpolation Coefficient (정보검색 기법과 동적 보간 계수를 이용한 N-gram 적응)

Choi, Joon-Ki;Oh, Yung-Hwan
- Proceedings of the KSPS conference / 2005.11a / pp.107-112 / 2005
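Language model adaptation for continuous speech recognition merges a baseline language model with an adapted language model built from an adaptation corpus that contains only domain-specific information. In this paper, a language-model-based information retrieval technique is used to build the adaptation corpus from the corpus the recognition system already holds, without any additional data. To merge the adapted language model built from the retrieved adaptation corpus with the baseline language model, we propose segmenting the input speech and finding the optimal dynamic interpolation coefficient for each segment. The proposed adaptation corpus construction and dynamic interpolation coefficients improved Korean broadcast news recognition performance by 3.6% absolute over the baseline language model, and by 13.6% relative over conventional static interpolation coefficients estimated from validation data.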
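A per-segment interpolation coefficient of this kind can be estimated with a standard EM update for a two-component mixture. The sketch below is a generic illustration, not the procedure from the paper: it assumes that, for each segment, the probabilities both models assign to the words of a first-pass hypothesis are already available and strictly positive, and the function names are hypothetical.

```python
import math

def estimate_lambda(p_adapt, p_base, iterations=20, lam=0.5):
    """EM estimate of lam in lam * p_adapt + (1 - lam) * p_base for one segment."""
    for _ in range(iterations):
        # E-step: posterior that each word was generated by the adapted model.
        posts = [lam * a / (lam * a + (1.0 - lam) * b)
                 for a, b in zip(p_adapt, p_base)]
        # M-step: the new weight is the average posterior.
        lam = sum(posts) / len(posts)
    return lam

def log_likelihood(p_adapt, p_base, lam):
    """Log-likelihood of the segment under the interpolated model."""
    return sum(math.log(lam * a + (1.0 - lam) * b)
               for a, b in zip(p_adapt, p_base))

# Hypothetical word probabilities for two segments of one utterance.
seg_in_domain = ([0.20, 0.30, 0.25], [0.05, 0.10, 0.05])   # adapted model fits better
seg_general = ([0.02, 0.01, 0.03], [0.10, 0.20, 0.15])     # baseline model fits better

lam_in_domain = estimate_lambda(*seg_in_domain)
lam_general = estimate_lambda(*seg_general)
```

A static coefficient would apply one value of `lam` to the whole input; estimating it per segment lets the weight follow topic changes within the input.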

A Study on Speaker Adaptation of HMM in a Continuous Speech Recognition System (HMM을 이용한 연속음성인식 시스템의 화자적응화에 관한 연구)

김상범
- Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.100-104 / 1995
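In general, speaker adaptation takes an already trained speaker-independent model as the reference model and performs additional training with a small number of adaptation utterances so that performance approaches that of a speaker-dependent model; it is very important in continuous speech recognition. Speaker adaptation based on ML estimation prepares many training patterns for each category and applies them in a batch during training to estimate and update the model parameters, so whenever speaker data are added, all of the data must be supplied again. This study examines a speaker adaptation method that automatically extracts syllable units from sentence utterances and can adapt each time additional speaker data become available. The method does not cut the sentence utterances apart; it extracts syllable units automatically and adapts on each piece of additional data using maximum a posteriori (MAP) estimation, which makes adaptation possible even with a small amount of data. The speech data are 10 continuous-speech sentences excerpted from newspaper editorials; the data from 6 speakers are used for HMM training, and the data from the remaining 3 speakers are used for adaptation and evaluation. The 6 speakers were used to train a DDCHMM, and the remaining 3 were adapted with the MAP method. As a result, the recognition rate improved by about 32% compared with the rate before adaptation.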
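The contrast between batch ML re-estimation and incremental MAP adaptation can be illustrated with a single Gaussian mean. This is a minimal sketch, assuming a conjugate prior in which the current mean counts as `tau` pseudo-observations; the function name and the numbers are hypothetical, and the syllable segmentation step described in the abstract is not shown.

```python
def map_update(mean, tau, frames):
    """One incremental MAP step: the posterior becomes the prior for the next batch."""
    n = len(frames)
    if n == 0:
        return mean, tau
    dim = len(mean)
    new_mean = [(tau * mean[d] + sum(f[d] for f in frames)) / (tau + n)
                for d in range(dim)]
    return new_mean, tau + n

# Speaker-independent prior (hypothetical values).
mean, tau = [0.0], 2.0

# Adaptation data arrive in two batches; earlier frames are not stored.
mean, tau = map_update(mean, tau, [[1.0], [3.0]])
mean, tau = map_update(mean, tau, [[5.0]])

# For comparison: the same data supplied in a single batch.
batch_mean, batch_tau = map_update([0.0], 2.0, [[1.0], [3.0], [5.0]])
```

Because the updated mean and weight summarize everything seen so far, the sequential result matches the batch result without resupplying earlier data.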

Online Adaptation of Continuous Density Hidden Markov Models Based on Speaker Space Model Evolution (화자공간모델 진화에 근거한 연속밀도 은닉 마코프모델의 온라인 적응)

Kim Dong Kook;Kim Young Joon;Kim Hyun Woo;Kim Nam Soo
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.69-72 / 2002
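This paper proposes a new technique for online adaptation of the continuous density hidden Markov model (CDHMM) based on speaker space model evolution. The speaker space model, which represents a priori knowledge of the training speakers, is effectively described by a latent variable model such as factor analysis (FA) or probabilistic principal component analysis (PPCA). Because the latent variable model describes not only the speaker space model but also a joint prior distribution over the CDHMM parameters, it can be applied directly to maximum a posteriori (MAP) adaptation. To adapt the hyperparameters of the speaker space model and the CDHMM parameters simultaneously and sequentially, we propose an online adaptation technique based on quasi-Bayes (QB) estimation. Speaker adaptation experiments on connected digit recognition show that the proposed technique performs well with small amounts of adaptation data and that performance continues to improve as the amount of data increases.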

Acoustic Model Transformation Method for Speech Recognition Employing Gaussian Mixture Model Adaptation Using Untranscribed Speech Database (미전사 음성 데이터베이스를 이용한 가우시안 혼합 모델 적응 기반의 음성 인식용 음향 모델 변환 기법)

Kim, Wooil
- Journal of the Korea Institute of Information and Communication Engineering / v.19 no.5 / pp.1047-1054 / 2015
This paper presents an acoustic model transformation method that uses an untranscribed speech database to improve speech recognition. In the presented method, an adapted GMM is obtained with a conventional adaptation method, and the most similar Gaussian component is selected from the adapted GMM. The bias vector between the mean vectors of the clean GMM and the adapted GMM is used to update the mean vectors of the HMM. Combined with MAP or MLLR, the presented method (GAMT) improves speech recognition performance in car noise and speech babble conditions compared with MAP or MLLR used alone. The experimental results show that the method makes effective use of an untranscribed speech database for acoustic model adaptation and increases speech recognition accuracy.
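The bias-vector update described above can be sketched roughly as follows. Several details are assumptions rather than facts from the abstract: the adapted GMM is taken to keep the component order of the clean GMM, similarity is measured as Euclidean distance between an HMM mean and the clean GMM means, and the function name is hypothetical.

```python
import math

def transform_hmm_means(hmm_means, clean_gmm_means, adapted_gmm_means):
    """Shift each HMM mean by the bias of its most similar GMM component."""
    transformed = []
    for mu in hmm_means:
        # Assumption: "most similar" means the nearest clean component in Euclidean distance.
        k = min(range(len(clean_gmm_means)),
                key=lambda i: math.dist(mu, clean_gmm_means[i]))
        # Bias between the adapted and clean means of the selected component.
        bias = [a - c for a, c in zip(adapted_gmm_means[k], clean_gmm_means[k])]
        transformed.append([m + b for m, b in zip(mu, bias)])
    return transformed

# Hypothetical two-component GMM before and after adaptation to noisy speech.
clean_gmm = [[1.0, 1.0], [9.0, 9.0]]
adapted_gmm = [[2.0, 1.0], [9.0, 11.0]]
hmm_means = [[0.0, 0.0], [10.0, 10.0]]

noisy_hmm_means = transform_hmm_means(hmm_means, clean_gmm, adapted_gmm)
```

Since the GMM can be adapted without transcriptions, the HMM means are moved toward the noisy condition without any labelled adaptation data.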
https://doi.org/10.6109/jkiice.2015.19.5.1047

Performance Enhancement for Speaker Verification Using Incremental Robust Adaptation in GMM (가우시안 혼합모델에서 점진적 강인적응을 통한 화자확인 성능개선)

Kim, Eun-Young;Seo, Chang-Woo;Lim, Yong-Hwan;Jeon, Seong-Chae
- The Journal of the Acoustical Society of Korea / v.28 no.3 / pp.268-272 / 2009
In this paper, we propose a Gaussian mixture model (GMM) based incremental robust adaptation method with a forgetting factor for speaker verification. Speaker recognition systems adapt the speaker model with small amounts of data in order to obtain good performance. However, conventional adaptation is vulnerable to outliers caused by irregular utterance variations and the presence of noise, which results in an inaccurate speaker model. In addition, the rate at which new data are adapted into the model decreases as time goes by. The proposed algorithm uses incremental robust adaptation to reduce the effect of outliers and a forgetting factor to maintain the adaptation rate of new data in the GMM-based speaker model. Incremental robust adaptation enrolls a small amount of data in the speaker model and then adapts the model to the new data presented for testing. Experimental results on a data set gathered over seven months show that the proposed algorithm is robust against outliers and maintains the adaptation rate of new data.
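The interplay between a forgetting factor and outlier down-weighting can be illustrated on a single Gaussian mean. This is a simplified sketch, not the algorithm from the paper: the Huber-style weight, the `cutoff` value, and the function name are assumptions chosen for illustration.

```python
import math

def robust_incremental_update(mean, count, frames, forget=0.9, cutoff=3.0):
    """Update a running mean with a forgetting factor and outlier down-weighting."""
    kept = forget * count                 # old evidence is discounted
    weighted_sum = [kept * m for m in mean]
    total = kept
    for x in frames:
        dist = math.dist(x, mean)
        # Assumption: Huber-style weight, full weight near the mean, reduced far away.
        w = 1.0 if dist <= cutoff else cutoff / dist
        total += w
        weighted_sum = [s + w * v for s, v in zip(weighted_sum, x)]
    if total == 0.0:
        return mean, count
    return [s / total for s in weighted_sum], total

# Hypothetical speaker mean built from 10 earlier frames.
no_forget, _ = robust_incremental_update([0.0], 10.0, [[1.0]], forget=1.0)
with_forget, _ = robust_incremental_update([0.0], 10.0, [[1.0]], forget=0.5)
outlier, _ = robust_incremental_update([0.0], 10.0, [[100.0]], forget=1.0)
```

Without forgetting, the accumulated count grows and each new frame moves the mean less and less; discounting the old count keeps the step size up, while the weight limits how far a single outlying frame can pull the model.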
https://doi.org/10.7776/ASK.2009.28.3.268

Modeling of The Room Transfer Function using Subband Adaptive Digital Filter (Subband 적응 디지털 필터를 이용한 실내전달함수 모델링)

정호문
- Proceedings of the Acoustical Society of Korea Conference / 1996.10a / pp.42-45 / 1996
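When modeling the transfer function of a room with a long reverberation time, an adaptive filter based on the conventional full-band MA model requires a high filter order and a long adaptation time. To reduce the filter order and improve the convergence characteristics, this paper proposes an adaptive digital filtering method based on a subband MA model, in which the input and output signals are each divided into several frequency bands and adaptation is carried out separately in each band. Computer simulations demonstrated the effectiveness of adaptive digital filtering with the subband MA model.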
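For orientation, the full-band baseline that this abstract contrasts against can be sketched as an adaptive FIR (MA) filter identifying an unknown impulse response. The choice of the NLMS update, the step size, and the three-tap "room" response are assumptions for illustration; a subband version would run one such filter per band after an analysis filter bank, which is not shown here.

```python
import random

def fir(x, h):
    """Filter x with the FIR (MA) coefficients h."""
    buf = [0.0] * len(h)
    out = []
    for xn in x:
        buf = [xn] + buf[:-1]             # newest sample first
        out.append(sum(hi * bi for hi, bi in zip(h, buf)))
    return out

def nlms_identify(x, d, order, step=0.5, eps=1e-8):
    """Estimate FIR coefficients from input x and desired output d with NLMS."""
    w = [0.0] * order
    buf = [0.0] * order
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        err = dn - sum(wi * bi for wi, bi in zip(w, buf))
        norm = sum(b * b for b in buf) + eps
        w = [wi + step * err * bi / norm for wi, bi in zip(w, buf)]
    return w

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
room = [0.5, -0.3, 0.1]                   # hypothetical, very short impulse response
w = nlms_identify(x, fir(x, room), order=len(room))
```

A real room with long reverberation needs thousands of taps at full band, which is the cost the subband approach aims to reduce by adapting shorter filters on band-limited signals.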

Adaptive GPC Using a Delay Prediction Neural Network (지연 예측신경망을 이용한 적응 GPC)

정희태
- Journal of the Korea Institute of Information and Communication Engineering / v.7 no.7 / pp.1527-1532 / 2003
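We propose an adaptive GPC method that uses a delay prediction neural network to effectively control nonlinear plants whose nonlinearity and parameter variations make them difficult to control with conventional GPC. In the proposed method, linear parameters are obtained either by estimating the linear parameters of the plant or from an approximate model, and a linear model is built from them; the error between the actual system output and the linear model is expressed as the output of a neural network, and the adaptive GPC algorithm is derived from this expression. The delay prediction neural network is trained to predict the plant output used by the adaptive GPC. Constructing the controller in this way removes the burden of estimating nonlinear parameters and computing output predictions that arises when an adaptive GPC controller is built from linear parameters alone.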

Adaptive PID Controller for Nonlinear Systems using Fuzzy Model (퍼지 모델을 이용한 비선형 시스템의 적응 PID 제어기)

Kim, Jong-Hua;Lee, Won-Chang;Kang, Geun-Taek
- Journal of the Korean Institute of Intelligent Systems / v.13 no.1 / pp.85-90 / 2003
This paper presents an adaptive PID control scheme for nonlinear systems. A TSK (Takagi-Sugeno-Kang) fuzzy model is used to estimate the error of the control input, and the parameters of the PID controller are adapted using this error. The parameters of the TSK fuzzy model are also adapted to the plant. The proposed algorithm allows the design of an adaptive PID controller that adapts to the uncertainty of the nonlinear plant and to changes in its parameters. The usefulness of the proposed algorithm is verified by several simulations.
https://doi.org/10.5391/JKIIS.2003.13.1.085

Model adaptation employing DNN-based estimation of noise corruption function for noise-robust speech recognition (잡음 환경 음성 인식을 위한 심층 신경망 기반의 잡음 오염 함수 예측을 통한 음향 모델 적응 기법)

Yoon, Ki-mu;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.38 no.1 / pp.47-50 / 2019
This paper proposes an acoustic model adaptation method for effective speech recognition in noisy environments. In the proposed algorithm, the noise corruption function is estimated with a DNN (Deep Neural Network), and the function is applied to the model parameter estimation. The experimental results using the Aurora 2.0 framework and database demonstrate that the proposed model adaptation method is more effective in known and unknown noisy environments than the conventional methods. In particular, the experiments in unknown environments show a 15.87% relative improvement in average WER (Word Error Rate).
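Because results in this listing are quoted both as absolute and as relative improvements, a small helper makes the distinction explicit. The WER values below are hypothetical and are not taken from any of the papers.

```python
def absolute_improvement(baseline_wer, new_wer):
    """Reduction in WER, in percentage points."""
    return baseline_wer - new_wer

def relative_improvement(baseline_wer, new_wer):
    """Reduction in WER as a percentage of the baseline WER."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer

# Hypothetical example: WER falls from 20.0% to 17.0%.
abs_gain = absolute_improvement(20.0, 17.0)   # 3 percentage points
rel_gain = relative_improvement(20.0, 17.0)   # 15% of the baseline error
```

The same change therefore reads as 3% absolute or 15% relative, which is why the basis of comparison needs to be stated alongside the figure.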
https://doi.org/10.7776/ASK.2019.38.1.047

Search Result 1,733, Processing Time 0.032 seconds
