Search | Korea Science

A New Speaker Adaptation Technique using Maximum Model Distance

Lee, Man-Hyung;Hong, Suh-Il
Institute of Control, Robotics and Systems: Conference Proceedings / 2001.10a / pp.99.1-99 / 2001
This paper presents an adaptation approach based on the maximum model distance (MMD) method. The method shares the same framework as that used for training speech recognizers with abundant training data. The MMD method can adapt all the models, with or without adaptation data. If a large amount of adaptation data is available, the adapted models gradually approximate the speaker-dependent ones. The approach is evaluated on a phoneme recognition task using the TIMIT corpus. In the speaker adaptation experiments, up to 65.55% phoneme error reduction is achieved. The MMD method reduces phoneme error by 16.91% even when only one adaptation utterance is used.

Performance Comparison and Duration Model Improvement of Speaker Adaptation Methods in HMM-based Korean Speech Synthesis (HMM 기반 한국어 음성합성에서의 화자적응 방식 성능비교 및 지속시간 모델 개선)

Lee, Hea-Min;Kim, Hyung-Soon
Phonetics and Speech Sciences / v.4 no.3 / pp.111-117 / 2012
In this paper, we compare the performance of several speaker adaptation methods for an HMM-based Korean speech synthesis system with small amounts of adaptation data. According to objective and subjective evaluations, a hybrid of constrained structural maximum a posteriori linear regression (CSMAPLR) and maximum a posteriori (MAP) adaptation performs better than the other methods when only five minutes of adaptation data are available for the target speaker. During the objective evaluation, we find that the duration models are not adapted to the target speaker as well as the spectral envelope and pitch models are. To alleviate this problem, we propose a duration rectification method and a duration interpolation method. Both the objective and subjective evaluations show that incorporating the two proposed methods into the conventional speaker adaptation method improves duration model adaptation.
https://doi.org/10.13064/KSSS.2012.4.3.111

Performance Improvement of Fast Speaker Adaptation Based on Dimensional Eigenvoice and Adaptation Mode Selection (차원별 Eigenvoice와 화자적응 모드 선택에 기반한 고속화자적응 성능 향상)

송화전;이윤근;김형순
The Journal of the Acoustical Society of Korea / v.22 no.1 / pp.48-53 / 2003
The eigenvoice method is known to be suitable for fast speaker adaptation, but it shows little additional improvement as the amount of adaptation data increases. To deal with this problem, we propose a modified method that estimates the weights of the eigenvoices separately in each feature vector dimension. We also propose an adaptation mode selection scheme in which the best-performing of several adaptation methods is selected according to the amount of adaptation data. We used the POW DB to construct the speaker-independent model and the eigenvoices; from the PBW 452 DB, 1 to 50 utterances were used for adaptation and the remaining 400 utterances for evaluation. As the amount of adaptation data increased, the proposed dimensional eigenvoice method outperformed both the conventional eigenvoice method and MLLR. Compared with the conventional eigenvoice method, adaptation mode selection between the eigenvoice and dimensional eigenvoice methods reduced the word error rate by up to 26%.

A Study on Dynamic Adaptation of Soft Keyboard Using Adjacent-Typo (인접-오타를 이용한 소프트 키보드의 동적 적응 연구)

Ko, Seokhoon
Journal of Korea Multimedia Society / v.21 no.11 / pp.1263-1270 / 2018
Dynamic adaptation is an effective technique for enhancing usability by personalizing the soft keyboard layout using the user's key input information. In this paper, we propose a dynamic adaptation method for a keyboard that automatically extracts typos from key input information and uses adjacent-typo information, classified through the relationships between typos. This technique does not limit the range of adaptation to the inside of a key but extends it to neighboring keys, so that the adaptation effect is achieved over a wide range and at high speed; the proposed method therefore improves keyboard usability with a small number of inputs. In experiments, the proposed method showed a 25% increase in usability compared with the existing method, and usability improved by up to 33% when it was used together with the existing method.
https://doi.org/10.9717/kmms.2018.21.11.1263

Large Scale Voice Dialling using Speaker Adaptation (화자 적응을 이용한 대용량 음성 다이얼링)

Kim, Weon-Goo
Journal of Institute of Control, Robotics and Systems / v.16 no.4 / pp.335-338 / 2010
A new method that uses speaker adaptation to improve the performance of a large-scale voice dialling system is presented. Since an SI (speaker-independent) speech recognition system with phoneme HMMs uses only the phoneme string of the input sentence, the storage space can be reduced greatly. However, the performance of the system is worse than that of a speaker-dependent system because of the mismatch between the input utterance and the SI models. A new method that iteratively estimates the phonetic string and the adaptation vectors is presented to reduce the mismatch between the training utterances and a set of SI models using speaker adaptation techniques. For speaker adaptation, stochastic matching methods are used to estimate the adaptation vectors. Experiments performed over an actual telephone line show that the proposed method performs better than the conventional method with the SI phonetic recognizer.
https://doi.org/10.5302/J.ICROS.2010.16.4.335

A New Speaker Adaptation Technique using Maximum Model Distance

Tahk, Min-Jea
Institute of Control, Robotics and Systems: Conference Proceedings / 2001.10a / pp.154.2-154 / 2001
This paper presents an adaptation approach based on the maximum model distance (MMD) method. The method shares the same framework as that used for training speech recognizers with abundant training data. The MMD method can adapt all the models, with or without adaptation data. If a large amount of adaptation data is available, the adapted models gradually approximate the speaker-dependent ones. The approach is evaluated on a phoneme recognition task using the TIMIT corpus. In the speaker adaptation experiments, up to 65.55% phoneme error reduction is achieved. The MMD method reduces phoneme error by 16.91% even when ...

A STUDY ON THE ADAPTATION OF THE CAST POST (주조 포오스트의 적합도에 관한 연구)

Park, Dong-Kwan;Chang, Ik-Tai
The Journal of Korean Academy of Prosthodontics / v.24 no.1 / pp.55-65 / 1986
An in vitro study was performed to evaluate the adaptation of custom direct, custom indirect, and prefabricated post systems on 15 extracted upper central incisors. The 15 specimens were prepared and divided equally into 3 groups by random sampling. For each group, 5 cast posts were made with the custom direct, custom indirect, or prefabricated post core method. The gap between the inner wall of the dentin and the outer wall of the cast post was measured on electron microphotographic prints at x500 magnification. The results were as follows: 1. No significant difference in adaptation at the cervical portion was found between the methods. 2. The prefabricated post core method had poor adaptation compared with the other methods. 3. An even distribution of adaptation between portions was found with the custom direct method. 4. The prefabricated post core method showed a remarkable difference in adaptation between portions.

Motion Adaptation Control of 3-D Human Character (3차원 캐릭터의 동작적응 제어 기법)

김상수;국태용
Institute of Control, Robotics and Systems: Conference Proceedings / 2000.10a / pp.383-383 / 2000
In this paper, motion adaptation control is applied to the animation of a 3-D human character. The method includes parameterization of joint motion data, motion adaptation based on the body ratio of the character, dynamic adaptation using a genetic algorithm, etc. The feasibility of the motion adaptation technique is verified by applying it to motion control and adaptation of a 3-D human character.

L1-norm Regularization for State Vector Adaptation of Subspace Gaussian Mixture Model (L1-norm regularization을 통한 SGMM의 state vector 적응)

Goo, Jahyun;Kim, Younggwan;Kim, Hoirin
Phonetics and Speech Sciences / v.7 no.3 / pp.131-138 / 2015
In this paper, we propose L1-norm regularization for state vector adaptation of the subspace Gaussian mixture model (SGMM). When designing a speaker adaptation system with a GMM-HMM acoustic model, MAP is the most typical technique to consider. However, the MAP adaptation procedure requires a large number of parameters to be updated simultaneously. Sparse adaptation, such as L1-norm regularization or sparse MAP, can be adopted to cope with this, but its performance is not as good as that of MAP adaptation. SGMM, however, does not suffer from sparse adaptation as much as GMM-HMM does, because each Gaussian mean vector in SGMM is defined as a weighted sum of basis vectors, which is much more robust to fluctuations in the parameters. Since only a few adaptation techniques are appropriate for SGMM, our proposed method could be powerful, especially when the amount of adaptation data is limited. Experimental results show that the error reduction rate of the proposed method is better than that of MAP adaptation of SGMM, even with a small amount of adaptation data.
https://doi.org/10.13064/KSSS.2015.7.3.131
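
As a rough illustration of how an L1-norm penalty makes an adaptation update sparse, the sketch below applies soft-thresholding to the offset between a prior parameter vector and a data-driven estimate. It is not the SGMM objective from the abstract above: it assumes a simple per-coordinate quadratic data term, 0.5 * (d - v)^2 + lam * |d|, whose minimiser is the soft-threshold of v. The names `soft_threshold`, `sparse_update`, and `lam` are hypothetical.

```python
def soft_threshold(values, lam):
    """Shrink each value toward zero by lam; values in [-lam, lam] become 0."""
    shrunk = []
    for v in values:
        if v > lam:
            shrunk.append(v - lam)
        elif v < -lam:
            shrunk.append(v + lam)
        else:
            shrunk.append(0.0)
    return shrunk


def sparse_update(prior, estimate, lam):
    """Move prior toward estimate, keeping only offsets larger than lam.

    Assumption: the data term is quadratic and separable per coordinate,
    so the L1-penalised offset is the soft-threshold of (estimate - prior).
    """
    offsets = [e - p for e, p in zip(estimate, prior)]
    return [p + d for p, d in zip(prior, soft_threshold(offsets, lam))]


if __name__ == "__main__":
    # Only coordinates whose offset exceeds lam in magnitude are changed.
    print(sparse_update([1.0, 1.0, 1.0], [1.25, 3.0, 0.0], lam=0.5))
```

Coordinates with small offsets stay exactly at their prior values, which is the sense in which the update is sparse; a larger `lam` leaves more parameters untouched.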

SVM Based Speaker Verification Using Sparse Maximum A Posteriori Adaptation

Kim, Younggwan;Roh, Jaeyoung;Kim, Hoirin
IEIE Transactions on Smart Processing and Computing / v.2 no.5 / pp.277-281 / 2013
Modern speaker verification systems based on support vector machines (SVMs) use Gaussian mixture model (GMM) supervectors as their input feature vectors, and maximum a posteriori (MAP) adaptation is the conventional method for generating speaker-dependent GMMs by adapting a universal background model (UBM). MAP adaptation requires an appropriate amount of input utterances because of the number of model parameters to be estimated. With limited utterances, MAP adaptation can be unreliable, which causes adaptation noise even though the Bayesian priors used in MAP adaptation smooth the movements between the UBM and the speaker-dependent GMMs. This paper proposes a sparse MAP adaptation method, which is known to perform well in the automatic speech recognition area. Introducing sparse MAP adaptation to the GMM-SVM-based speaker verification system mitigates the adaptation noise effectively. The proposed method uses the L0 norm as a regularizer to induce sparsity. Experimental results on the TIMIT database show that the sparse MAP-based GMM-SVM speaker verification system yields a 42.6% relative reduction in the equal error rate with few additional computations.
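
For readers unfamiliar with the conventional (non-sparse) MAP step that the abstract above builds on, the following is a minimal sketch of relevance-factor MAP adaptation of GMM means from a UBM. It assumes diagonal covariances and adapts means only; it does not implement the paper's L0-regularized sparse variant. The function names and the default `relevance=16.0` are illustrative assumptions, not values taken from the paper.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Return MAP-adapted means: alpha * data_mean + (1 - alpha) * ubm_mean."""
    num_comp, dim = len(means), len(means[0])
    counts = [0.0] * num_comp
    sums = [[0.0] * dim for _ in range(num_comp)]
    for x in frames:
        post = responsibilities(x, weights, means, variances)
        for k, gamma in enumerate(post):
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]
    adapted = []
    for k in range(num_comp):
        if counts[k] == 0.0:
            adapted.append(list(means[k]))  # no data: keep the UBM mean
            continue
        alpha = counts[k] / (counts[k] + relevance)
        adapted.append([
            alpha * (sums[k][d] / counts[k]) + (1.0 - alpha) * means[k][d]
            for d in range(dim)
        ])
    return adapted
```

With few frames, `alpha` stays small and the adapted means remain close to the UBM; components that receive little posterior mass barely move, which is the smoothing behaviour of the Bayesian prior that the abstract refers to.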
