• Title/Abstract/Keywords: adaptation methods

Search results: 1,146 items (processing time: 0.025 s)

Performance Comparison and Duration Model Improvement of Speaker Adaptation Methods in HMM-based Korean Speech Synthesis

  • 이혜민;김형순
    • Phonetics and Speech Sciences / Vol. 4, No. 3 / pp.111-117 / 2012
  • In this paper, we compare the performance of several speaker adaptation methods for an HMM-based Korean speech synthesis system with small amounts of adaptation data. According to objective and subjective evaluations, a hybrid method of constrained structural maximum a posteriori linear regression (CSMAPLR) and maximum a posteriori (MAP) adaptation performs better than the other methods when only five minutes of adaptation data are available for the target speaker. During the objective evaluation, we find that the duration models are not adapted to the target speaker as well as the spectral envelope and pitch models are. To alleviate this problem, we propose a duration rectification method and a duration interpolation method. Both the objective and subjective evaluations show that incorporating the two proposed methods into the conventional speaker adaptation method improves the performance of duration model adaptation.

Adaptation in Families of Children with Down Syndrome: A Mixed-methods Design

  • 최현경
    • Journal of Korean Academy of Nursing / Vol. 45, No. 4 / pp.501-512 / 2015
  • Purpose: The purpose of this study, which was guided by the Resiliency Model of Family Stress, Adjustment, and Adaptation, was twofold: (a) to explore family and parental adaptation and the factors influencing family adaptation in Korean families of children with Down syndrome (DS) through a quantitative methodology and (b) to understand life with a Korean child with DS through a qualitative method. Methods: A mixed-methods design was adopted. A total of 147 parents of children with DS completed a package of questionnaires, and 19 parents participated in in-depth interviews. Quantitative and qualitative data were analyzed using stepwise multiple regression and content analysis, respectively. Results: According to the quantitative data, the overall family adaptation scores indicated average family functioning. Financial status was an important variable in understanding both family and parental adaptation. Family adaptation was best explained by family problem solving and coping communication, condition management ability, and family hardiness. Family strains and family hardiness were the family factors with the most influence on parental adaptation. Qualitative data analysis showed that family life with a child with DS encompassed both positive and negative aspects and was expressed in 5 themes, 10 categories, and 16 sub-categories. Conclusion: The results of this study expand our limited knowledge and understanding of families of children with DS in Korea and can be used to develop effective interventions to improve the adaptation of the family as a unit as well as parental adaptation.

Fast speaker adaptation using extended diagonal linear transformation for deep neural networks

  • Kim, Donghyun;Kim, Sanghun
    • ETRI Journal / Vol. 41, No. 1 / pp.109-116 / 2019
  • This paper explores new techniques that are based on a hidden-layer linear transformation for fast speaker adaptation used in deep neural networks (DNNs). Conventional methods using affine transformations are ineffective because they require a relatively large number of parameters to perform. Meanwhile, methods that employ singular-value decomposition (SVD) are utilized because they are effective at reducing adaptive parameters. However, a matrix decomposition is computationally expensive when using online services. We propose the use of an extended diagonal linear transformation method to minimize adaptation parameters without SVD to increase the performance level for tasks that require smaller degrees of adaptation. In Korean large vocabulary continuous speech recognition (LVCSR) tasks, the proposed method shows significant improvements with error-reduction rates of 8.4% and 17.1% in five and 50 conversational sentence adaptations, respectively. Compared with the adaptation methods using SVD, there is an increased recognition performance with fewer parameters.

Language Model Adaptation Based on Topic Probability of Latent Dirichlet Allocation

  • Jeon, Hyung-Bae;Lee, Soo-Young
    • ETRI Journal / Vol. 38, No. 3 / pp.487-493 / 2016
  • Two new methods are proposed for unsupervised adaptation of a language model (LM) with a single sentence for automatic transcription tasks. In the training phase, training documents are clustered using latent Dirichlet allocation (LDA), and a domain-specific LM is then trained for each cluster. In the test phase, the adapted LM is formed as a linear mixture of the trained domain-specific LMs. Unlike previous adaptation methods, the proposed methods fully utilize the trained LDA model to estimate the weight values assigned to the trained domain-specific LMs; therefore, the clustering and weight-estimation algorithms of the trained LDA model are reliable. In continuous speech recognition benchmark tests, the proposed methods outperform other unsupervised LM adaptation methods based on latent semantic analysis, non-negative matrix factorization, and LDA with n-gram counting.
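
The linear-mixture step described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy unigram tables, the function name `mix_language_models`, and the weights, which in the paper would come from the trained LDA model but are simply given as numbers below. It is not the authors' implementation.

```python
def mix_language_models(domain_lms, weights):
    """Return the linear mixture sum_k w_k * P_k(word) over the joint vocabulary.

    domain_lms: list of dicts mapping word -> probability (toy unigram LMs).
    weights:    one non-negative weight per LM; normalized here to sum to 1.
    """
    if len(domain_lms) != len(weights):
        raise ValueError("need exactly one weight per domain LM")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    normalized = [w / total for w in weights]
    vocabulary = set().union(*domain_lms)
    return {
        word: sum(w * lm.get(word, 0.0) for w, lm in zip(normalized, domain_lms))
        for word in vocabulary
    }


# Hypothetical domain-specific unigram LMs (each sums to 1).
sports_lm = {"goal": 0.5, "match": 0.4, "stock": 0.1}
finance_lm = {"goal": 0.1, "match": 0.1, "stock": 0.8}

# Hypothetical topic weights for one test sentence (assumed, not estimated here).
topic_weights = [0.25, 0.75]

adapted_lm = mix_language_models([sports_lm, finance_lm], topic_weights)
```

Because each component LM sums to 1 and the weights are normalized, the mixture is again a probability distribution; the weight estimation itself is the part the paper contributes and is not reproduced here.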

The Comparison of Characteristics in various Speaker Adaptation Methods

  • 황영수
    • Acoustical Society of Korea: Conference Proceedings / Proceedings of the Acoustical Society of Korea 1998 Conference, Vol. 17, No. 2 / pp.339-342 / 1998
  • In this paper, we propose several speaker adaptation methods and study their performance. The methods studied are MAPE (maximum a posteriori probability estimation) and ARTMAP. To evaluate the performance of these methods, we used Korean isolated digits as the experimental data. The hybrid speaker adaptation method, which unified MAPE, linear spectral estimation, and the output probability of SCHMM, showed better recognition results than the other methods. The method using ARTMAP showed results similar to those of the hybrid method.


On Speaker Adaptations with Sparse Training Data for Improved Speaker Verification

  • Ahn, Sung-Joo;Kang, Sun-Mee;Ko, Han-Seok
    • Speech Sciences / Vol. 7, No. 1 / pp.31-37 / 2000
  • This paper concerns effective speaker adaptation methods for solving the over-training problem in speaker verification, which frequently occurs when a speaker is modeled with sparse training data. While various speaker adaptation methods have already been applied to speech recognition, they have not yet been formally considered in speaker verification. This paper proposes speaker adaptation methods that combine MAP and MLLR adaptation, both of which are used successfully in speech recognition, and applies them to speaker verification. Experimental results show that the speaker verification system using weighted MAP and MLLR adaptation outperforms conventional speaker models without adaptation by a factor of up to 5. These results show that the speaker adaptation method achieves significantly better performance even when only a small amount of training data is available for speaker verification.
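
As a rough illustration of why MAP adaptation helps with sparse data, the sketch below shows the textbook MAP update of a single Gaussian mean, which shrinks the estimate toward a prior (speaker-independent) mean when few samples are available. It covers only the MAP part, uses scalar features and hard assignment for simplicity, and the function name `map_adapt_mean` and the value of the relevance factor `tau` are assumptions. The weighted MAP and MLLR combination proposed in the paper is not specified in the abstract and is not reproduced here.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean with a conjugate prior centered on prior_mean.

    Computes (tau * prior_mean + sum(samples)) / (tau + n).
    tau is the relevance factor (assumed value): a larger tau keeps the
    estimate closer to the prior; with n = 0 the prior mean is returned.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    n = len(samples)
    return (tau * prior_mean + sum(samples)) / (tau + n)


# With two adaptation samples the estimate moves only part of the way
# from the prior mean (0.0) toward the sample mean (1.0).
few_samples = map_adapt_mean(0.0, [1.0, 1.0], tau=2.0)

# With many samples the estimate approaches the sample mean.
many_samples = map_adapt_mean(0.0, [1.0] * 1000, tau=2.0)
```

With sparse data the prior dominates, which is the smoothing behavior that counters over-training; as data accumulates, the estimate converges to the ordinary sample mean.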


Improving Adversarial Domain Adaptation with Mixup Regularization

  • Bayarchimeg Kalina;Youngbok Cho
    • Journal of information and communication convergence engineering / Vol. 21, No. 2 / pp.139-144 / 2023
  • Engineers prefer deep neural networks (DNNs) for solving computer vision problems. However, DNNs pose two major problems. First, neural networks require large amounts of well-labeled data for training. Second, the covariate shift problem is common in computer vision problems. Domain adaptation has been proposed to mitigate this problem. Recent work on adversarial-learning-based unsupervised domain adaptation (UDA) has explained transferability and enabled the model to learn robust features. Despite this advantage, current methods do not guarantee the distinguishability of the latent space unless they consider class-aware information of the target domain. Furthermore, source and target examples alone cannot efficiently extract domain-invariant features from the encoded spaces. To alleviate these problems of existing UDA methods, we propose mixup regularization in the adversarial discriminative domain adaptation (ADDA) method. We validated the effectiveness and generality of the proposed method by performing experiments under three adaptation scenarios: MNIST to USPS, SVHN to MNIST, and MNIST to MNIST-M.
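
For readers unfamiliar with mixup, the following is a minimal sketch of the generic operation: a convex combination of two examples with a mixing coefficient drawn from a Beta distribution. The function name `mixup`, the value `alpha=0.2`, and the toy feature vectors are assumptions. How the paper applies mixup inside ADDA (which examples are mixed and which loss is regularized) is not stated in the abstract, so none of that is shown.

```python
import random


def mixup(x_a, x_b, alpha=0.2, rng=None):
    """Return (mixed, lam) where mixed = lam * x_a + (1 - lam) * x_b.

    lam is sampled from Beta(alpha, alpha); alpha=0.2 is an assumed value.
    x_a and x_b are equal-length sequences of floats (toy feature vectors).
    """
    if len(x_a) != len(x_b):
        raise ValueError("inputs must have the same length")
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    return mixed, lam


# Hypothetical feature vectors, e.g. one from a source and one from a target domain.
source_example = [0.0, 1.0, 2.0]
target_example = [1.0, 1.0, 0.0]

mixed_example, lam = mixup(source_example, target_example, rng=random.Random(0))
```

Each coordinate of the mixed example lies between the corresponding coordinates of the two inputs, so the mixed samples fill in the region between existing examples.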

Performance Improvement of Rapid Speaker Adaptation Using Bias Compensation and Mean of Dimensional Eigenvoice Models

  • 박종세;김형순;송화전
    • The Journal of the Acoustical Society of Korea / Vol. 23, No. 5 / pp.383-389 / 2004
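  • In this paper, to improve the performance of eigenvoice-based rapid speaker adaptation when the training and recognition environments differ, we propose an eigenvoice adaptation method with bias compensation and a weighted-sum method using the mean of dimensional eigenvoice models. In vocabulary-independent word recognition experiments using the PBW 452 DB, the proposed methods yielded large performance improvements over the conventional eigenvoice method when a small amount of adaptation data was used. As the number of adaptation words was varied from 1 to 50, the eigenvoice adaptation method with bias compensation reduced the word error rate by about 22-30% relative to the conventional eigenvoice method. In addition, the eigenvoice adaptation method using the mean of dimensional eigenvoice models reduced the word error rate by up to 41% relative to the conventional eigenvoice method when a single word was used as adaptation data.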

Adaptation to Menopause and Use of Yangsaeng in Middle-aged Korean Women

  • 박혜숙;김애정
    • Korean Journal of Women Health Nursing / Vol. 16, No. 1 / pp.1-9 / 2010
  • Purpose: The study addressed the adaptation of middle-aged Korean women to menopause, including the use of Yangsaeng, a traditional health care regimen that incorporates specific principles and methods to promote health and prevent illness, with the aim of improving health and longevity. Methods: The participants were middle-aged women (40-59 years, n=171) residing in Seoul and Gyeong-Gi Province. Data were collected using a self-reported questionnaire. Menopausal period adaptation was measured by 29 items in four categories (physical, self-concept, role function, and inter-dependent). Yangsaeng was measured by 31 questionnaire items in eight categories (morality, mind, diet, activity and rest, exercise, sleep, seasonal, and sexuality). Results: Significant differences in menopausal adaptation were evident on the basis of participant education and income. There were significant differences in Yangsaeng in terms of participant education, nature of employment, and income. Menopausal adaptation was positively correlated with the use of Yangsaeng. Physical adaptation, self-concept adaptation, role function adaptation, and inter-dependent adaptation were positively correlated with morality Yangsaeng, mind Yangsaeng, and activity and rest Yangsaeng. Conclusion: Middle-aged Korean women who practice Yangsaeng may be better positioned to adapt to menopause. Yangsaeng may be an advantageous nursing intervention in this population.

A Comparative Study of Speaker Adaptation Methods for HMM-Based Speech Recognition

  • 구명완;은종관;이황수
    • The Journal of the Acoustical Society of Korea / Vol. 10, No. 3 / pp.37-43 / 1991
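  • In this paper, we compare the performance of speaker adaptation algorithms that operate in two stages in an HMM-based speech recognition system. The first stage consists of VQ adaptation methods that reduce the distance to the new speaker; among these, we compare label prototype adaptation, adaptation using a VQ codebook built from the adaptation speech, and adaptation using a mapped codebook. The second stage consists of HMM parameter adaptation methods that transform the HMM parameters for the new speaker; among these, we compare the Viterbi algorithm, the DTW algorithm, the iterative alignment algorithm, and the fuzzy histogram algorithm. The comparison shows that speaker adaptation based on the fuzzy histogram algorithm achieves the highest recognition rate.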
