• Title/Summary/Keyword: GMM System

Speaker Identification using Phonetic GMM (음소별 GMM을 이용한 화자식별)

  • Kwon Sukbong;Kim Hoi-Rin
    • Proceedings of the KSPS conference / 2003.10a / pp.185-188 / 2003
  • In this paper, we construct a phonetic GMM for a text-independent speaker identification system. The basic idea is to combine the advantages of the baseline GMM and the HMM: the GMM is better suited to text-independent speaker identification, whereas the HMM works better in text-dependent systems. The phonetic GMM represents a more sophisticated text-dependent speaker model built on a text-independent speaker model. In a speaker identification system, the phonetic GMM, which uses HMM-based speaker-independent phoneme recognition, yields better performance than the baseline GMM. In addition, an N-best recognition algorithm is used to reduce the computational complexity and to make the method applicable to new speakers.

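The scoring idea in the abstract above can be illustrated with a minimal sketch. It assumes that a phoneme recognizer has already assigned a phoneme label to every frame and that each speaker has one diagonal-covariance GMM per phoneme; the function names, the toy models, and the data are hypothetical and are not taken from the paper.

```python
import math

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    log_terms = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        log_terms.append(log_p)
    # log-sum-exp over components for numerical stability
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

def identify_speaker(frames, phone_labels, speaker_models):
    """Pick the speaker whose phoneme-specific GMMs best explain the frames.

    speaker_models maps speaker -> {phoneme -> gmm} (hypothetical layout).
    """
    scores = {}
    for speaker, phone_gmms in speaker_models.items():
        scores[speaker] = sum(
            gmm_log_likelihood(x, phone_gmms[ph])
            for x, ph in zip(frames, phone_labels)
        )
    return max(scores, key=scores.get), scores

# Toy example: two speakers, two phonemes, 1-D features.
speaker_models = {
    "spk_a": {"a": [(1.0, [0.0], [1.0])], "s": [(1.0, [5.0], [1.0])]},
    "spk_b": {"a": [(1.0, [2.0], [1.0])], "s": [(1.0, [7.0], [1.0])]},
}
frames = [[0.1], [4.8], [-0.2]]
phone_labels = ["a", "s", "a"]  # assumed output of a phoneme recognizer
best, scores = identify_speaker(frames, phone_labels, speaker_models)
print(best)
```

A baseline GMM would score all frames against a single model per speaker; the only change in this sketch is that the model is selected per frame by its phoneme label.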

Performance Enhancement of Speaker Identification System Based on GMM Using the Modified EM Algorithm (수정된 EM알고리즘을 이용한 GMM 화자식별 시스템의 성능향상)

  • Kim, Seong-Jong;Chung, Ik-Joo
    • Speech Sciences / v.12 no.4 / pp.31-42 / 2005
  • Recently, the Gaussian Mixture Model (GMM), a special form of the CHMM, has been applied to speaker identification, and it has been shown that the GMM outperforms the CHMM. In this paper, speaker models based on the GMM and on a new GMM that uses a modified EM algorithm are introduced and evaluated for text-independent speaker identification. Various experiments were performed to evaluate the identification performance of the two algorithms. In these experiments, the GMM speaker model attained 94.6% identification accuracy using 40 seconds of training data and 32 mixtures, and 97.8% accuracy using 80 seconds of training data and 64 mixtures. The new GMM speaker model achieved 95.0% identification accuracy using 40 seconds of training data and 32 mixtures, and 98.2% accuracy using 80 seconds of training data and 64 mixtures. These results show that the new GMM outperforms the conventional GMM in speaker identification.

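For orientation, the standard EM update for a GMM can be sketched as below. This is the textbook algorithm on 1-D data, not the modified EM algorithm of the paper, whose details the abstract does not give; the function names, the initial values, and the synthetic data are assumptions made for illustration.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, init_means, n_iter=50, var_floor=1e-3):
    """Fit a 1-D GMM with standard EM; returns (weights, means, variances).

    Sketch only: there are no guards against empty components.
    """
    k, n = len(init_means), len(data)
    weights = [1.0 / k] * k
    means = list(init_means)
    variances = [1.0] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)  # floor avoids collapsing components
    return weights, means, variances

# Synthetic two-cluster data (hypothetical, for illustration only).
random.seed(0)
data = ([random.gauss(-2.0, 0.5) for _ in range(200)]
        + [random.gauss(3.0, 0.5) for _ in range(200)])
weights, means, variances = em_gmm_1d(data, init_means=[-1.0, 1.0])
print(weights, means, variances)
```

In a speaker identification setting, one such model would be trained per speaker on that speaker's feature vectors, with multivariate components instead of the scalar ones used here.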

GMM based Speaker Identification using Pitch Information (피치 정보를 이용한 GMM 기반의 화자 식별)

  • Park Taesun;Hahn Minsoo
    • MALSORI / no.47 / pp.121-129 / 2003
  • This paper describes the use of pitch information for speaker identification. The recognition system is GMM-based and uses a 4-connected Korean digit speech database. The mean of the pitch period in voiced sections of speech is shown to be useful for discriminating between speakers. Using this feature with a Gaussian mixture model in the speaker identification system gave a marked improvement, up to 6% over the baseline Gaussian mixture model.

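The pitch feature mentioned above, the mean pitch period over voiced frames, might be computed along the following lines. The abstract does not say which pitch estimator the authors used; this sketch assumes a plain autocorrelation peak search, and the sample rate, lag range, and synthetic frames are hypothetical.

```python
import math

def pitch_period(frame, min_lag, max_lag):
    """Lag (in samples) that maximizes the autocorrelation in [min_lag, max_lag]."""
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag

def mean_pitch_period(voiced_frames, min_lag, max_lag):
    """Average pitch period (in samples) over frames already marked as voiced."""
    periods = [pitch_period(f, min_lag, max_lag) for f in voiced_frames]
    return sum(periods) / len(periods)

# Synthetic voiced frames: pure tones at 100 Hz and 80 Hz, sampled at 8 kHz.
sample_rate = 8000
def tone(f0, n_samples=400):
    return [math.sin(2 * math.pi * f0 * n / sample_rate) for n in range(n_samples)]

frames = [tone(100), tone(80)]
# Search lags 20..160 samples, i.e. pitch between 50 Hz and 400 Hz at 8 kHz.
mean_period = mean_pitch_period(frames, min_lag=20, max_lag=160)
print(mean_period / sample_rate * 1000, "ms")
```

The resulting scalar could be appended to the spectral feature vector or modeled separately; how the paper combines it with the GMM is not stated in the abstract.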

GMM-Based Maghreb Dialect Identification System

  • Nour-Eddine, Lachachi;Abdelkader, Adla
    • Journal of Information Processing Systems / v.11 no.1 / pp.22-38 / 2015
  • While Modern Standard Arabic is the formal spoken and written language of the Arab world, dialects are the major mode of communication in everyday life. Identifying a speaker's dialect is therefore critical in the Arabic-speaking world for speech processing tasks such as automatic speech recognition or identification. In this paper, we examine two approaches that reduce the Universal Background Model (UBM) in an automatic dialect identification system across five Arabic Maghreb dialects: Moroccan, Tunisian, and three dialects from the western (Oranian), central (Algiersian), and eastern (Constantinian) regions of Algeria. We applied our approaches to a Maghreb dialect detection task consisting of a collection of 10-second utterances, and compared the precision obtained on the dialect samples by a baseline GMM-UBM system with that of our own improved GMM-UBM system, which uses a Reduced UBM algorithm. Our experiments show that our approaches significantly improve identification performance over purely acoustic features, with an identification rate of 80.49%.
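As background on the GMM-UBM baseline named above: in a typical GMM-UBM system, each class model is derived from the UBM by MAP adaptation of the component means. The abstract does not describe the authors' setup or their Reduced UBM algorithm, so the sketch below shows only the conventional mean adaptation on 1-D data; the relevance factor of 16 and all names are assumptions.

```python
import math

def map_adapt_means(data, ubm, relevance=16.0):
    """MAP-adapt the means of a 1-D UBM to the given data.

    ubm is a list of (weight, mean, variance) tuples (hypothetical layout).
    Returns the adapted means; weights and variances are left unchanged.
    """
    k = len(ubm)
    counts = [0.0] * k      # soft counts n_j
    first = [0.0] * k       # first-order statistics: sum of posterior * x
    for x in data:
        joint = [w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                 for w, m, v in ubm]
        total = sum(joint)
        for j in range(k):
            post = joint[j] / total
            counts[j] += post
            first[j] += post * x
    adapted = []
    for j, (_, mean, _) in enumerate(ubm):
        alpha = counts[j] / (counts[j] + relevance)   # data-dependent mixing factor
        data_mean = first[j] / counts[j] if counts[j] > 0 else mean
        adapted.append(alpha * data_mean + (1 - alpha) * mean)
    return adapted

# Toy UBM with two well-separated components; all data fall near the first one.
ubm = [(0.5, 0.0, 1.0), (0.5, 10.0, 1.0)]
adapted = map_adapt_means([1.0] * 16, ubm)
print(adapted)
```

Components that see little data stay close to the UBM, which is why the size and composition of the UBM matter for the adapted dialect models.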

Effective Recognition of Velopharyngeal Insufficiency (VPI) Patient's Speech Using DNN-HMM-based System (DNN-HMM 기반 시스템을 이용한 효과적인 구개인두부전증 환자 음성 인식)

  • Yoon, Ki-mu;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.23 no.1 / pp.33-38 / 2019
  • This paper proposes an effective method for recognizing the speech of VPI patients using a DNN-HMM-based speech recognition system, and evaluates its recognition performance against a GMM-HMM-based system. The proposed method employs a speaker adaptation technique to improve VPI speech recognition. We propose using simulated VPI speech to generate a prior model for speaker adaptation, together with selective learning of the DNN weight matrices, in order to make effective use of the small amount of VPI speech available for model adaptation. We also apply a Linear Input Network (LIN) based model adaptation technique to the DNN model. The proposed speaker adaptation method brings a 2.35% improvement in average accuracy over the GMM-HMM-based ASR system. The experimental results demonstrate that the proposed DNN-HMM-based speech recognition system is more effective for VPI speech with a small amount of speech data than the conventional GMM-HMM system.

Driver Verification System Using Biometrical GMM Supervector Kernel (생체기반 GMM Supervector Kernel을 이용한 운전자검증 기술)

  • Kim, Hyoung-Gook
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.9 no.3 / pp.67-72 / 2010
  • This paper presents a biometric driver verification system, evaluated in an in-car experiment, that analyzes speech and face information. We use Mel-scale Frequency Cepstral Coefficients (MFCCs) for speaker verification from speech. For face verification, the face region is detected by the AdaBoost algorithm, and a dimension-reduced feature vector is extracted from the face region alone using principal component analysis. We apply the extracted speech and face feature vectors to an SVM kernel with a Gaussian Mixture Model (GMM) supervector. The experimental results of the proposed approach show a clear improvement over a simple GMM or SVM approach.
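To make the term "GMM supervector" concrete: a common construction stacks the adapted component means of a GMM into one long vector, scaled by the component weights and variances, so that a linear kernel between two supervectors can be fed to an SVM. The abstract does not specify the kernel the author used; the sketch below assumes this common mean-supervector form with diagonal covariances, and all names and numbers are hypothetical.

```python
import math

def supervector(means, weights, variances):
    """Stack GMM component means into one vector.

    Each mean is scaled by sqrt(weight) / sqrt(variance), per dimension.
    means and variances are lists of per-component lists; weights is a flat list.
    """
    sv = []
    for mu, w, var in zip(means, weights, variances):
        sv.extend(math.sqrt(w) * m / math.sqrt(v) for m, v in zip(mu, var))
    return sv

def linear_kernel(a, b):
    """Inner product of two supervectors, usable as an SVM kernel value."""
    return sum(x * y for x, y in zip(a, b))

# Toy 2-component, 2-dimensional model (shared weights and variances).
weights = [0.5, 0.5]
variances = [[1.0, 4.0], [1.0, 1.0]]
means_a = [[1.0, 2.0], [3.0, 4.0]]
sv_a = supervector(means_a, weights, variances)
k_aa = linear_kernel(sv_a, sv_a)
print(len(sv_a), k_aa)
```

The supervector has one entry per component and feature dimension, so an utterance of any length maps to a fixed-length input for the SVM.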

A Study on the Macroeconomic Effects of Trade Insurance Using Dynamic Panel Models (동태적 패널모형을 통한 무역보험의 거시경제효과 연구)

  • Nam, Sang Wook
    • THE INTERNATIONAL COMMERCE & LAW REVIEW / v.61 / pp.165-190 / 2014
  • The purpose of this study is to measure the macroeconomic effects of trade insurance by analyzing the causality between major economic variables (GDP per capita, market interest rate, inflation, unemployment rate, exchange rate) and a trade insurance variable. I conducted empirical analyses using first-difference GMM (Generalized Method of Moments), System GMM, and a panel VAR model, with panel data from 11 countries (Korea, United States, Japan, BRICs, Indonesia, Singapore, Hong Kong, Vietnam) between 1992 and 2011. There are several important findings. Above all, trade insurance is positively and significantly related to GDP. This result shows that trade insurance serves to increase economic growth; in other words, it leads to economic growth by helping to increase GDP per capita. In particular, trade insurance is negatively related to the unemployment rate, indicating that it contributes to lowering unemployment. Trade insurance also helps to control inflation: it is confirmed to contribute to price stability, which in turn serves to stabilize the overall economy. Finally, this research finds that as market uncertainty increases, reflected in a rising exchange rate, increasing the supply of trade insurance stabilizes the exchange rate.


On-Road Car Detection System Using VD-GMM 2.0 (차량검출 GMM 2.0을 적용한 도로 위의 차량 검출 시스템 구축)

  • Lee, Okmin;Won, Insu;Lee, Sangmin;Kwon, Jangwoo
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.11 / pp.2291-2297 / 2015
  • This paper presents a vehicle detection system that takes as input video containing moving vehicles. The input image has constraints: it must be captured from a fixed viewpoint looking obliquely downward from above the road. Road detection is required so that only the road area of the input image is used. In the introduction, we present the experimental results and limitations of the motion history image extraction method, the SIFT (Scale-Invariant Feature Transform) algorithm, and histogram analysis for vehicle detection. To solve these problems, we propose an adapted Gaussian Mixture Model (GMM), the Vehicle Detection GMM (VDGMM). In addition, we optimize VDGMM to detect more vehicles and name the result VDGMM 2.0. In the experiments, precision, recall, and F1 are 9%, 53%, and 15% for GMM without road detection, and 85%, 77%, and 80% for VDGMM 2.0 with road detection.
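The F1 figures quoted above can be sanity-checked from the reported precision and recall, since F1 is their harmonic mean. The short sketch below does this; the function name is hypothetical, and the inputs are the rounded percentages from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values as reported in the abstract.
gmm_without_road = f1_score(0.09, 0.53)
vdgmm2_with_road = f1_score(0.85, 0.77)
print(f"{gmm_without_road:.3f} {vdgmm2_with_road:.3f}")
```

The computed values are roughly 0.15 and 0.81, close to the reported 15% and 80%; a small gap is expected because the published precision and recall are themselves rounded.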

Context Recognition Using Environmental Sound for Client Monitoring System (피보호자 모니터링 시스템을 위한 환경음 기반 상황 인식)

  • Ji, Seung-Eun;Jo, Jun-Yeong;Lee, Chung-Keun;Oh, Siwon;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.2 / pp.343-350 / 2015
  • This paper presents a context recognition method using environmental sound signals, applied to a mobile-based client monitoring system. Seven acoustic contexts are defined, and the corresponding environmental sound signals are collected for the experiments. To evaluate context recognition performance, MFCC and LPCC are employed for feature extraction, and statistical pattern recognition is performed with GMM and HMM as acoustic models. The experimental results show that LPCC and HMM are more effective at improving context recognition accuracy than MFCC and GMM, respectively. The recognition system using LPCC and HMM achieves 96.03% recognition accuracy. These results demonstrate that LPCC is effective at representing environmental sounds, which contain a wider variety of frequency components than human speech. They also show that HMM models time-varying environmental sounds more effectively than GMM.

A study on user defined spoken wake-up word recognition system using deep neural network-hidden Markov model hybrid model (Deep neural network-hidden Markov model 하이브리드 구조의 모델을 사용한 사용자 정의 기동어 인식 시스템에 관한 연구)

  • Yoon, Ki-mu;Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.39 no.2 / pp.131-136 / 2020
  • A Wake-Up Word (WUW) is a short utterance used to switch a speech recognizer into recognition mode. A WUW defined by the user who actually uses the speech recognizer is called a user-defined WUW. In this paper, to recognize user-defined WUWs, we construct systems based on a traditional Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), Linear Discriminant Analysis (LDA)-GMM-HMM, and LDA-Deep Neural Network (DNN)-HMM, and compare their performance. In addition, to improve the recognition accuracy of the WUW system, a threshold method is applied to each model, which significantly reduces the WUW recognition error rate and the non-WUW rejection failure rate simultaneously. For the LDA-DNN-HMM system, when the WUW error rate is 9.84%, the non-WUW rejection failure rate is 0.0058%, about 4.82 times lower than that of the LDA-GMM-HMM system. These results demonstrate that the LDA-DNN-HMM model developed in this paper is highly effective for constructing a user-defined WUW recognition system.
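The two error measures in the abstract above can be illustrated with a small sketch of threshold-based acceptance. It assumes that the recognizer emits one confidence score per utterance and that an utterance is accepted as the WUW when its score reaches the threshold; the score definition, the function name, and the toy scores are hypothetical and not taken from the paper.

```python
def error_rates(wuw_scores, non_wuw_scores, threshold):
    """Return (WUW error rate, non-WUW rejection failure rate) as fractions.

    A WUW utterance scoring below the threshold counts as a WUW error;
    a non-WUW utterance scoring at or above it counts as a rejection failure.
    """
    missed = sum(1 for s in wuw_scores if s < threshold)
    accepted = sum(1 for s in non_wuw_scores if s >= threshold)
    return missed / len(wuw_scores), accepted / len(non_wuw_scores)

# Toy confidence scores (illustrative only).
wuw_scores = [0.9, 0.8, 0.4, 0.7]
non_wuw_scores = [0.1, 0.3, 0.6, 0.2]
for threshold in (0.5, 0.65):
    print(threshold, error_rates(wuw_scores, non_wuw_scores, threshold))
```

Sweeping the threshold over held-out scores gives the operating points at which the two rates can be compared across models, which is how a figure such as "rejection failure rate at a fixed WUW error rate" is usually obtained.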