• Title/Summary/Keyword: GMM Analysis

Search Result 135, Processing Time 0.025 seconds

Speaker Identification Using GMM Based on LPCA (LPCA에 기반한 GMM을 이용한 화자 식별)

  • Seo, Chang-Woo;Lee, Youn-Jeong;Lee, Ki-Yong
    • Speech Sciences
    • /
    • v.12 no.2
    • /
    • pp.171-182
    • /
    • 2005
  • An efficient GMM (Gaussian mixture modeling) method based on LPCA (local principal component analysis) with VQ (vector quantization) for speaker identification is proposed. To reduce the dimension and correlation of the feature vector, this paper proposes a speaker identification method based on principal component analysis. The proposed method firstly partitions the data space into several disjoint regions by VQ, and then performs PCA in each region. Finally, the GMM for the speaker is obtained from the transformed feature vectors in each region. Compared to the conventional GMM method with diagonal covariance matrix, the proposed method requires less storage and complexity while maintaining the same performance requires less storage and shows faster results.

  • PDF

Speaker Verification with the Constraint of Limited Data

  • Kumari, Thyamagondlu Renukamurthy Jayanthi;Jayanna, Haradagere Siddaramaiah
    • Journal of Information Processing Systems
    • /
    • v.14 no.4
    • /
    • pp.807-823
    • /
    • 2018
  • Speaker verification system performance depends on the utterance of each speaker. To verify the speaker, important information has to be captured from the utterance. Nowadays under the constraints of limited data, speaker verification has become a challenging task. The testing and training data are in terms of few seconds in limited data. The feature vectors extracted from single frame size and rate (SFSR) analysis is not sufficient for training and testing speakers in speaker verification. This leads to poor speaker modeling during training and may not provide good decision during testing. The problem is to be resolved by increasing feature vectors of training and testing data to the same duration. For that we are using multiple frame size (MFS), multiple frame rate (MFR), and multiple frame size and rate (MFSR) analysis techniques for speaker verification under limited data condition. These analysis techniques relatively extract more feature vector during training and testing and develop improved modeling and testing for limited data. To demonstrate this we have used mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC) as feature. Gaussian mixture model (GMM) and GMM-universal background model (GMM-UBM) are used for modeling the speaker. The database used is NIST-2003. The experimental results indicate that, improved performance of MFS, MFR, and MFSR analysis radically better compared with SFSR analysis. The experimental results show that LPCC based MFSR analysis perform better compared to other analysis techniques and feature extraction techniques.

Analysis and Implementation of Speech/Music Classification for 3GPP2 SMV Based on GMM (3GPP2 SMV의 실시간 음성/음악 분류 성능 향상을 위한 Gaussian Mixture Model의 적용)

  • Song, Ji-Hyun;Lee, Kye-Hwan;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.8
    • /
    • pp.390-396
    • /
    • 2007
  • In this letter, we propose a novel approach to improve the performance of speech/music classification for the selectable mode vocoder(SMV) of 3GPP2 using the Gaussian mixture model(GMM) which is based on the expectation-maximization(EM) algorithm. We first present an effective analysis of the features and the classification method adopted in the conventional SMV. And then feature vectors which are applied to the GMM are selected from relevant Parameters of the SMV for the efficient speech/music classification. The performance of the proposed algorithm is evaluated under various conditions and yields better results compared with the conventional scheme of the SMV.

Driver Verification System Using Biometrical GMM Supervector Kernel (생체기반 GMM Supervector Kernel을 이용한 운전자검증 기술)

  • Kim, Hyoung-Gook
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.9 no.3
    • /
    • pp.67-72
    • /
    • 2010
  • This paper presents biometrical driver verification system in car experiment through analysis of speech, and face information. We have used Mel-scale Frequency Cesptral Coefficients (MFCCs) for speaker verification using speech information. For face verification, face region is detected by AdaBoost algorithm and dimension-reduced feature vector is extracted by using principal component analysis only from face region. In this paper, we apply the extracted speech- and face feature vectors to an SVM kernel with Gaussian Mixture Models(GMM) supervector. The experimental results of the proposed approach show a clear improvement compared to a simple GMM or SVM approach.

Detection of Pathological Voice Using Linear Discriminant Analysis

  • Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
    • MALSORI
    • /
    • no.64
    • /
    • pp.77-88
    • /
    • 2007
  • Nowadays, mel-frequency cesptral coefficients (MFCCs) and Gaussian mixture models (GMMs) are used for the pathological voice detection. This paper suggests a method to improve the performance of the pathological/normal voice classification based on the MFCC-based GMM. We analyze the characteristics of the mel frequency-based filterbank energies using the fisher discriminant ratio (FDR). And the feature vectors through the linear discriminant analysis (LDA) transformation of the filterbank energies (FBE) and the MFCCs are implemented. An accuracy is measured by the GMM classifier. This paper shows that the FBE LDA-based GMM is a sufficiently distinct method for the pathological/normal voice classification, with a 96.6% classification performance rate. The proposed method shows better performance than the MFCC-based GMM with noticeable improvement of 54.05% in terms of error reduction.

  • PDF

Efficient Speaker Identification based on Robust VQ-PCA (강인한 VQ-PCA에 기반한 효율적인 화자 식별)

  • Lee Ki-Yong
    • Journal of Internet Computing and Services
    • /
    • v.5 no.3
    • /
    • pp.57-62
    • /
    • 2004
  • In this paper, an efficient speaker identification based on robust vector quantizationprincipal component analysis (VQ-PCA) is proposed to solve the problems from outliers and high dimensionality of training feature vectors in speaker identification, Firstly, the proposed method partitions the data space into several disjoint regions by roust VQ based on M-estimation. Secondly, the robust PCA is obtained from the covariance matrix in each region. Finally, our method obtains the Gaussian Mixture model (GMM) for speaker from the transformed feature vectors with reduced dimension by the robust PCA in each region, Compared to the conventional GMM with diagonal covariance matrix, under the same performance, the proposed method gives faster results with less storage and, moreover, shows robust performance to outliers.

  • PDF

On-Road Car Detection System Using VD-GMM 2.0 (차량검출 GMM 2.0을 적용한 도로 위의 차량 검출 시스템 구축)

  • Lee, Okmin;Won, Insu;Lee, Sangmin;Kwon, Jangwoo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.40 no.11
    • /
    • pp.2291-2297
    • /
    • 2015
  • This paper presents a vehicle detection system using the video as a input image what has moving of vehicles.. Input image has constraints. it has to get fixed view and downward view obliquely from top of the road. Road detection is required to use only the road area in the input image. In introduction, we suggest the experiment result and the critical point of motion history image extraction method, SIFT(Scale_Invariant Feature Transform) algorithm and histogram analysis to detect vehicles. To solve these problem, we propose using applied Gaussian Mixture Model(GMM) that is the Vehicle Detection GMM(VDGMM). In addition, we optimize VDGMM to detect vehicles more and named VDGMM 2.0. In result of experiment, each precision, recall and F1 rate is 9%, 53%, 15% for GMM without road detection and 85%, 77%, 80% for VDGMM2.0 with road detection.

Enhancement Voiced/Unvoiced Sounds Classification for 3GPP2 SMV Employing GMM (3GPP2 SMV의 실시간 유/무성음 분류 성능 향상을 위한 Gaussian Mixture Model 기반 연구)

  • Song, Ji-Hyun;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.5
    • /
    • pp.111-117
    • /
    • 2008
  • In this paper, we propose an approach to improve the performance of voiced/unvoiced (V/UV) decision under background noise environments for the selectable mode vocoder (SMV) of 3GPP2. We first present an effective analysis of the features and the classification method adopted in the SMV. And then feature vectors which are applied to the GMM are selected from relevant parameters of the SMV for the efficient voiced/unvoiced classification. For the purpose of evaluating the performance of the proposed algorithm, different experiments were carried out under various noise environments and yields better results compared with the conventional scheme of the SMV.

A nonlinear transformation methods for GMM to improve over-smoothing effect

  • Chae, Yi Geun
    • Journal of Advanced Marine Engineering and Technology
    • /
    • v.38 no.2
    • /
    • pp.182-187
    • /
    • 2014
  • We propose nonlinear GMM-based transformation functions in an attempt to deal with the over-smoothing effects of linear transformation for voice processing. The proposed methods adopt RBF networks as a local transformation function to overcome the drawbacks of global nonlinear transformation functions. In order to obtain high-quality modifications of speech signals, our voice conversion is implemented using the Harmonic plus Noise Model analysis/synthesis framework. Experimental results are reported on the English corpus, MOCHA-TIMIT.

RPCA-GMM for Speaker Identification (화자식별을 위한 강인한 주성분 분석 가우시안 혼합 모델)

  • 이윤정;서창우;강상기;이기용
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.7
    • /
    • pp.519-527
    • /
    • 2003
  • Speech is much influenced by the existence of outliers which are introduced by such an unexpected happenings as additive background noise, change of speaker's utterance pattern and voice detection errors. These kinds of outliers may result in severe degradation of speaker recognition performance. In this paper, we proposed the GMM based on robust principal component analysis (RPCA-GMM) using M-estimation to solve the problems of both ouliers and high dimensionality of training feature vectors in speaker identification. Firstly, a new feature vector with reduced dimension is obtained by robust PCA obtained from M-estimation. The robust PCA transforms the original dimensional feature vector onto the reduced dimensional linear subspace that is spanned by the leading eigenvectors of the covariance matrix of feature vector. Secondly, the GMM with diagonal covariance matrix is obtained from these transformed feature vectors. We peformed speaker identification experiments to show the effectiveness of the proposed method. We compared the proposed method (RPCA-GMM) with transformed feature vectors to the PCA and the conventional GMM with diagonal matrix. Whenever the portion of outliers increases by every 2%, the proposed method maintains almost same speaker identification rate with 0.03% of little degradation, while the conventional GMM and the PCA shows much degradation of that by 0.65% and 0.55%, respectively This means that our method is more robust to the existence of outlier.