• Title/Summary/Keyword: Speaker independent

Search Result 235, Processing Time 0.027 seconds

Speaker Adaptation Using i-Vector Based Clustering

  • Kim, Minsoo;Jang, Gil-Jin;Kim, Ji-Hwan;Lee, Minho
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.7
    • /
    • pp.2785-2799
    • /
    • 2020
  • We propose a novel speaker adaptation method using acoustic model clustering. The similarity of different speakers is defined by the cosine distance between their i-vectors (intermediate vectors), and various efficient clustering algorithms are applied to obtain a number of speaker subsets with different characteristics. The speaker-independent model is then retrained with the training data of the individual speaker subsets grouped by the clustering results, and an unknown speech is recognized by the retrained model of the closest cluster. The proposed method is applied to a large-scale speech recognition system implemented by a hybrid hidden Markov model and deep neural network framework. An experiment was conducted to evaluate the word error rates using Resource Management database. When the proposed speaker adaptation method using i-vector based clustering was applied, the performance, as compared to that of the conventional speaker-independent speech recognition model, was improved relatively by as much as 12.2% for the conventional fully neural network, and by as much as 10.5% for the bidirectional long short-term memory.

Speaker Verification Using SVM Kernel with GMM-Supervector Based on the Mahalanobis Distance (Mahalanobis 거리측정 방법 기반의 GMM-Supervector SVM 커널을 이용한 화자인증 방법)

  • Kim, Hyoung-Gook;Shin, Dong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.3
    • /
    • pp.216-221
    • /
    • 2010
  • In this paper, we propose speaker verification method using Support Vector Machine (SVM) kernel with Gaussian Mixture Model (GMM)-supervector based on the Mahalanobis distance. The proposed GMM-supervector SVM kernel method is combined GMM with SVM. The GMM-supervectors are generated by GMM parameters of speaker and other speaker utterances. A speaker verification threshold of GMM-supervectors is decided by SVM kernel based on Mahalanobis distance to improve speaker verification accuracy. The experimental results for text-independent speaker verification using 20 speakers demonstrates the performance of the proposed method compared to GMM, SVM, GMM-supervector SVM kernel based on Kullback-Leibler (KL) divergence, and GMM-supervector SVM kernel based on Bhattacharyya distance.

On a Method Which Improves Text Independent Speaker Verification Performance through Limiting Speech Production Loudness (성량제한을 적용한 어구독립 화자증명 성능향상 방안)

  • 이태승;최호진
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2001.10b
    • /
    • pp.457-459
    • /
    • 2001
  • 지속음(continuants) 단위로 화자간 차이를 식별하는 어구독립 화자증명(text-independent speaker verification) 방식에서 입력음성의 성량을 제한하여 보다 높은 인식률을 달성할 수 있는 화자인식 방법을 제안한다.

  • PDF

A Study on Background Speaker Model Design for Portable Speaker Verification Systems (휴대용 화자확인시스템을 위한 배경화자모델 설계에 관한 연구)

  • Choi, Hong-Sub
    • Speech Sciences
    • /
    • v.10 no.2
    • /
    • pp.35-43
    • /
    • 2003
  • General speaker verification systems improve their recognition performances by normalizing log likelihood ratio, using a speaker model and its background speaker model that are required to be verified. So these systems rely heavily on the availability of much speaker independent databases for background speaker model design. This constraint, however, may be a burden in practical and portable devices such as palm-top computers or wireless handsets which place a premium on computations and memory. In this paper, new approach for the GMM-based background model design used in portable speaker verification system is presented when the enrollment data is available. This approach is to modify three parameters of GMM speaker model such as mixture weights, means and covariances along with reduced mixture order. According to the experiment on a 20 speaker population from YOHO database, we found that this method had a promise of effective use in a portable speaker verification system.

  • PDF

Speaker Verification Using Hidden LMS Adaptive Filtering Algorithm and Competitive Learning Neural Network (Hidden LMS 적응 필터링 알고리즘을 이용한 경쟁학습 화자검증)

  • Cho, Seong-Won;Kim, Jae-Min
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.51 no.2
    • /
    • pp.69-77
    • /
    • 2002
  • Speaker verification can be classified in two categories, text-dependent speaker verification and text-independent speaker verification. In this paper, we discuss text-dependent speaker verification. Text-dependent speaker verification system determines whether the sound characteristics of the speaker are equal to those of the specific person or not. In this paper we obtain the speaker data using a sound card in various noisy conditions, apply a new Hidden LMS (Least Mean Square) adaptive algorithm to it, and extract LPC (Linear Predictive Coding)-cepstrum coefficients as feature vectors. Finally, we use a competitive learning neural network for speaker verification. The proposed hidden LMS adaptive filter using a neural network reduces noise and enhances features in various noisy conditions. We construct a separate neural network for each speaker, which makes it unnecessary to train the whole network for a new added speaker and makes the system expansion easy. We experimentally prove that the proposed method improves the speaker verification performance.

A Study On Text Independent Speaker Recognition Using Eigenspace (고유영역을 이용한 문자독립형 화자인식에 관한 연구)

  • 함철배;이동규;이두수
    • Proceedings of the IEEK Conference
    • /
    • 1999.06a
    • /
    • pp.671-674
    • /
    • 1999
  • We report the new method for speaker recognition. Until now, many researchers have used HMM (Hidden Markov Model) with cepstral coefficient or neural network for speaker recognition. Here, we introduce the method of speaker recognition using eigenspace. This method can reduce the training and recognition time of speaker recognition system. In proposed method, we use the low rank model of the speech eigenspace. In experiment, we obtain good recognition result.

  • PDF

On using the LPC parameter for Speaker Identification (LPC에 의한 화자 식별)

  • 조병모
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1987.11a
    • /
    • pp.82-85
    • /
    • 1987
  • Preliminary results of using the LPC parameter for text-independent speaker identification problem are presented. The idetification process includes log likelihood ratio for distance measure and dynamic programming for time normalization. To generate the data base for experiments, ten times. Experimental results show 99.4% of identification accuracy, incorrect identification were made when the speaker uses a dialect.

  • PDF

Implementation of a Speaker-independent Speech Recognizer Using the TMS320F28335 DSP (TMS320F28335 DSP를 이용한 화자독립 음성인식기 구현)

  • Chung, Ik-Joo
    • Journal of Industrial Technology
    • /
    • v.29 no.A
    • /
    • pp.95-100
    • /
    • 2009
  • In this paper, we implemented a speaker-independent speech recognizer using the TMS320F28335 DSP which is optimized for control applications. For this implementation, we used a small-sized commercial DSP module and developed a peripheral board including a codec, signal conditioning circuits and I/O interfaces. The speech signal digitized by the TLV320AIC23 codec is analyzed based on MFCC feature extraction methed and recognized using the continuous-density HMM. Thanks to the internal SRAM and flash memory on the TMS320F28335 DSP, we did not need any external memory devices. The internal flash memory contains ADPCM data for voice response as well as HMM data. Since the TMS320F28335 DSP is optimized for control applications, the recognizer may play a good role in the voice-activated control areas in aspect that it can integrate speech recognition capability and inherent control functions into the single DSP.

  • PDF

Text-Independent Speaker Identification System Using Speaker Decision Network Based on Delayed Summing (지연누적에 기반한 화자결정회로망이 도입된 구문독립 화자인식시스템)

  • 이종은;최진영
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.8 no.2
    • /
    • pp.82-95
    • /
    • 1998
  • In this paper, we propose a text-independent speaker identification system which has a classifier composed of two parts; to calculate the degree of likeness of each speech frame and to select the most probable speaker from the entire speech duration. The first part is realized using RBFN which is selforganized through learning and in the second part the speaker is determined using a con-tbination of MAXNET and delayed summings. And we use features from linear speech production model and features from fractal geometry. Closed-set speaker identification experiments on 13 male homogeneous speakers show that the proposed techniques can achieve the identification ratio of 100% as the number of delays increases.

  • PDF

Speaker Normalization using Gaussian Mixture Model for Speaker Independent Speech Recognition (화자독립 음성인식을 위한 GMM 기반 화자 정규화)

  • Shin, Ok-Keun
    • The KIPS Transactions:PartB
    • /
    • v.12B no.4 s.100
    • /
    • pp.437-442
    • /
    • 2005
  • For the purpose of speaker normalization in speaker independent speech recognition systems, experiments are conducted on a method based on Gaussian mixture model(GMM). The method, which is an improvement of the previous study based on vector quantizer, consists of modeling the probability distribution of canonical feature vectors by a GMM with an appropriate number of clusters, and of estimating the warp factor of a test speaker by making use of the obtained probabilistic model. The purpose of this study is twofold: improving the existing ML based methods, and comparing the performance of what is called 'soft decision' method with that of the previous study based on vector quantizer. The effectiveness of the proposed method is investigated by recognition experiments on the TIMIT corpus. The experimental results showed that a little improvement could be obtained tv adjusting the number of clusters in GMM appropriately.