• Title/Summary/Keyword: Speaker Recognition

Search Result 554, Processing Time 0.025 seconds

Speaker Recognition using PCA in Driving Car Environments (PCA를 이용한 자동차 주행 환경에서의 화자인식)

  • Yu, Ha-Jin
    • Proceedings of the KSPS conference
    • /
    • 2005.04a
    • /
    • pp.103-106
    • /
    • 2005
  • The goal of our research is to build a text independent speaker recognition system that can be used in any condition without any additional adaptation process. The performance of speaker recognition systems can be severally degraded in some unknown mismatched microphone and noise conditions. In this paper, we show that PCA(Principal component analysis) without dimension reduction can greatly increase the performance to a level close to matched condition. The error rate is reduced more by the proposed augmented PCA, which augment an axis to the feature vectors of the most confusable pairs of speakers before PCA

  • PDF

Extraction of Speaker Recognition Parameter Using Chaos Dimension (카오스차원에 의한 화자식별 파라미터 추출)

  • Yoo, Byong-Wook;Kim, Chang-Seok
    • Speech Sciences
    • /
    • v.1
    • /
    • pp.285-293
    • /
    • 1997
  • This paper was constructed to investigate strange attractor in considering speech which is regarded as chaos in that the random signal appears in the deterministic raising system. This paper searches for the delay time from AR model power spectrum for constructing fit attractor for speech signal. As a result of applying Taken's embedding theory to the delay time, an exact correlation dimension solution is obtained. As a result of this consideration of speech, it is found that it has more speaker recognition characteristic parameter, and gains a large speaker discrimination recognition rate.

  • PDF

An improved spectrum mapping applied to speaker adaptive Kroean word recognition

  • Matsumoto, Hiroshi;Lee, Yong-Ju;Kim, Hoi-Rim;Kido, Ken'iti
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06a
    • /
    • pp.1009-1014
    • /
    • 1994
  • This paper improves the previously proposed spectral mapping method for supervised speaker adaptation in which a mapped spectrum is interpolated from speaker difference vectors at typical spectra based on a minimized distortion criterion. In estimating these difference vectors, it is important to find an appropriate number of typical points. The previous method empirically adjusts the number of typical points, while the present method optimizes the effective number by rank reduction of normal equation. This algorithm was applied to a supervised speaker adaptation for Korean word recognition using the templates form a prototype male speaker. The result showed that the rank reduction technique not only can automatically determine an optimal number of code vectors, but also slightly improves the recognition scores compared with those obtained by the previous method.

  • PDF

Development of a Door System by Speaker Verification Using Weighted Cepstrum and Single Average Pattern

  • Kyung, Youn-Jeong
    • The Journal of the Acoustical Society of Korea
    • /
    • v.15 no.2E
    • /
    • pp.60-68
    • /
    • 1996
  • In this paper, we implement the door lock system based on pattern matching technique for speaker recognition using DTW. In this study, major features of our system are summarized as follows:(1) Make the average reference pattern using DTW. This method keeps the high recognition rate compared with the other systems whose performances degrade rapidly as time goes on. (2) Use F-ratio values of the cepstral coefficients. We find that the weighted cepstral reveals an effect on intensifying the difference between th customer and the imposter. The system hardware is composed of two parts : the door lock part and the speaker recognition processing part. We use an 8051 microprocessor in the door lock park for serial communication with host processor to open or close the lock. Using our system, we obtain speaker recognition rate of about 99.5%.

  • PDF

Performance comparison of Text-Independent Speaker Recognizer Using VQ and GMM (VQ와 GMM을 이용한 문맥독립 화자인식기의 성능 비교)

  • Kim, Seong-Jong;Chung, Hoon;Chung, Ik-Joo
    • Speech Sciences
    • /
    • v.7 no.2
    • /
    • pp.235-244
    • /
    • 2000
  • This paper was focused on realizing the text-independent speaker recognizer using the VQ and GMM algorithm and studying the characteristics of the speaker recognizers that adopt these two algorithms. Because it was difficult ascertain the effect two algorithms have on the speaker recognizer theoretically, we performed the recognition experiments using various parameters and, as the result of the experiments, we could show that GMM algorithm had better recognition performance than VQ algorithm as following. The GMM showed better performance with small training data, and it also showed just a little difference of recognition rate as the kind of feature vectors and the length of input data vary. The GMM showed good recognition performance than the VQ on the whole.

  • PDF

Changes in Features of Korean Vowels with Age and Sex of Speakers and Their Recognition (한국어 단모음의 성별, 연령별 특징변화 및 인식)

  • 이용주;김경태;차균현
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.25 no.12
    • /
    • pp.1503-1512
    • /
    • 1988
  • As the basic analysis to solve the within-and cross-speaker variability in phoneme based speech recognition, changes in pitch and formant frequencies of 8 Korean vowels with age and sex of speaker has been investigated by analyzing a large number fo samples. Conclusions obtained are as follows: 1) Changes in pitch frequency with age and sex of speaker for children are hard to distinguish and the difference of before and after the voice change is analyzed approximately 0.2 oct. for female an 0.9 oct. for male. 2) While most of the formants of vowel considerably change with the age of speaker, the change becomes smaller as the age becomes older. 3) While there is an indirect correlation between pitch and formant with change in age, it is hard to see a direct correlation. 4) When the objects of the recognition experiment by pitch and formants are various speakers in each age and sex, pitch also works as an efficient recognition parameter.

  • PDF

An Amplitude Warping Approach to Intra-Speaker Normalization for Speech Recognition (음성인식에서 화자 내 정규화를 위한 진폭 변경 방법)

  • Kim Dong-Hyun;Hong Kwang-Seok
    • Journal of Internet Computing and Services
    • /
    • v.4 no.3
    • /
    • pp.9-14
    • /
    • 2003
  • The method of vocal tract normalization is a successful method for improving the accuracy of inter-speaker normalization. In this paper, we present an intra-speaker warping factor estimation based on pitch alteration utterance. The feature space distributions of untransformed speech from the pitch alteration utterance of intra-speaker would vary due to the acoustic differences of speech produced by glottis and vocal tract. The variation of utterance is two types: frequency and amplitude variation. The vocal tract normalization is frequency normalization among inter-speaker normalization methods. Therefore, we have to consider amplitude variation, and it may be possible to determine the amplitude warping factor by calculating the inverse ratio of input to reference pitch. k, the recognition results, the error rate is reduced from 0.4% to 2.3% for digit and word decoding.

  • PDF

Large Scale Voice Dialling using Speaker Adaptation (화자 적응을 이용한 대용량 음성 다이얼링)

  • Kim, Weon-Goo
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.16 no.4
    • /
    • pp.335-338
    • /
    • 2010
  • A new method that improves the performance of large scale voice dialling system is presented using speaker adaptation. Since SI (Speaker Independent) based speech recognition system with phoneme HMM uses only the phoneme string of the input sentence, the storage space could be reduced greatly. However, the performance of the system is worse than that of the speaker dependent system due to the mismatch between the input utterance and the SI models. A new method that estimates the phonetic string and adaptation vectors iteratively is presented to reduce the mismatch between the training utterances and a set of SI models using speaker adaptation techniques. For speaker adaptation the stochastic matching methods are used to estimate the adaptation vectors. The experiments performed over actual telephone line shows that proposed method shows better performance as compared to the conventional method. with the SI phonetic recognizer.

Rapid Speaker Adaptation for Continuous Speech Recognition Using Merging Eigenvoices (Eigenvoice 병합을 이용한 연속 음성 인식 시스템의 고속 화자 적응)

  • Choi, Dong-Jin;Oh, Yung-Hwan
    • MALSORI
    • /
    • no.53
    • /
    • pp.143-156
    • /
    • 2005
  • Speaker adaptation in eigenvoice space is a popular method for rapid speaker adaptation. To improve the performance of the method, the number of speaker dependent models should be increased and eigenvoices should be re-estimated. However, principal component analysis takes much time to find eigenvoices, especially in a continuous speech recognition system. This paper describes a method to reduce computation time to estimate eigenvoices only for supplementary speaker dependent models and to merge them with the used eigenvoices. Experiment results show that the computation time is reduced by 73.7% while the performance is almost the same in case that the number of speaker dependent models is the same as used ones.

  • PDF

A Study on the Isolated word Recognition Using One-Stage DMS/DP for the Implementation of Voice Dialing System

  • Seong-Kwon Lee
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06a
    • /
    • pp.1039-1045
    • /
    • 1994
  • The speech recognition systems using VQ have usually the problem decreasing recognition rate, MSVQ assigning the dissimilar vectors to a segment. In this paper, applying One-stage DMS/DP algorithm to the recognition experiments, we can solve these problems to what degree. Recognition experiment is peformed for Korean DDD area names with DMS model of 20 sections and word unit template. We carried out the experiment in speaker dependent and speaker independent, and get a recognition rates of 97.7% and 81.7% respectively.

  • PDF