• Title/Summary/Keyword: Speaker independent

Search Result 235, Processing Time 0.023 seconds

Robust Speaker Identification using Independent Component Analysis (독립성분 분석을 이용한 강인한 화자식별)

  • Jang, Gil-Jin;Oh, Yung-Hwan
    • Journal of KIISE:Software and Applications
    • /
    • v.27 no.5
    • /
    • pp.583-592
    • /
    • 2000
  • This paper proposes feature parameter transformation method using independent component analysis (ICA) for speaker identification. The proposed method assumes that the cepstral vectors from various channel-conditioned speech are constructed by a linear combination of some characteristic functions with random channel noise added, and transforms them into new vectors using ICA. The resultant vector space can give emphasis to the repetitive speaker information and suppress the random channel distortions. Experimental results show that the transformation method is effective for the improvement of speaker identification system.

  • PDF

Performance comparison of Text-Independent Speaker Recognizer Using VQ and GMM (VQ와 GMM을 이용한 문맥독립 화자인식기의 성능 비교)

  • Kim, Seong-Jong;Chung, Hoon;Chung, Ik-Joo
    • Speech Sciences
    • /
    • v.7 no.2
    • /
    • pp.235-244
    • /
    • 2000
  • This paper was focused on realizing the text-independent speaker recognizer using the VQ and GMM algorithm and studying the characteristics of the speaker recognizers that adopt these two algorithms. Because it was difficult ascertain the effect two algorithms have on the speaker recognizer theoretically, we performed the recognition experiments using various parameters and, as the result of the experiments, we could show that GMM algorithm had better recognition performance than VQ algorithm as following. The GMM showed better performance with small training data, and it also showed just a little difference of recognition rate as the kind of feature vectors and the length of input data vary. The GMM showed good recognition performance than the VQ on the whole.

  • PDF

A study on the spoken digit recognition performance of the Two-Stage recurrent neural network (2단 회귀신경망의 숫자음 인식에관한 연구)

  • 안점영
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.25 no.3B
    • /
    • pp.565-569
    • /
    • 2000
  • We compose the two-stage recurrent neural network that returns both signals of a hidden and an output layer to the hidden layer. It is tested on the basis of syllables for Korean spoken digit from /gong/to /gu. For these experiments, we adjust the neuron number of the hidden layer, the predictive order of input data and self-recurrent coefficient of the decision state layer. By the experimental results, the recognition rate of this neural network is between 91% and 97.5% in the speaker-dependent case and between 80.75% and 92% in the speaker-independent case. In the speaker-dependent case, this network shows an equivalent recognition performance to Jordan and Elman network but in the speaker-independent case, it does improved performance.

  • PDF

Reduction of Dimension of HMM parameters in MLLR Framework for Speaker Adaptation (화자적응시스템을 위한 MLLR 알고리즘 연산량 감소)

  • Kim Ji Un;Jeong Jae Ho
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.123-126
    • /
    • 2003
  • We discuss how to reduce the number of inverse matrix and its dimensions requested in MLLR framework for speaker adaptation. To find a smaller set of variables with less redundancy, we employ PCA(principal component analysis) and ICA(independent component analysis) that would give as good a representation as possible. The amount of additional computation when PCA or ICA is applied is as small as it can be disregarded. The dimension of HMM parameters is reduced to about 1/3 ~ 2/7 dimensions of SI(speaker independent) model parameter with which speech recognition system represents word recognition rate as much as ordinary MLLR framework. If dimension of SI model parameter is n, the amount of computation of inverse matrix in MLLR is proportioned to O($n^4$). So, compared with ordinary MLLR, the amount of total computation requested in speaker adaptation is reduced to about 1/80~1/150.

  • PDF

On the Simple Speaker Verification System Using Tolerance Interval Analysis Without Background Speaker Models (Tolerance Interval Analysis를 이용한 배경화자 없는 간단한 화자인증시스템에 관한 연구)

  • Choi, Hong-Sub
    • MALSORI
    • /
    • no.56
    • /
    • pp.147-158
    • /
    • 2005
  • In this paper, we are focused to develop the simplified speaker verification algorithm without background speaker models, which will be adopted in the portable speaker verification system equipped in portable terminals such as mobile phone and PMP. According to the tolerance interval analysis, the population of someone's speaker model can be represented by a suitable number of selected independent samples of speaker model. So we can make the representative speaker model and threshold under the specified confidence level and coverage. Using proposed algorithm with the number of samples is 40, the experiments show that the false rejection rate is $3.0\%$ and the false acceptance rate $4.3\%$, worth comparing to conventional method's results, $5.4\%\;and\;5.5\%$, respectively. Next step of research will be on the suitable adaptation methods to overcome speech variation problems due to aging effect and operating environments.

  • PDF

Performance Enhancement of Speaker Identification System Based on GMM Using the Modified EM Algorithm (수정된 EM알고리즘을 이용한 GMM 화자식별 시스템의 성능향상)

  • Kim, Seong-Jong;Chung, Ik-Joo
    • Speech Sciences
    • /
    • v.12 no.4
    • /
    • pp.31-42
    • /
    • 2005
  • Recently, Gaussian Mixture Model (GMM), a special form of CHMM, has been applied to speaker identification and it has proved that performance of GMM is better than CHMM. Therefore, in this paper the speaker models based on GMM and a new GMM using the modified EM algorithm are introduced and evaluated for text-independent speaker identification. Various experiments were performed to evaluate identification performance of two algorithms. As a result of the experiments, the GMM speaker model attained 94.6% identification accuracy using 40 seconds of training data and 32 mixtures and 97.8% accuracy using 80 seconds of training data and 64 mixtures. On the other hand, the new GMM speaker model achieved 95.0% identification accuracy using 40 seconds of training data and 32 mixtures and 98.2% accuracy using 80 seconds of training data and 64 mixtures. It shows that the new GMM speaker identification performance is better than the GMM speaker identification performance.

  • PDF

Speaker Identification Using Augmented PCA in Unknown Environments (부가 주성분분석을 이용한 미지의 환경에서의 화자식별)

  • Yu, Ha-Jin
    • MALSORI
    • /
    • no.54
    • /
    • pp.73-83
    • /
    • 2005
  • The goal of our research is to build a text-independent speaker identification system that can be used in any condition without any additional adaptation process. The performance of speaker recognition systems can be severely degraded in some unknown mismatched microphone and noise conditions. In this paper, we show that PCA(principal component analysis) can improve the performance in the situation. We also propose an augmented PCA process, which augments class discriminative information to the original feature vectors before PCA transformation and selects the best direction for each pair of highly confusable speakers. The proposed method reduced the relative recognition error by 21%.

  • PDF

Context-Independent Speaker Recognition in URC Environment (지능형 서비스 로봇을 위한 문맥독립 화자인식 시스템)

  • Ji, Mi-Kyong;Kim, Sung-Tak;Kim, Hoi-Rin
    • The Journal of Korea Robotics Society
    • /
    • v.1 no.2
    • /
    • pp.158-162
    • /
    • 2006
  • This paper presents a speaker recognition system intended for use in human-robot interaction. The proposed speaker recognition system can achieve significantly high performance in the Ubiquitous Robot Companion (URC) environment. The URC concept is a scenario in which a robot is connected to a server through a broadband connection allowing functions to be performed on the server side, thereby minimizing the stand-alone function significantly and reducing the robot client cost. Instead of giving a robot (client) on-board cognitive capabilities, the sensing and processing work are outsourced to a central computer (server) connected to the high-speed Internet, with only the moving capability provided by the robot. Our aim is to enhance human-robot interaction by increasing the performance of speaker recognition with multiple microphones on the robot side in adverse distant-talking environments. Our speaker recognizer provides the URC project with a basic interface for human-robot interaction.

  • PDF

Speaker Identification in Small Training Data Environment using MLLR Adaptation Method (MLLR 화자적응 기법을 이용한 적은 학습자료 환경의 화자식별)

  • Kim, Se-hyun;Oh, Yung-Hwan
    • Proceedings of the KSPS conference
    • /
    • 2005.11a
    • /
    • pp.159-162
    • /
    • 2005
  • Identification is the process automatically identify who is speaking on the basis of information obtained from speech waves. In training phase, each speaker models are trained using each speaker's speech data. GMMs (Gaussian Mixture Models), which have been successfully applied to speaker modeling in text-independent speaker identification, are not efficient in insufficient training data environment. This paper proposes speaker modeling method using MLLR (Maximum Likelihood Linear Regression) method which is used for speaker adaptation in speech recognition. We make SD-like model using MLLR adaptation method instead of speaker dependent model (SD). Proposed system outperforms the GMMs in small training data environment.

  • PDF

Implementation of Speaker Independent Speech Recognition System Using Independent Component Analysis based on DSP (독립성분분석을 이용한 DSP 기반의 화자 독립 음성 인식 시스템의 구현)

  • 김창근;박진영;박정원;이광석;허강인
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.8 no.2
    • /
    • pp.359-364
    • /
    • 2004
  • In this paper, we implemented real-time speaker undependent speech recognizer that is robust in noise environment using DSP(Digital Signal Processor). Implemented system is composed of TMS320C32 that is floating-point DSP of Texas Instrument Inc. and CODEC for real-time speech input. Speech feature parameter of the speech recognizer used robust feature parameter in noise environment that is transformed feature space of MFCC(met frequency cepstral coefficient) using ICA(Independent Component Analysis) on behalf of MFCC. In recognition result in noise environment, we hew that recognition performance of ICA feature parameter is superior than that of MFCC.