• Title/Summary/Keyword: 화자 확인

Search Result 245, Processing Time 0.025 seconds

Performance Improvement of Speaker Recognition System Using Genetic Algorithm (유전자 알고리즘을 이용한 화자인식 시스템 성능 향상)

  • 문인섭;김종교
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.8
    • /
    • pp.63-67
    • /
    • 2000
  • This paper deals with text-prompt speaker recognition based on dynamic time warping (DTW). The Genetic Algorithm was applied to the creation of reference patterns for suitable reflection of the speaker characteristics, one of the most important determinants in the fields of speaker recognition. In order to overcome the weakness of text-dependent and text-independent speaker recognition, the text-prompt type was suggested. Performed speaker identification and verification in close and open set respectively, hence the Genetic algorithm-based reference patterns had been proven to have better performance in both recognition rate and speed than that of conventional reference patterns.

  • PDF

On Codebook Fesign to Improve Speaker Adaptation (화자 적응 성능 향상을 위한 코드북 설계)

  • 양태영
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1995.06a
    • /
    • pp.228-231
    • /
    • 1995
  • 반연속 HMM 음성인식 시스템의 화자 적응 성능 향상을 위해 코드북 변환 알고리즘을 제안하였다. 기존의 화자 적응 알고리즘으로는 새로운 화자의 적응 데이터 특징의 분포와 HMM 모수의 사전밀도를 함께 고려하는 베이시안 화자적응 알고리즘이 있다. 그러나 새로운 화자의 특징분포와 코드북 사전 밀도의 차이가 큰 경우 적응 데이터와 코드북간의 잘못된 대응 관계를 얻을 수 있으며, 기준 코드북에 필요 이상으로 많은 코드워드가 존재하는 경우 적응된 코드북에도 불필요한 코드워드 들이 남아 인식 과정에 혼란을 줄 수 있다. 이 문제점을 해결하기 위하여 제안된 코드북 변환 알고리즘에서는 주파수 영역의 포만트 정보를 이용하였다. 화자 적응을 수행하기 앞서 코드북의 켑스트럼으로부터 포만트를 추출해 내고, 이들의 분포를 적응 화자의 포만트 분포와 일치되도록 변환시켜 주었다. 이 변환된 포만트들로부터 다시 켑스트럼을 구하여 변환된 코드북을 얻고 이를 화자 적응의 초기 코드북으로 사용하였다. 제안된 알고리즘을 이용하였을 경우 코드북과 적응 화자의 음성 간의 정확한 대응관계를 찾을 수 있었고, 불필요한 코드워드들이 인식 과정에서 사용되지 않도록 변환되어 인식률이 향상되는 것을 실험을 통해 확인하였다.

  • PDF

An Enhanced Text-Prompt Speaker Recognition Using DTW (DTW를 이용한 향상된 문맥 제시형 화자인식)

  • 신유식;서광석;김종교
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.1
    • /
    • pp.86-91
    • /
    • 1999
  • This paper presents the text-prompt method to overcome the weakness of text-dependent and text-independent speaker recognition. Enhanced dynamic time warping for speaker recognition algorithm is applied. For the real-time processing, we use a simple algorithm for end-point detection without increasing computational complexity. The test shows that the weighted-cepstrum is most proper for speaker recognition among various speech parameters. As the experimental results of the proposed algorithm for three prompt words, the speaker identification error rate is 0.02%, and when the threshold is set properly, false rejection rate is 1.89%, false acceptance rate is 0.77% and verification total error rate is 0.97% for speaker verification.

  • PDF

Speaker Identification Using Dynamic Time Warping Algorithm (동적 시간 신축 알고리즘을 이용한 화자 식별)

  • Jeong, Seung-Do
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.5
    • /
    • pp.2402-2409
    • /
    • 2011
  • The voice has distinguishable acoustic properties of speaker as well as transmitting information. The speaker recognition is the method to figures out who speaks the words through acoustic differences between speakers. The speaker recognition is roughly divided two kinds of categories: speaker verification and identification. The speaker verification is the method which verifies speaker himself based on only one's voice. Otherwise, the speaker identification is the method to find speaker by searching most similar model in the database previously consisted of multiple subordinate sentences. This paper composes feature vector from extracting MFCC coefficients and uses the dynamic time warping algorithm to compare the similarity between features. In order to describe common characteristic based on phonological features of spoken words, two subordinate sentences for each speaker are used as the training data. Thus, it is possible to identify the speaker who didn't say the same word which is previously stored in the database.

A PCA-based MFDWC Feature Parameter for Speaker Verification System (화자 검증 시스템을 위한 PCA 기반 MFDWC 특징 파라미터)

  • Hahm Seong-Jun;Jung Ho-Youl;Chung Hyun-Yeol
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.1
    • /
    • pp.36-42
    • /
    • 2006
  • A Principal component analysis (PCA)-based Mel-Frequency Discrete Wavelet Coefficients (MFDWC) feature Parameters for speaker verification system is Presented in this Paper In this method, we used the 1st-eigenvector obtained from PCA to calculate the energy of each node of level that was approximated by. met-scale. This eigenvector satisfies the constraint of general weighting function that the squared sum of each component of weighting function is unity and is considered to represent speaker's characteristic closely because the 1st-eigenvector of each speaker is fairly different from the others. For verification. we used Universal Background Model (UBM) approach that compares claimed speaker s model with UBM on frame-level. We performed experiments to test the effectiveness of PCA-based parameter and found that our Proposed Parameters could obtain improved average Performance of $0.80\%$compared to MFCC. $5.14\%$ to LPCC and 6.69 to existing MFDWC.

Modified Weighting Model Rank Method for Improving the Performance of Real-Time Text-Independent Speaker Recognition System (실시간 문맥독립 화자인식 시스템의 성능향상을 위한 수정된 가중모델순위 결정방법)

  • Kim Min-Joung;Oh Se-Jin;Suk Su-Young;Chung Ho-Youl;Chung Hyun-Yeol
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.107-110
    • /
    • 2002
  • 현재까지 개발된 화자식별 시스템 중 가중모델순위(Weighting Model Rank; WMR)방법을 이용한 화자인식 시스템이 비교적 높은 인식성능을 나타내고 있다. WMR 방법은 각 화자에 대한 프레임 유사도의 순위에 따라 지수함수 가중치로 대치시키는 방법을 사용하고 있으나, 이 방법은 유사도 본래의 변별력이 전체 계산에서 고려되지 않는 문제가 있었다. 이를 해결하기 위해 본 논문에서는 각 화자의 프레임 유사도와 지수함수를 이용한 가중치를 곱한 값을 이용하여 전체 스코어를 계산하도록 하는 수정된 가중모델 순위방법(Modified Weighting Model Rank; MWMR)을 제안한다. 제안한 방법의 유효성을 확인하기 위하여 316명의 화자를 대상으로 하여 인식실험을 실시한 결과, 학습 프레임이 10,000일 경우, MWMR 방법에서 $98.1\%$의 화자 인식률을 얻어 WMR 방법에 비해 약 $2.0\%$의 향상된 인식결과를 보여 제안한 방법의 유효성을 확인할 수 있었다.

  • PDF

Performance Enhancement for Speaker Verification Using Incremental Robust Adaptation in GMM (가무시안 혼합모델에서 점진적 강인적응을 통한 화자확인 성능개선)

  • Kim, Eun-Young;Seo, Chang-Woo;Lim, Yong-Hwan;Jeon, Seong-Chae
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.3
    • /
    • pp.268-272
    • /
    • 2009
  • In this paper, we propose a Gaussian Mixture Model (GMM) based incremental robust adaptation with a forgetting factor for the speaker verification. Speaker recognition system uses a speaker model adaptation method with small amounts of data in order to obtain a good performance. However, a conventional adaptation method has vulnerable to the outlier from the irregular utterance variations and the presence noise, which results in inaccurate speaker model. As time goes by, a rate in which new data are adapted to a model is reduced. The proposed algorithm uses an incremental robust adaptation in order to reduce effect of outlier and use forgetting factor in order to maintain adaptive rate of new data on GMM based speaker model. The incremental robust adaptation uses a method which registers small amount of data in a speaker recognition model and adapts a model to new data to be tested. Experimental results from the data set gathered over seven months show that the proposed algorithm is robust against outliers and maintains adaptive rate of new data.

A Study on Developing Speaker Recognition System In Driving Car Environment (자동차 주행 환경에서의 화자인식 시스템 개발에 관한 연구)

  • Yang, Joon-Young;Chang, Joon-Hyuk;Lee, Chang Won;Park, Ki-Hee
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2017.11a
    • /
    • pp.934-936
    • /
    • 2017
  • 화자인식 기술은 등록된 화자 목록 내 화자 또는 사칭 화자의 발화로부터 발화자를 식별하는 기술로써, 음성 소스를 기반으로 동작하는 디바이스의 개인화를 위해 필요한 기술이다. 본 논문에서는 차량 잡음이 존재하는 자동차 주행 환경을 타겟으로 하는 화자인식 시스템 개발 방법을 제안한다. 차량 잡음에 의해 오염된 음성신호로부터 잡음 성분을 제거하기 위해 parametric multi-channel Wiener filter (PWMF)를 이용하여 실험한 결과, 남성화자 조건에서는 PMWF의 내부 파라미터 조절을 통해 필터를 minimum variance distortionless response (MVDR) 빔포머로 동작하도록 설정하였을 때, 여성화자 조건에서는 잡음을 제거하지 않았을 때 가장 낮은 동일오류율을 보임을 확인할 수 있었다.

Speaker Verification Performance Improvement Using Weighted Residual Cepstrum (가중된 예측 오차 파라미터를 사용한 화자 확인 성능 개선)

  • 위진우;강철호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.5
    • /
    • pp.48-53
    • /
    • 2001
  • In speaker verification based on LPC analysis the prediction residues are ignored and LPCC(LPC cepstrum) are only used to compose feature vectors. In this study, LPCC and RCEP (residual cepstrum) extracted from residues are used as feature parameters in the various environmental speaker verification. We propose the weighting function which can enlarge inter-speaker variation by weighting pitch, speaker inherent vector, included in residual cepstrum. Simulation results show that the average speaker verification rate is improved in the rate of 6% with RCEP and LPCC at the same time and is improved in the rate of 2.45% with the proposed weighted RCEP and LPCC at the same time compared with no weighting.

  • PDF

A Blind Segmentation Algorithm for Speaker Verification System (화자확인 시스템을 위한 분절 알고리즘)

  • 김지운;김유진;민홍기;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.3
    • /
    • pp.45-50
    • /
    • 2000
  • This paper proposes a delta energy method based on Parameter Filtering(PF), which is a speech segmentation algorithm for text dependent speaker verification system over telephone line. Our parametric filter bank adopts a variable bandwidth along with a fixed center frequency. Comparing with other methods, the proposed method turns out very robust to channel noise and background noise. Using this method, we segment an utterance into consecutive subword units, and make models using each subword nit. In terms of EER, the speaker verification system based on whole word model represents 6.1%, whereas the speaker verification system based on subword model represents 4.0%, improving about 2% in EER.

  • PDF