• Title/Summary/Keyword: blind speech segmentation

Search Result 6, Processing Time 0.02 seconds

Performance improvement of text-dependent speaker verification system using blind speech segmentation and energy weight (Blind speech segmentation과 에너지 가중치를 이용한 문장 종속형 화자인식기의 성능 향상)

  • Kim Jung-Gon;Kim Hyung Soon
    • MALSORI
    • /
    • no.47
    • /
    • pp.131-140
    • /
    • 2003
  • We propose a new method of generating client models for HMM based text-dependent speaker verification system with only a small amount of training data. To make a client model, statistical methods such as segmental K-means algorithm are widely used, but they do not guarantee the quality or reliability of a model when only limited data are avaliable. In this paper, we propose a blind speech segmentation based on level building DTW algorithm as an alternative method to make a client model with limited data. In addition, considering the fact that voiced sounds have much more speaker-specific information than unvoiced sounds and energy of the former is higher than that of the latter, we also propose a new score evaluation method using the observation probability raised to the power of weighting factor estimated from the normalized log energy. Our experiment shows that the proposed methods are superior to conventional HMM based speaker verification system.

  • PDF

Utilization of Syllabic Nuclei Location in Korean Speech Segmentation into Phonemic Units (음절핵의 위치정보를 이용한 우리말의 음소경계 추출)

  • 신옥근
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.5
    • /
    • pp.13-19
    • /
    • 2000
  • The blind segmentation method, which segments input speech data into recognition unit without any prior knowledge, plays an important role in continuous speech recognition system and corpus generation. As no prior knowledge is required, this method is rather simple to implement, but in general, it suffers from bad performance when compared to the knowledge-based segmentation method. In this paper, we introduce a method to improve the performance of a blind segmentation of Korean continuous speech by postprocessing the segment boundaries obtained from the blind segmentation. In the preprocessing stage, the candidate boundaries are extracted by a clustering technique based on the GLR(generalized likelihood ratio) distance measure. In the postprocessing stage, the final phoneme boundaries are selected from the candidates by utilizing a simple a priori knowledge on the syllabic structure of Korean, i.e., the maximum number of phonemes between any consecutive nuclei is limited. The experimental result was rather promising : the proposed method yields 25% reduction of insertion error rate compared that of the blind segmentation alone.

  • PDF

Segmentation of continuous Korean Speech Based on Boundaries of Voiced and Unvoiced Sounds (유성음과 무성음의 경계를 이용한 연속 음성의 세그먼테이션)

  • Yu, Gang-Ju;Sin, Uk-Geun
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.7
    • /
    • pp.2246-2253
    • /
    • 2000
  • In this paper, we show that one can enhance the performance of blind segmentation of phoneme boundaries by adopting the knowledge of Korean syllabic structure and the regions of voiced/unvoiced sounds. eh proposed method consists of three processes : the process to extract candidate phoneme boundaries, the process to detect boundaries of voiced/unvoiced sounds, and the process to select final phoneme boundaries. The candidate phoneme boudaries are extracted by clustering method based on similarity between two adjacent clusters. The employed similarity measure in this a process is the ratio of the probability density of adjacent clusters. To detect he boundaries of voiced/unvoiced sounds, we first compute the power density spectrum of speech signal in 0∼400 Hz frequency band. Then the points where this paper density spectrum variation is greater than the threshold are chosen as the boundaries of voiced/unvoiced sounds. The final phoneme boundaries consist of all the candidate phoneme boundaries in voiced region and limited number of candidate phoneme boundaries in unvoiced region. The experimental result showed about 40% decrease of insertion rate compared to the blind segmentation method we adopted.

  • PDF

Segmentation of Continuous Speech based on PCA of Feature Vectors (주요고유성분분석을 이용한 연속음성의 세그멘테이션)

  • 신옥근
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.2
    • /
    • pp.40-45
    • /
    • 2000
  • In speech corpus generation and speech recognition, it is sometimes needed to segment the input speech data without any prior knowledge. A method to accomplish this kind of segmentation, often called as blind segmentation, or acoustic segmentation, is to find boundaries which minimize the Euclidean distances among the feature vectors of each segments. However, the use of this metric alone is prone to errors because of the fluctuations or variations of the feature vectors within a segment. In this paper, we introduce the principal component analysis method to take the trend of feature vectors into consideration, so that the proposed distance measure be the distance between feature vectors and their projected points on the principal components. The proposed distance measure is applied in the LBDP(level building dynamic programming) algorithm for an experimentation of continuous speech segmentation. The result was rather promising, resulting in 3-6% reduction in deletion rate compared to the pure Euclidean measure.

  • PDF

Text-dependent Speaker Verification System Over Telephone Lines (전화망을 위한 어구 종속 화자 확인 시스템)

  • 김유진;정재호
    • Proceedings of the IEEK Conference
    • /
    • 1999.11a
    • /
    • pp.663-667
    • /
    • 1999
  • In this paper, we review the conventional speaker verification algorithm and present the text-dependent speaker verification system for application over telephone lines and its result of experiments. We apply blind-segmentation algorithm which segments speech into sub-word unit without linguistic information to the speaker verification system for training speaker model effectively with limited enrollment data. And the World-mode] that is created from PBW DB for score normalization is used. The experiments are presented in implemented system using database, which were constructed to simulate field test, and are shown 3.3% EER.

  • PDF

A Blind Segmentation Algorithm for Speaker Verification System (화자확인 시스템을 위한 분절 알고리즘)

  • 김지운;김유진;민홍기;정재호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.3
    • /
    • pp.45-50
    • /
    • 2000
  • This paper proposes a delta energy method based on Parameter Filtering(PF), which is a speech segmentation algorithm for text dependent speaker verification system over telephone line. Our parametric filter bank adopts a variable bandwidth along with a fixed center frequency. Comparing with other methods, the proposed method turns out very robust to channel noise and background noise. Using this method, we segment an utterance into consecutive subword units, and make models using each subword nit. In terms of EER, the speaker verification system based on whole word model represents 6.1%, whereas the speaker verification system based on subword model represents 4.0%, improving about 2% in EER.

  • PDF