• Title/Abstract/Keywords: Phonetic Distance

42 search results (processing time: 0.022 seconds)

화자인식을 위한 강인한 끝점 검출 알고리즘 (Robust Endpoint Detection Algorithm For Speaker Verification)

  • 정대성;김정곤;김형순
    • 대한음성학회:학술대회논문집 / 대한음성학회 May 2003 Conference Proceedings / pp.137-140 / 2003
  • In this paper, we propose a robust endpoint detection algorithm for speaker verification. The proposed algorithm uses energy and cepstral distance parameters, and it replaces the detected endpoints with the endpoints of voiced speech when the estimated signal-to-noise ratio (SNR) is low. Experimental results show that the proposed algorithm is superior to an energy-based endpoint detection algorithm.

음성/음악 판별을 위한 특징 파라미터와 분류기의 성능비교 (Performance Comparison of Feature Parameters and Classifiers for Speech/Music Discrimination)

  • 김형순;김수미
    • 대한음성학회지:말소리 / No. 46 / pp.37-50 / 2003
  • In this paper, we evaluate and compare the performance of speech/music discrimination based on various feature parameters and classifiers. As feature parameters, we consider High Zero Crossing Rate Ratio (HZCRR), Low Short Time Energy Ratio (LSTER), Spectral Flux (SF), Line Spectral Pair (LSP) distance, entropy, and dynamism. We also examine three classifiers: k Nearest Neighbor (k-NN), Gaussian Mixture Model (GMM), and Hidden Markov Model (HMM). According to our experiments, the LSP distance and the phoneme-recognizer-based feature set (entropy and dynamism) show good performance, while performance differences between classifiers are not significant. When all six feature parameters are employed, an average speech/music discrimination accuracy of up to 96.6% is achieved.

코퍼스 기반 음성합성기를 위한 합성단위 경계 스펙트럼 평탄화 알고리즘 (A Spectral Smoothing Algorithm for Unit Concatenating Speech Synthesis)

  • 김상진;장경애;한민수
    • 대한음성학회지:말소리 / No. 56 / pp.225-235 / 2005
  • Speech unit concatenation with a large database is presently the most popular method for speech synthesis. In this approach, mismatches at the unit boundaries are unavoidable and are one of the causes of quality degradation. This paper proposes an algorithm to reduce undesired discontinuities between consecutive units. Optimal matching points are calculated in two steps. First, the Kullback-Leibler distance is used for spectral matching; then unit sliding and overlap windowing are used for waveform matching. The proposed algorithm is implemented in a corpus-based, unit-concatenating Korean text-to-speech system that has an automatically labeled database. Experimental results show that our algorithm performs better than raw concatenation or the overlap smoothing method.
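
The spectral matching step can be illustrated with a minimal sketch. It assumes each frame is a list of positive spectral magnitudes and uses the symmetric form of the Kullback-Leibler distance; the abstract does not say which form the authors use, and the function names `normalize`, `symmetric_kl`, and `best_join` are hypothetical.

```python
import math


def normalize(spectrum, eps=1e-12):
    """Scale a magnitude spectrum so it sums to 1 (assumes a positive total)."""
    total = sum(spectrum)
    # Floor each bin at eps so the logarithm below is always defined.
    return [max(x / total, eps) for x in spectrum]


def symmetric_kl(p_spec, q_spec):
    """Symmetric KL distance D(p||q) + D(q||p) between two spectra."""
    p, q = normalize(p_spec), normalize(q_spec)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def best_join(left_frames, right_frames):
    """Return the (i, j) frame pair with the smallest spectral distance.

    left_frames: candidate frames near the end of the first unit.
    right_frames: candidate frames near the start of the second unit.
    """
    pairs = ((i, j)
             for i in range(len(left_frames))
             for j in range(len(right_frames)))
    return min(pairs,
               key=lambda ij: symmetric_kl(left_frames[ij[0]],
                                           right_frames[ij[1]]))
```

The sketch covers only the choice of a spectrally matched join point; the unit sliding and overlap windowing used for waveform matching are not shown.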

휴대용 단말기에서 음원 위치 추적 기술 비교 연구 (A Comparative Study of Sound Source Localization Algorithms for Portable Devices)

  • 정재연;육동석
    • 대한음성학회:학술대회논문집 / 대한음성학회 Spring 2006 Conference Proceedings / pp.49-52 / 2006
  • The performance of a sound source localization system degrades severely in reverberant and noisy environments. In addition, the restriction on the distance between microphones imposed by portable devices also lowers system performance. This paper compares sound source localization algorithms based on time delay of arrival that are robust to reverberation and noise, taking the microphone spacing into account. In addition, a post-filter that outputs the most frequently occurring time delay is adopted to increase accuracy.
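
As a rough illustration of delay-based localization with a "maximum count" post-filter, the sketch below estimates the inter-microphone delay per frame and then keeps the most frequent estimate. It uses plain time-domain cross-correlation, which is a simpler stand-in and not one of the reverberation-robust algorithms the paper compares; `estimate_delay` and `most_frequent_delay` are hypothetical names.

```python
from collections import Counter


def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) by which y lags x.

    The lag is chosen to maximise the cross-correlation of the two
    microphone signals over the range [-max_lag, max_lag].
    """
    def corr(lag):
        return sum(x[n] * y[n + lag]
                   for n in range(len(x))
                   if 0 <= n + lag < len(y))

    return max(range(-max_lag, max_lag + 1), key=corr)


def most_frequent_delay(delays):
    """Post-filter: keep the delay value that occurs most often."""
    return Counter(delays).most_common(1)[0][0]
```

With closely spaced microphones, `max_lag` is small, so only a handful of delay values are possible and a vote over frames is a natural way to suppress outliers.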

모노폰 거리를 이용한 트라이폰 클러스터링 방법 연구 (Efficient Triphone Clustering Using Monophone Distance)

  • 방규섭;육동석
    • 대한음성학회:학술대회논문집 / 대한음성학회 Spring 2006 Conference Proceedings / pp.41-44 / 2006
  • The purpose of state tying is to reduce the number of models and to obtain relatively reliable output probability distributions. There are two approaches: top-down clustering and bottom-up clustering. For seen data, the bottom-up approach performs better than the top-down approach. In this paper, we propose a new clustering technique that improves clustering performance for undertrained triphones. The basic idea is to tie unreliable triphones before clustering. An unreliable triphone is one that appears in the training data too infrequently for its model to be trained accurately. We propose using monophone distance to preprocess these unreliable triphones. A pilot experiment shows that the proposed method reduces the error rate significantly.

화자적응과 군집화를 이용한 화자식별 시스템의 성능 및 속도 향상 (Adaptation and Clustering Method for Speaker Identification with Small Training Data)

  • 김세현;오영환
    • 대한음성학회지:말소리 / No. 58 / pp.83-99 / 2006
  • One key factor that hinders the widespread deployment of speaker identification technologies is the need for long enrollment utterances to guarantee a low error rate during identification. To gain user acceptance, adaptation algorithms that can enroll speakers with short utterances are essential. To this end, this paper applies MLLR speaker adaptation to speaker enrollment and compares its performance against other speaker modeling techniques: GMMs and HMMs. In addition, to speed up identification, we apply a speaker clustering method that uses principal component analysis (PCA) and a weighted Euclidean distance as the distance measure. Experimental results show that the MLLR-adapted modeling method is most effective for short enrollment utterances and that GMMs perform better when long utterances are available.
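
The distance measure used for clustering can be sketched as follows. The example assumes speakers are already represented as low-dimensional vectors (for instance after a PCA projection, which is not shown) and that per-dimension weights are given; how the authors choose the weights is not stated in the abstract, and the function names are hypothetical.

```python
import math


def weighted_euclidean(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (x_i - y_i)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2
                         for a, b, w in zip(x, y, weights)))


def nearest_cluster(x, centroids, weights):
    """Index of the centroid closest to x under the weighted distance."""
    return min(range(len(centroids)),
               key=lambda k: weighted_euclidean(x, centroids[k], weights))
```

Restricting the detailed scoring to speakers in the nearest cluster is one way such a distance could cut identification time, since most enrolled speakers are never scored.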

대구 지역어의 세대 간 단모음 포먼트 비교 연구 - 어두 모음을 대상으로 - (A Comparative Study on the Vowel Formants between Generations in Daegu dialect - In the case of word-initial vowels -)

  • 장혜진;신지영
    • 대한음성학회:학술대회논문집 / 대한음성학회 Autumn 2005 Conference Proceedings / pp.97-100 / 2005
  • The aim of the present study is to compare vowel formants between generations in the Daegu dialect. Twenty Daegu dialect speakers participated in this study; 10 were in their 40s and the other 10 were in their 20s. The results show that the distances between /ㅣ/ and /ㅐ/ and between /ㅡ/ and /ㅓ/ are greater for speakers in their 20s than for those in their 40s, while the distance between /ㅗ/ and /ㅜ/ is smaller for speakers in their 20s. It seems reasonable to conclude that vowels in the Daegu dialect are changing so that each occupies its own stable space, with the exception of /ㅗ/ and /ㅜ/.

한국인 구치열에서 치간유두 존재와 치아접촉점과 치간골 거리와의 관계 (Relationship between Interdental Papilla Existence & Distance from Interdental Alveolar Crest to Contact Point in the Posterior Dentition of Korean adults)

  • 김현철;전용선;장문택;김형섭;박정미
    • Journal of Periodontal and Implant Science / Vol. 31, No. 3 / pp.625-631 / 2001
  • The anatomic structure around the interproximal area plays an important role not only in natural teeth but also in implants. Loss of the papilla can lead to cosmetic deformity, phonetic problems, and food impaction in the anterior dentition, and to masticatory problems, food impaction, and proximal caries in the posterior dentition. The purpose of this study was to evaluate the relationship between the existence of the interdental papilla and the distance from the contact point to the alveolar crest in the Korean posterior dentition. Forty-five Korean adult patients (31 males, 14 females) participated in this study. Measurements were carried out on a total of 126 interproximal areas: 18 first premolar, 31 second premolar, 40 first molar, and 37 second molar areas. The papilla index was recorded as suggested by Jemt. The distance between the contact point and the alveolar crest was measured with a Florida Probe® after flap elevation. Each distance was measured 10 times in 0.1 mm units. The results showed that the mean papilla index was 1.37 and the mean distance between the contact point and the alveolar crest was 7.44 mm. The papilla index and the distance showed a high negative correlation (Pearson correlation = -0.47), which was statistically significant (P=0.000). When the distance between the contact point and the alveolar crest was 5 mm, papilla loss appeared in almost half of the cases. When the distance was 6 mm, papilla loss was present in 95% of cases, and when it was 7 mm, in 100%.

입말 표기를 이용한 영어 단어 검색 (Retrieving English Words with a Spoken Word Transliteration)

  • 김지승;김광현;이준호
    • 한국문헌정보학회지 / Vol. 39, No. 3 / pp.93-103 / 2005
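  • Users of English dictionary search services sometimes cannot remember the exact spelling of the English word they want and remember only its pronunciation. To help such users, this study proposes a method for effectively retrieving English words using a spoken word transliteration, that is, the Hangul transcription of an English word's pronunciation. To this end, we develop the KONIX code and convert both spoken word transliterations and English words into KONIX codes. The phonetic similarity between the converted KONIX codes is then computed using the edit distance method and the 2-gram method. Experiments demonstrate that the proposed method is very effective for retrieving English words from spoken word transliterations.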
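
The two similarity measures named in this abstract, edit distance and 2-grams, can be sketched on plain strings. The KONIX encoding itself is not described here, so the inputs below are ordinary character strings standing in for codes; scoring 2-gram overlap with the Dice coefficient is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def bigrams(s: str) -> list:
    """All overlapping 2-character substrings of s."""
    return [s[i:i + 2] for i in range(len(s) - 1)]


def bigram_similarity(a: str, b: str) -> float:
    """Dice coefficient over 2-gram multisets, in the range [0, 1]."""
    ca, cb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:  # both strings shorter than 2 characters
        return 1.0 if a == b else 0.0
    overlap = sum((ca & cb).values())
    return 2 * overlap / total
```

Candidate dictionary entries could then be ranked by ascending `edit_distance` or descending `bigram_similarity` against the code of the user's query.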

다양한 신뢰도 척도를 이용한 SVM 기반 발화검증 연구 (SVM-based Utterance Verification Using Various Confidence Measures)

  • 권석봉;김회린;강점자;구명완;류창선
    • 대한음성학회지:말소리 / No. 60 / pp.165-180 / 2006
  • In this paper, we present several confidence measures (CMs) for evaluating the reliability of speech recognition results. We propose heuristic CMs, such as the mean log-likelihood score, the N-best word log-likelihood ratio, and likelihood sequence fluctuation, as well as likelihood ratio testing (LRT)-based CMs using several types of anti-models. Furthermore, we propose new algorithms that add weighting terms to phone-level log-likelihood ratios when merging them into word-level log-likelihood ratios. These weighting terms are computed from the distance between acoustic models and from knowledge-based phoneme classifications. LRT-based CMs perform far better than heuristic CMs, and LRT-based CMs using phonetic information achieve a relative reduction in equal error rate of 8 to 13% compared with the baseline LRT-based CMs. We use a support vector machine to fuse several CMs and improve utterance verification performance. Our experiments show that selecting CMs with low correlation is more effective than selecting CMs with high correlation.
