• Title/Summary/Keyword: Inter-speaker variation

Speaker-specific Implementation of VOT Values in Korean

  • Han, Jeong-Im;Kim, Joo-Yeon
    • Speech Sciences / v.15 no.4 / pp.7-18 / 2008
  • The purpose of the present study is to test whether VOT values of the Korean plain stops in intervocalic position are encoded differently by individual speakers. In Scobbie (2006), the VOT values for the /p/-/b/ voicing contrast in Shetland Isles English were found to show a high degree of inter-speaker variation. More importantly, this variation was not arbitrary: first, there was an inverse relationship between the amount of prevoicing for /b/ and the duration of aspiration for /p/. Second, the inter-speaker variation was shown to be similar between the subjects and their parents. These results suggest that the phonetic targets for VOT are specified in fine detail by speakers. The present study explores this issue further by testing (1) whether the likelihood and the amount of voicing for the intervocalic plain stops in Korean show inter-speaker variation, and (2) whether the likelihood and the exact amount of voicing for the intervocalic plain stops in Korean are closely related to the amount of aspiration for the Korean intervocalic aspirated stops. The results suggest that the voicing of intervocalic plain stops in Korean varies across individual speakers, but that it does not seem to be directly related to the amount of aspiration of the aspirated stops in the same phonological position.

A Study on the Mixed Model Approach and Symbol Probability Weighting Function for Maximization of Inter-Speaker Variation (화자간 변별력 최대화를 위한 혼합 모델 방식과 심볼 확률 가중함수에 관한 연구)

  • Chin Se-Hoon;Kang Chul-Ho
    • The Journal of the Acoustical Society of Korea / v.24 no.7 / pp.410-415 / 2005
  • Most recent speaker verification systems are based on a pattern recognition approach, and the performance of the pattern classifier depends on how well it classifies the feature parameters of different speakers. To classify feature parameters efficiently and effectively, it is important to enlarge the variation between speakers and to measure distances between feature parameters effectively. This paper therefore proposes a mixed model scheme that enlarges inter-speaker variation by searching the individual model and the world model at the same time. During the decision procedure, inter-speaker variation can be maximized using the proposed mixed model scheme. The system also uses a symbol probability weighting function to reduce vector quantization errors by measuring the symbol probability derived from the distance ratio between the world codebook and the individual codebook. In experiments using this method, the Detection Cost Function (DCF) of the system was roughly halved, from 2.37% to 1.16%.
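
As a quick check on the reported improvement, the relative reduction implied by the two DCF values in the abstract can be computed directly. This is only an arithmetic sketch; the function name `relative_reduction` is an illustrative choice and is not taken from the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before


# DCF values as reported in the abstract (in percent).
dcf_before = 2.37
dcf_after = 1.16

print(f"{relative_reduction(dcf_before, dcf_after):.1%}")  # prints 51.1%
```

A reduction of about 51% is consistent with the abstract's statement that the DCF was halved.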

An Amplitude Warping Approach to Intra-Speaker Normalization for Speech Recognition (음성인식에서 화자 내 정규화를 위한 진폭 변경 방법)

  • Kim Dong-Hyun;Hong Kwang-Seok
    • Journal of Internet Computing and Services / v.4 no.3 / pp.9-14 / 2003
  • Vocal tract normalization is a successful method for improving the accuracy of inter-speaker normalization. In this paper, we present an intra-speaker warping factor estimation based on utterances with altered pitch. The feature space distributions of untransformed speech from a single speaker's pitch-altered utterances vary because of acoustic differences in the speech produced by the glottis and vocal tract. The utterances vary in two ways: in frequency and in amplitude. Among inter-speaker normalization methods, vocal tract normalization is a frequency normalization. We therefore have to consider amplitude variation as well, and it may be possible to determine the amplitude warping factor by calculating the inverse ratio of input to reference pitch. In the recognition results, the error rate is reduced by 0.4% to 2.3% for digit and word decoding.
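
The abstract's suggestion that the amplitude warping factor may be obtained as the inverse ratio of input to reference pitch can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, pitch is assumed to be given in Hz, and applying the factor as a plain multiplicative gain on the samples is an assumption, since the abstract does not say how the factor is applied.

```python
def amplitude_warping_factor(input_pitch_hz: float, reference_pitch_hz: float) -> float:
    """Inverse ratio of input to reference pitch: reference / input."""
    if input_pitch_hz <= 0 or reference_pitch_hz <= 0:
        raise ValueError("pitch values must be positive")
    return reference_pitch_hz / input_pitch_hz


def warp_amplitude(frame: list[float], factor: float) -> list[float]:
    """Scale each sample by the factor (assumed multiplicative gain)."""
    return [sample * factor for sample in frame]


# Hypothetical values: an utterance at 200 Hz against a 100 Hz reference.
factor = amplitude_warping_factor(200.0, 100.0)
warped = warp_amplitude([1.0, -2.0], factor)
```

With these made-up inputs the factor is 0.5, so an utterance produced at a higher pitch than the reference is scaled down.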

Speaker Verification System Based on HMM Robust to Noise Environments (잡음환경에 강인한 HMM기반 화자 확인 시스템에 관한 연구)

  • 위진우;강철호
    • The Journal of the Acoustical Society of Korea / v.20 no.7 / pp.69-75 / 2001
  • Intra-speaker variation, noisy environments, and mismatch between training and test conditions are the major reasons why speaker verification systems are difficult to use in practice. In this study, we propose a robust end-point detection algorithm, noise cancelling with a microphone property compensation technique, and an inter-speaker discrimination technique based on cepstrum weighting for a robust speaker verification system. Simulation results show that the average speaker verification rate improves by 17.65% with the proposed end-point detection algorithm using the LPC residue, and by 36.93% with the proposed noise cancelling and microphone property compensation algorithm. The proposed weighting function for discriminating inter-speaker variation also improves the average speaker verification rate by 6.515%.

Speaker Verification Performance Improvement Using Weighted Residual Cepstrum (가중된 예측 오차 파라미터를 사용한 화자 확인 성능 개선)

  • 위진우;강철호
    • The Journal of the Acoustical Society of Korea / v.20 no.5 / pp.48-53 / 2001
  • In speaker verification based on LPC analysis, the prediction residues are ignored and only the LPCC (LPC cepstrum) is used to compose feature vectors. In this study, both LPCC and RCEP (residual cepstrum) extracted from the residues are used as feature parameters for speaker verification in various environments. We propose a weighting function that enlarges inter-speaker variation by weighting pitch, a speaker-inherent vector contained in the residual cepstrum. Simulation results show that the average speaker verification rate improves by 6% when RCEP and LPCC are used together, and by 2.45% when the proposed weighted RCEP is used together with LPCC, compared with no weighting.

Statistical Patterns in Consonant Cluster Simplification in Seoul Korean: Within-dialect Interspeaker and Intraspeaker Variation

  • Cho, Tae-Hong;Kim, Sa-Hyang
    • Phonetics and Speech Sciences / v.1 no.1 / pp.33-40 / 2009
  • This study examines how young speakers of Seoul Korean produce the tri-consonantal clusters /lkt/ and /lpt/, as in palk-ta ('to be bright') and palp-ta ('to step on'). Production data were collected from 20 speakers of Seoul Korean. Narrow transcription of the data showed that simplification is not obligatory, as some speakers often preserve all three consonants. When clusters were simplified, there was a clear asymmetry between /lkt/ and /lpt/. Speakers showed no clear preference for either C1 preservation (C1=/l/) or C2 preservation (C2=/k/ in /lkt/ and /p/ in /lpt/) in the production of /lkt/, but in the production of /lpt/, a strong preference was found for the C1-preserved variant over the C2-preserved variant. Compared with the production data in Cho (1999), simplification patterns appear to have changed over the past 10 years toward preserving the first member of the cluster (/l/) more often, especially with /lkt/. There was no substantial between-item variation, indicating that simplification patterns are not lexically specified. Finally, the results suggest that the process of tri-consonantal simplification has not been fully phonologized in the grammar of the language, as is evident in the substantial inter- and intra-speaker variation.

An acoustic study on the duration of the mora in Japanese (일본어 특수박의 지속시간에 관한 음향음성학적 분석)

  • Kim Seonhi
    • MALSORI / no.38 / pp.113-124 / 1999
  • It is well known that Japanese prosodic structure assumes the mora below the syllable tier. Syllables with V or CV structure are counted as having one mora, whereas those with coda consonants /-pp, -tt, -kk, -ss, -N/ or long vowels are counted as having two moras in Japanese. This study measured the acoustic duration of these special moras ('tokusyuhaku') produced by Tokyo dialect speakers to see if they are isochronous with V or CV. It also examined the production of native speakers of Korean (Seoul/Kyungsang dialect) and Chinese learning Japanese as a second language, to examine how the learners' first language influences their second language. Finally, it examined how speakers of the Akita dialect, which is known as a syllabeme dialect in Japanese, produced them. The results showed that intra-speaker variation as well as inter-speaker variation was observed in the production by Akita dialect speakers. The production of native speakers of Chinese and of the Kyungsang dialect of Korean, which have a vowel length contrast in their phonological systems, showed a result similar to that of Tokyo dialect speakers, which implies an influence of the learners' first language on the acquisition of the second language.

Tonal development and voice quality in the stops of Seoul Korean

  • Yu, Hye Jeong
    • Phonetics and Speech Sciences / v.10 no.4 / pp.91-99 / 2018
  • Korean stops are currently undergoing a tonogenetic sound change, as found in the Seoul dialect, in which a merged VOT of aspirated and lax stops makes F0 the primary cue for distinguishing the two stops, with the lax stops having lower F0 than the aspirated stops. In tonal languages, low tone is produced with a breathy voice. This study investigated whether there are changes in voice quality associated with the tonogenetic sound change of Korean stops. Two age groups speaking the Seoul dialect participated in this study: five females and six males born in the 1940s and 1950s, and nine females and eight males born in the 1980s and 1990s. This study replicated previous findings on VOT and F0 and further examined H1-H2, H1-A1, and H1-A2 to see how they correlate with the sound change. In both the older and younger generations, H1-H2, H1-A1, and H1-A2 were significantly lower after the tense stops than after the aspirated and lax stops, but they did not differ significantly between the aspirated and lax stops. However, the younger females showed somewhat different results for H1-H2 and H1-A2 than the older generation. In the younger females, the H1-H2 mean was higher after the aspirated stops than after the lax stops at the vowel onset, and the H1-H2 difference increased at the vowel midpoint. Although there was inter-speaker variation in the results for H1-H2 and H1-A1, analyses of individual speakers showed that H1-H2 and H1-A1 were higher after the lax stops than after the aspirated stops in the younger female speakers. These results indicate that lax stops tend to be breathier than aspirated stops in the younger female speakers. They also indicate that changes in voice quality are accompanying the tonal sound change in Korean stops, but are still developing.
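
The spectral tilt measures named in the abstract are differences between amplitudes in dB: H1 and H2 are the amplitudes of the first and second harmonics, and A1 and A2 are the amplitudes of the strongest harmonics near the first and second formants. Higher values are generally taken to indicate breathier voice. The sketch below only shows that arithmetic. The function name and the input values are hypothetical and are not data from the study, and extracting the harmonic amplitudes from audio is outside its scope.

```python
def spectral_tilt_measures(h1_db: float, h2_db: float,
                           a1_db: float, a2_db: float) -> dict[str, float]:
    """Compute H1-H2, H1-A1 and H1-A2 from amplitudes given in dB."""
    return {
        "H1-H2": h1_db - h2_db,
        "H1-A1": h1_db - a1_db,
        "H1-A2": h1_db - a2_db,
    }


# Made-up amplitudes for a single vowel measurement point.
measures = spectral_tilt_measures(h1_db=40.0, h2_db=35.0, a1_db=45.0, a2_db=30.0)
```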