• Title/Summary/Keyword: speech production variations


How Korean Learner's English Proficiency Level Affects English Speech Production Variations

  • Hong, Hye-Jin; Kim, Sun-Hee; Chung, Min-Hwa
    • Phonetics and Speech Sciences / v.3 no.3 / pp.115-121 / 2011
  • This paper examines how L2 speech production varies with a learner's L2 proficiency level. L2 speech production variations are analyzed with quantitative measures at the word and phone levels using a corpus of Korean learners' English. Word-level variations are analyzed using correctness, which captures how spoken realizations differ from the canonical forms, while accuracy is used at the phone level to reflect phone insertions and deletions together with substitutions. The results show that the speech production of learners at different L2 proficiency levels differs considerably in performance and in individual realizations at both the word and phone levels. These results confirm that non-native speech production varies with L2 proficiency level even when speakers share the same L1 background, and they should help improve the non-native speech recognition performance of ASR-based English language educational systems for Korean learners of English. (A sketch of the phone-level accuracy measure appears below.)

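The phone-level accuracy mentioned in the abstract above is conventionally computed from an edit-distance alignment between the canonical and the realized phone sequence, so that substitutions, deletions, and insertions are all counted. The paper's own scoring script is not shown in this listing, so the following is only a minimal sketch of that standard definition; the phone sequences and function names are illustrative.

```python
# Minimal sketch of phone-level accuracy from an edit-distance alignment,
# following the standard definition Accuracy = (N - S - D - I) / N,
# where N is the number of canonical phones. Sequences below are illustrative.

def edit_distance(canonical, realized):
    """Levenshtein distance between two phone sequences (unit costs)."""
    n, m = len(canonical), len(realized)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i                                  # deletions
    for j in range(m + 1):
        dp[0][j] = j                                  # insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dp[i - 1][j - 1] + (canonical[i - 1] != realized[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[n][m]

def phone_accuracy(canonical, realized):
    """Errors = S + D + I under unit costs, i.e. the edit distance."""
    errors = edit_distance(canonical, realized)
    return (len(canonical) - errors) / len(canonical)

canonical = ["s", "t", "r", "ay", "k"]
realized  = ["s", "t", "r", "ay", "k", "u"]           # epenthetic vowel insertion
print(phone_accuracy(canonical, realized))            # 0.8: one insertion against 5 phones
```

Word-level correctness can be computed analogously over word sequences; in the usual convention, correctness does not penalize insertions, which matches the distinction the abstract draws between the two measures.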

Variations in the perception of lexical pitch accents and the correlations with individuals' autistic traits

  • Lee, Hyunjung
    • Phonetics and Speech Sciences / v.9 no.2 / pp.53-59 / 2017
  • The present study examined whether individual listeners' perceptual variations were associated with their cognitive characteristics as indexed by the Autistic Spectrum Quotient (AQ). The study first investigated the perception of the lexical pitch accent contrast in Kyungsang Korean, which is currently undergoing a sound change, and then tested whether listeners' perceptual variations were correlated with their AQ scores. Eighteen Kyungsang listeners in their 20s participated in a perception experiment in which they identified two contrastive accent words from auditory stimuli that systematically varied in F0 scaling and timing; the participants then completed the AQ questionnaire. The acoustic parameters for which younger Kyungsang speakers show reduced phonetic differences across the accent contrast nevertheless played a reliable role in distinguishing the HH word from HL, suggesting a discrepancy between perception and production in the context of sound change. The study also observed that individuals' perceptual variations were negatively correlated with their AQ sub-scores. These findings suggest that the sound change may proceed differently, and on a different time course, in production and perception, and that deviant percepts can be partly explained by an individual cognitive measure.

Acoustic Measurement of English read speech by native and nonnative speakers

  • Choi, Han-Sook
    • Phonetics and Speech Sciences / v.3 no.3 / pp.77-88 / 2011
  • Foreign accent in second language production depends heavily on the transfer of features from the first language. This study examines acoustic variations in segments and suprasegments produced by native and nonnative speakers of English, searching for patterns of transfer and plausible indexes of foreign accent in English. The acoustic variations are analyzed from read speech recorded by 20 native English speakers and 50 Korean learners of English, in terms of vowel formants, vowel duration, and stress-induced syllabic variation. The results show that the acoustic measurements of vowel formants and of vowel and syllable durations differ between native and nonnative speakers. The difference is robust in the production of lax vowels, diphthongs, and stressed syllables, that is, the English-specific features. L1 transfer onto L2 specification is found at both the segmental and the suprasegmental levels. Transfer measured by group and by individual further shows a continuum of divergence from the native-like target. Overall, the oldest group, graduate students, shows more native-like patterns, suggesting a weaker foreign accent in English, whereas the high school students tend to deviate more from the native speakers' patterns. Individual results show interdependence between segmental and prosodic transfer, and correlation with self-reported proficiency levels. Experience factors such as length of English study and length of residence in English-speaking countries are further discussed as explanations for the acoustic variation. (A sketch of a simple formant-based divergence measure appears below.)

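The paper's full analysis covers formants, durations, and stress-induced syllable variation; as a small illustration of just the formant part, the sketch below compares learners' vowel tokens against a native F1/F2 mean, normalized by the native standard deviation. All values and names here are invented for illustration and are not the paper's data.

```python
import numpy as np

# Hypothetical F1/F2 measurements (Hz) for one lax vowel, one row per speaker;
# these are invented numbers, not the paper's measurements.
native  = np.array([[430, 2020], [415, 1980], [445, 2050], [420, 2000]], dtype=float)
learner = np.array([[370, 2250], [360, 2300], [400, 2150], [345, 2330]], dtype=float)

native_mean = native.mean(axis=0)
native_sd = native.std(axis=0, ddof=1)

def divergence(tokens):
    """Euclidean distance from the native mean in native-SD units (F1-F2 plane)."""
    z = (tokens - native_mean) / native_sd
    return np.linalg.norm(z, axis=1)

print("native self-distance:", divergence(native).round(2))
print("learner divergence  :", divergence(learner).round(2))
```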

Word-final Coda Acquisition by English-Speaking Children with Cochlear Implants

  • Kim, Jung-Sun
    • Phonetics and Speech Sciences / v.3 no.4 / pp.23-31 / 2011
  • This paper examines production patterns in the acquisition of coda consonants in monosyllabic words by English-speaking children with cochlear implants. The data come from the transcribed speech of children with cochlear implants. The study poses three questions. First, do children with cochlear implants acquire onset consonants earlier than codas? Second, do children's productions show a bimoraic-size constraint that maintains binary feet? Third, what patterns emerge in the production of coda consonants? The results revealed that children with cochlear implants acquire onset consonants earlier than codas. With regard to the bimoraic-size constraint, some children with cochlear implants produced monomoraic vowels more accurately than bimoraic ones, although vowel production accuracy was high regardless of vowel type. Coda production varied across individuals: some children most frequently produced less sonorant consonants, while others produced more sonorant ones. These results were similar to those reported for children with normal hearing. In the course of coda consonant acquisition, prosody-sensitive error patterns may be regarded as reflecting the articulatory challenge of producing higher-level prosodic structures.


Korean Continuous Speech Recognition Using a Fuzzy Rule Base

  • Song, Jeong-Young
    • The Journal of Engineering Research / v.2 no.1 / pp.13-21 / 1997
  • This paper describes how to represent variations of feature parameters in order to improve the recognition of continuous speech. Feature parameters such as formant frequencies, pitch, logarithmic energy, and zero-crossing rate are generally used for speech recognition, but their values and variations depend on the speaker, for example on differences between men and women and on age, and it is difficult to decide the width of this variation a priori. We therefore represent the variation by introducing fuzziness, and we recognize continuous speech by fuzzy inference over fuzzy production rules. (A minimal sketch of fuzzy membership functions and rule firing appears below.)

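The abstract does not spell out its rule base, so the sketch below only illustrates the general mechanism it names: acoustic features are mapped to fuzzy membership degrees rather than crisp thresholds, and a production rule fires to the degree of its weakest condition. The features, membership ranges, and rules are invented for illustration; they are not the paper's rule base.

```python
# Minimal sketch of fuzzy production rules over acoustic features: each linguistic
# term gets a membership function instead of a crisp threshold, and a rule fires
# with the minimum membership of its conditions.

def triangular(x, left, peak, right):
    """Triangular membership function on [left, right] with its apex at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def f1_low(f1):     return triangular(f1, 200.0, 300.0, 450.0)     # Hz
def f1_high(f1):    return triangular(f1, 400.0, 700.0, 1000.0)    # Hz
def energy_high(e): return triangular(e, 50.0, 70.0, 90.0)         # log energy, dB

def rule_high_vowel(f1, energy):
    # IF F1 is low AND energy is high THEN the frame is high-vowel-like (/i/, /u/).
    return min(f1_low(f1), energy_high(energy))

def rule_low_vowel(f1, energy):
    # IF F1 is high AND energy is high THEN the frame is low-vowel-like (/a/).
    return min(f1_high(f1), energy_high(energy))

frame = {"f1": 320.0, "energy": 72.0}
degrees = {
    "high_vowel": rule_high_vowel(frame["f1"], frame["energy"]),
    "low_vowel": rule_low_vowel(frame["f1"], frame["energy"]),
}
print(max(degrees, key=degrees.get), degrees)   # the rule with the highest firing degree wins
```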

Speaker Variation in Number Production by Males

  • Yang, Byung-Gon
    • Speech Sciences / v.8 no.3 / pp.93-104 / 2001
  • The author analyzed acoustic parameters of ten Korean number words produced by ten male students, using Praat. Variations of f0, F1, F2, and F3 within and between speakers were examined by computing the mean and standard deviation of each parameter for each number and comparing the values across speakers. Results showed that each subject produced the numbers within a certain range of variation across time. Speaker identification can therefore be made more reliable by using dynamic information about the acoustic parameters within each vocalic segment. In addition, the within-speaker variation expressed as a percentage of the between-speaker variation can be used to determine which sounds make better stimuli for speaker identification; by this criterion, the number '2' proved the best stimulus while the number '7' was the worst. Future studies will be necessary to explore robust methods of speaker identification under noisy conditions. (A sketch of the within/between variation comparison appears below.)

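As a rough illustration of the within- versus between-speaker comparison described above, the sketch below computes both standard deviations for one parameter of one number word and expresses the within-speaker spread as a percentage of the between-speaker spread. The f0 values are invented; the paper's measurements also include F1, F2, and F3 for all ten numbers.

```python
import numpy as np

# Hypothetical f0 measurements (Hz): rows = speakers, columns = repeated tokens of
# one number word. These are toy values, not the paper's data.
f0 = np.array([
    [118, 121, 119, 122],
    [135, 133, 138, 136],
    [102, 105, 103, 101],
    [126, 124, 128, 127],
], dtype=float)

within_sd = f0.std(axis=1, ddof=1).mean()      # average spread of one speaker's tokens
between_sd = f0.mean(axis=1).std(ddof=1)       # spread of the speakers' mean values

# A smaller within/between percentage marks a more speaker-discriminating item.
print(f"within-speaker SD     : {within_sd:.2f} Hz")
print(f"between-speaker SD    : {between_sd:.2f} Hz")
print(f"within as % of between: {100 * within_sd / between_sd:.1f}%")
```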

L1-L2 Transfer in VOT and f0 Production by Korean English Learners: L1 Sound Change and L2 Stop Production

  • Kim, Mi-Ryoung
    • Phonetics and Speech Sciences / v.4 no.3 / pp.31-41 / 2012
  • Recent studies have shown that the Korean stop system is undergoing a sound change with respect to two acoustic parameters, voice onset time (VOT) and fundamental frequency (f0). Because of a VOT merger of a consonantal opposition and the onset-f0 interaction, the relative importance of the two parameters has been changing in Korean, where f0 is now a primary cue and VOT a secondary cue for distinguishing lax from aspirated stops in both production and perception. In English, however, VOT is a primary cue and f0 a secondary cue for contrasting voiced and voiceless stops. This study examines how Korean learners of English use the two acoustic parameters of their L1 when producing L2 English stops, and whether the sound change affecting the L1 parameters carries over into L2 speech production. The data were collected from six adult Korean learners of English. Results show that they use not only VOT but also f0 to contrast L2 voiced and voiceless stops. Unlike the VOT variation across speakers, however, the magnitude of the onset-consonant effect on f0 in L2 English was steady and robust, indicating that f0 also plays an important role in marking the [voice] contrast in L2 English. The results suggest that the important role of f0 in contrasting lax and aspirated stops in L1 Korean is transferred to the contrast between voiced and voiceless stops in L2 English, and they imply that, for Korean learners of English, f0 rather than VOT may serve as the more important perceptual cue to that contrast. (A sketch of comparing the two cues' weights appears below.)
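
One common way to compare the relative weight of two acoustic cues is to standardize them and fit a simple classifier, reading the coefficients as cue weights. The paper does not report using this procedure, so the sketch below is only an illustrative stand-in with invented VOT and f0 values.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical English stop productions by one learner:
# columns = [VOT in ms, onset f0 in Hz]; label 1 = voiceless, 0 = voiced.
X = np.array([
    [15, 115], [20, 118], [25, 120], [18, 122],    # intended voiced
    [60, 150], [75, 155], [55, 148], [70, 160],    # intended voiceless
], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Standardize both cues so the fitted coefficients act as comparable cue weights.
Xz = StandardScaler().fit_transform(X)
clf = LogisticRegression().fit(Xz, y)

vot_weight, f0_weight = clf.coef_[0]
print(f"standardized VOT weight: {vot_weight:.2f}")
print(f"standardized f0 weight : {f0_weight:.2f}")
# A relatively large f0 weight would mirror the L1-to-L2 transfer the paper reports.
```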

Perceptual Experiment on Number Production for Speaker Identification

  • Yang, Byung-Gon
    • Speech Sciences / v.8 no.1 / pp.7-19 / 2001
  • The acoustic parameters of nine Korean number words were analyzed with Praat, a speech analysis program, and synthesized with SenSynPPC, a Klatt formant synthesizer. The overall intensity, pitch, and formant values of the numbers were modified dynamically in steps of 1 dB, 1 Hz, and 2.5%, respectively. The study explored listeners' sensitivity to changes in these three acoustic parameters. Twelve subjects (male and female) listened to 390 pairs of synthesized numbers and judged whether the members of each pair sounded the same or different. Results showed that subjects perceived the same sound quality within a range of 6.6 dB of intensity variation, 10.5 Hz of pitch variation, and 5.9% of variation in the first three formants. The male and female groups showed almost the same perceptual ranges, and an asymmetry between the upper and lower boundaries was observed. These ranges may be applicable to the development of a speaker identification system, and the synthesis-modification method may be used to generate its evaluation data. (A sketch of estimating such a 'same' boundary from responses appears below.)

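The sketch below illustrates one simple way such a perceptual range could be read off group responses: build the stimulus steps used in the experiment and take the largest step still judged "same" by at least half of the listeners. The response proportions and the criterion here are assumptions, not the paper's data or procedure.

```python
import numpy as np

# Pitch-shift steps of 1 Hz, as in the experiment; the proportions of "same"
# judgments below are invented, monotone toy values, not the paper's responses.
shifts_hz = np.arange(0, 21, 1)
p_same = np.clip(1.0 - shifts_hz / 18.0, 0.0, 1.0)

def same_boundary(steps, proportions, criterion=0.5):
    """Largest step still judged 'same' by at least the criterion proportion."""
    accepted = steps[proportions >= criterion]
    return int(accepted.max()) if accepted.size else 0

print(f"estimated 'same' boundary: {same_boundary(shifts_hz, p_same)} Hz")
# The paper reports about 10.5 Hz for pitch; the toy data above do not reproduce that.
```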

Korean /l/-flapping in an /i/-/i/ context

  • Son, Minjung
    • Phonetics and Speech Sciences / v.7 no.1 / pp.151-163 / 2015
  • In this study, we aim to describe kinematic characteristics of Korean /l/-flapping at two speech rates (fast vs. comfortable). Production data were collected from seven native speakers of Seoul Korean (four females and three males) using electromagnetic midsagittal articulometry (EMMA), which provided two-dimensional data in the x-y plane. We examined kinematic properties of the vertical/horizontal tongue tip gesture, the vertical/horizontal (rear) tongue body gesture, and the jaw gesture in an /i/-/i/ context. Gestural landmarks of the vertical tongue tip gesture were measured directly and served as the anchoring time points to which measures of the other trajectories were referred. Velocity profiles, closing/opening spatiotemporal properties, constriction duration, and constriction minima were analyzed. The results are summarized as follows. First, the spatiotemporal values of the vertical tongue tip gesture were gradiently distributed along a continuum, showing more reduction at the fast speech rate but not a single instance of categorical reduction (deletion). Second, Korean /l/-flapping predominantly exhibited a backward-sliding tongue tip movement (in 83% of productions), which clearly distinguishes it from the forward-sliding movement in English. Lastly, there was an indication of vocalic reduction at the fast rate, truncating the spatial displacement of the jaw and tongue body, although we did not observe positional variation with speech rate. The present study shows that Korean /l/-flapping combines articulatory properties of flapping sounds in other languages such as English and Xiangxiang Chinese: it shows a language-universal property, the gradient nature of flapping, that it shares with other languages, but also a language-particular property, distinguishing it from English, in that a backward gliding movement occurs during the tongue tip closing movement. Although no vocalic reduction in V2 was observed in terms of jaw and tongue body height, the spatial displacement of these articulators still suggests truncation at the fast speech rate. (A sketch of extracting velocity-based gestural landmarks appears below.)
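
Gestural landmarks of the kind the abstract describes are commonly derived from velocity profiles: the position signal is differentiated, movement onset is located relative to a fraction of the velocity peak, and the constriction is taken at the positional extremum. The sketch below uses a synthetic tongue-tip trajectory and a 20% threshold as assumptions; they are not the paper's signals or criteria.

```python
import numpy as np

# Synthetic vertical tongue-tip trajectory (mm) sampled at 200 Hz: a single
# raising-lowering movement standing in for the /l/ constriction between two /i/s.
fs = 200.0
t = np.arange(0.0, 0.4, 1.0 / fs)
tt_y = -2.0 + 6.0 * np.exp(-((t - 0.2) ** 2) / (2 * 0.03 ** 2))

velocity = np.gradient(tt_y, 1.0 / fs)                 # mm/s
closing = velocity[: len(velocity) // 2]               # raising phase toward the constriction
peak_idx = int(np.argmax(closing))                     # peak closing velocity
threshold = 0.2 * closing[peak_idx]                    # 20%-of-peak criterion (an assumption)

onset_idx = int(np.argmax(closing >= threshold))       # first sample above the criterion
constriction_idx = int(np.argmax(tt_y))                # maximal raising = constriction target

print(f"peak closing velocity: {closing[peak_idx]:.0f} mm/s at {t[peak_idx] * 1000:.0f} ms")
print(f"movement onset       : {t[onset_idx] * 1000:.0f} ms")
print(f"constriction target  : {tt_y[constriction_idx]:.2f} mm at {t[constriction_idx] * 1000:.0f} ms")
```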

Automatic Speech Recognition Research at Fujitsu

  • Nara, Yasuhiro; Kimura, Shinta; Loken-Kim, K.H.
    • The Journal of the Acoustical Society of Korea / v.10 no.1 / pp.82-91 / 1991
  • The history of automatic speech recognition research and the current and future speech products at Fujitsu are introduced here. Speech recognition research at Fujitsu started in 1970. Our research efforts have resulted in the production of a speaker-dependent 12,000-word discrete/connected word recognizer (F2360) and a speaker-independent 17-word discrete word recognizer (F2355L/S). Currently, we are working on a larger-vocabulary speech recognizer in which an input utterance will be matched against networks representing possible phonemic variations. Its application to text input is also discussed.
