• Title/Summary/Keyword: Acoustic cues

The identification of Korean vowels /o/ and /u/ by native English speakers

  • Oh, Eunhae
    • Phonetics and Speech Sciences / v.8 no.1 / pp.19-24 / 2016
  • The Korean high back vowels /o/ and /u/ have been reported to be in a state of near-merger, especially among young female speakers. Along with cross-generational changes, the vowel's position within a word has been reported to yield different phonetic realizations. The current study examines native English speakers' ability to attend to the phonetic cues that distinguish the two merging vowels, and the effect of position (word-initial vs. word-final) on identification accuracy. Twenty-eight two-syllable words containing /o/ or /u/ in either initial or final position were produced by native female Korean speakers. The CV part of each target word was excised and presented to six native English speakers. The results showed that although identification accuracy was lowest for /o/ in word-final position (41%), it increased up to 80% in word-initial position. The acoustic analyses of the target vowels showed that /o/ and /u/ were differentiated on the height dimension only in word-initial position, suggesting that English speakers may have perceived the distinctive F1 difference retained in the prominent position.

Segmental effects on Prosodic Domain-initial Strengthening

  • Oh, Mi-Ra
    • Speech Sciences / v.9 no.2 / pp.13-23 / 2002
  • This study examines the effect of laryngeal consonants of Korean on prosodic domain-initial strengthening. Keating, Cho, Fougeron & Hsu (1999), Fougeron & Keating (1996), and Hsu & Jun (1998) found that consonants at the beginnings of larger phrases are more constricted than consonants at the beginnings of smaller phrases. Korean laryngeal consonants pose a counter-example to the general pattern of domain-initial strengthening, since tense and aspirated consonants are longer word-medially than word-initially. Previous work on domain-initial strengthening focused on domain-initial consonants at different prosodic domains. This study shows that acoustic cues that are not at the domain edge also function to demarcate prosodic structure when the domain-initial consonant is laryngeal: VOT for an aspirated consonant and duration of V2 for a tense consonant.

Perceptual cues for /o/ and /u/ in Seoul Korean (서울말 /ㅗ/와 /ㅜ/의 지각특성)

  • Byun, Hi-Gyung
    • Phonetics and Speech Sciences / v.12 no.3 / pp.1-14 / 2020
  • Previous studies have confirmed that /o/ and /u/ in Seoul Korean are undergoing a merger in the F1/F2 space, especially for female speakers. Female speakers are reported to use phonation (H1-H2) differences as a substitute parameter for formants to distinguish /o/ from /u/. This study aimed to explore whether H1-H2 values are also used as perceptual cues for /o/-/u/. A perception test was conducted with 35 college students using /o/ and /u/ tokens spoken by 41 females, which overlap considerably in the vowel space. An acoustic analysis of 182 stimuli was also conducted to see whether production and perception correspond. The identification rate was 89% on average, 86% for /o/, and 91% for /u/. The results confirmed that in production, when /o/ and /u/ are too close to be distinguished in the F1/F2 space, H1-H2 differences contribute significantly to the separation of the two vowels. In perception, however, this was not the case: H1-H2 values were not significantly involved in the identification process, and the formants (especially F2) were still the dominant cues. The study also showed that even though H1-H2 differences are apparent in females' production, males do not use H1-H2 in their production, and neither females nor males use H1-H2 in their perception. It is presumed that H1-H2 has not yet developed as a perceptual cue for /o/ and /u/.
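
The H1-H2 measure mentioned in this abstract is the level difference, in dB, between the first and second harmonics of the voice source. The following is a minimal sketch of the uncorrected measure, not the procedure used in the study: it assumes f0 is already known and estimates each harmonic's amplitude with a single-bin DFT. The function names, the sample rate, and the synthetic test tone are all hypothetical illustrations.

```python
import cmath
import math


def harmonic_amplitude(signal, sample_rate, freq_hz):
    """Estimate the amplitude of one sinusoidal component with a single-bin DFT."""
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
        for i, x in enumerate(signal)
    )
    return 2.0 * abs(acc) / n


def h1_h2_db(signal, sample_rate, f0_hz):
    """Uncorrected H1-H2: level of the 1st harmonic minus the 2nd, in dB."""
    h1 = harmonic_amplitude(signal, sample_rate, f0_hz)
    h2 = harmonic_amplitude(signal, sample_rate, 2 * f0_hz)
    return 20.0 * math.log10(h1 / h2)


# Hypothetical test tone (not real speech): f0 = 200 Hz, with a second
# harmonic at half the amplitude of the first.
SAMPLE_RATE = 16000
F0 = 200.0
tone = [
    1.0 * math.sin(2 * math.pi * F0 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * F0 * i / SAMPLE_RATE)
    for i in range(1600)  # 0.1 s, a whole number of periods of both harmonics
]

print(round(h1_h2_db(tone, SAMPLE_RATE, F0), 2))
```

For this tone the amplitude ratio is 2, so the result is close to 20·log10(2), roughly 6 dB. Measurements on real vowels would also need f0 tracking and are often corrected for formant influence, which this sketch leaves out.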

Confusion in the Perception of English Labial Consonants by Korean Learners (한국 학습자들의 영어 순자음 혼동)

  • Cho, Mi-Hui
    • The Journal of the Korea Contents Association / v.9 no.1 / pp.455-464 / 2009
  • Based on the observation that Korean speakers of English have difficulties in producing English fricatives, a perception experiment was designed to investigate whether Korean speakers also have difficulties perceiving English labial consonants, including fricatives. Forty Korean college students were asked to perform a multiple-choice identification test. The consonant perception test consisted of nonce words containing the English labial consonants [p, b, f, v] in four prosodic locations: initial onset position, intervocalic position before stress, intervocalic position after stress, and final coda position. The general perception pattern was that mean accuracy rates were higher in strong positions such as CV and VCVV than in weak positions such as VC and VVCV. The difficulties in perceiving the English targets resulted mainly from bidirectional manner confusion between stops and fricatives across all prosodic locations. The other types of misidentification were due to place confusion and voicing confusion. Place confusion was generated mostly by the target [f] in all prosodic positions, due to its acoustic properties. Voicing confusion was heavily influenced by prosodic position. The participants' misperceptions were accounted for by phonetic properties and/or properties of the participants' native language.

Improvement of 3D Sound Using Psychoacoustic Characteristics (인간의 청각 특성을 이용한 입체음향의 방향감 개선)

  • Koo, Kyo-Sik;Cha, Hyung-Tai
    • The Journal of the Acoustical Society of Korea / v.30 no.5 / pp.255-264 / 2011
  • The Head Related Transfer Function (HRTF) describes the acoustic transmission from a point in 3D space to the listener's ear. In other words, it contains the information that lets humans perceive the locations of sound sources. HRTFs can therefore be used to create virtual 3D sound that does not physically exist. However, because a non-individualized HRTF does not match each listener, confusion between front and back directions can degrade the three-dimensional effect. In this paper, we propose a new algorithm that reduces confusion in sound image localization by using the characteristics of human hearing. The frequency spectrum and global masking threshold of 3D sounds rendered with HRTFs are used to calculate the psychoacoustic differences between directions, and the perceptible cues in each critical band are boosted to create effective 3D sound. As a result, the improved 3D sound performs much better than conventional methods.

A STUDY ON THE IMPLEMENTATION OF ARTIFICIAL NEURAL NET MODELS WITH FEATURE SET INPUT FOR RECOGNITION OF KOREAN PLOSIVE CONSONANTS (한국어 파열음 인식을 위한 피쳐 셉 입력 인공 신경망 모델에 관한 연구)

  • Kim, Ki-Seok;Kim, In-Bum;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1990.07a / pp.535-538 / 1990
  • The main problem in speech recognition is the enormous variability in acoustic signals due to complex but predictable contextual effects. For plosive consonants in particular, it is very difficult to find invariant cues because of various contextual effects, yet humans use these contextual effects as helpful information in plosive consonant recognition. In this paper we experimented with three artificial neural net models for the recognition of plosive consonants. Model I used a "Multi-layer Perceptron", Model II used a variation of the "Self-organizing Feature Map Model", and Model III used the "Interactive and Competitive Model" to examine contextual effects. The recognition experiment was performed on 9 Korean plosive consonants. We used VCV speech chains for the experiment on contextual effects. The speech chains consist of the Korean plosive consonants /g, d, b, K, T, P, k, t, p/ (/ㄱ, ㄷ, ㅂ, ㄲ, ㄸ, ㅃ, ㅋ, ㅌ, ㅍ/) and eight Korean monophthongs. The inputs to the neural net models, all taken from the acoustic signals, were several temporal cues (duration of the silence, transition, and VOT), the extent of the VC formant transitions, the presence of voicing energy during closure, burst intensity, presence of aspiration, amount of low-frequency energy present at voicing onset, and the extent of the CV formant transitions. Model I showed a recognition rate of about 55-67%, Model II about 60%, and Model III about 67%.

Classification of nasal places of articulation based on the spectra of adjacent vowels (모음 스펙트럼에 기반한 전후 비자음 조음위치 판별)

  • Jihyeon Yun;Cheoljae Seong
    • Phonetics and Speech Sciences / v.15 no.1 / pp.25-34 / 2023
  • This study examined the utility of the acoustic features of vowels as cues for the place of articulation of Korean nasal consonants. In the acoustic analysis, spectral and temporal parameters were measured at the 25%, 50%, and 75% time points in the vowels neighboring nasal consonants in samples extracted from a spontaneous Korean speech corpus. Using these measurements, linear discriminant analyses were performed and classification accuracies for the nasal place of articulation were estimated. The analyses were applied separately for vowels following and preceding a nasal consonant to compare the effects of progressive and regressive coarticulation in terms of place of articulation. The classification accuracies ranged between approximately 50% and 60%, implying that acoustic measurements of vowel intervals alone are not sufficient to predict or classify the place of articulation of adjacent nasal consonants. However, given that these results were obtained for measurements at the temporal midpoint of vowels, where they are expected to be the least influenced by coarticulation, the present results also suggest the potential of utilizing acoustic measurements of vowels to improve the recognition accuracy of nasal place. Moreover, the classification accuracy for nasal place was higher for vowels preceding the nasal sounds, suggesting the possibility of higher anticipatory coarticulation reflecting the nasal place.
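
As a rough illustration of the kind of linear discriminant analysis described in this abstract, the sketch below classifies nasal place from two vowel measurements. It is a generic textbook LDA (shared covariance, class priors from the training counts), not the authors' actual procedure. The two features and every data point are invented for illustration and are not taken from the study.

```python
import math


def fit_lda(samples, labels):
    """Fit a 2-feature LDA: class means, priors, and the inverse pooled covariance."""
    classes = sorted(set(labels))
    n = len(samples)
    means, priors = {}, {}
    for c in classes:
        rows = [s for s, y in zip(samples, labels) if y == c]
        means[c] = (
            sum(r[0] for r in rows) / len(rows),
            sum(r[1] for r in rows) / len(rows),
        )
        priors[c] = len(rows) / n
    # Pooled within-class covariance (2x2), shared by all classes.
    sxx = sxy = syy = 0.0
    for (x, y), label in zip(samples, labels):
        dx, dy = x - means[label][0], y - means[label][1]
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    dof = n - len(classes)
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return classes, means, priors, inv


def predict_lda(model, point):
    """Return the class with the highest linear discriminant score."""
    classes, means, priors, inv = model

    def score(c):
        mx, my = means[c]
        # w = (inverse covariance) * (class mean)
        wx = inv[0][0] * mx + inv[0][1] * my
        wy = inv[1][0] * mx + inv[1][1] * my
        return (
            point[0] * wx + point[1] * wy
            - 0.5 * (mx * wx + my * wy)
            + math.log(priors[c])
        )

    return max(classes, key=score)


# Hypothetical toy data: two spectral measurements (in kHz) at a vowel time
# point, labeled with the place of the adjacent nasal. Invented values only.
X = [
    (1.10, 2.40), (1.20, 2.50), (1.00, 2.45), (1.15, 2.35),  # "m"
    (1.60, 2.60), (1.70, 2.70), (1.65, 2.55), (1.75, 2.65),  # "n"
    (1.30, 2.10), (1.40, 2.00), (1.35, 2.15), (1.45, 2.05),  # "ng"
]
y = ["m"] * 4 + ["n"] * 4 + ["ng"] * 4

model = fit_lda(X, y)
accuracy = sum(predict_lda(model, p) == t for p, t in zip(X, y)) / len(X)
print(predict_lda(model, (1.38, 2.10)), accuracy)
```

The toy classes are deliberately well separated, so the accuracy here says nothing about real speech. In the study the reported accuracies were only about 50% to 60%, which is what one would expect when class distributions overlap heavily. Accuracy on held-out data, rather than on the training points as above, would be the more meaningful estimate.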

Perception of the English Epenthetic Stops by Korean Listeners

  • Han, Jeong-Im
    • Speech Sciences / v.11 no.1 / pp.87-103 / 2004
  • This study investigates Korean listeners' perception of English stop epenthesis between sonorant and fricative segments. Specifically, this study investigates 1) how often English epenthetic stops are perceived by native Korean listeners, given that Korean does not allow consonant clusters in codas; and 2) whether perception of the epenthetic stops, which are optional phonetic variations, not phonemes, can be improved without any explicit training. A total of 120 English non-words with a monosyllabic structure of CVC1C2, where C1 = /m, n, ŋ, l/ and C2 = /s, θ, ʃ/, were given to two groups of native Korean listeners, who were asked to detect target stops such as [p], [t], and [k]. The number of their responses was computed to determine how often listeners succeeded in recovering the string of segments produced by the native English speaker. The results of the present study show that English epenthetic stops are poorly identified by native Korean listeners with low English proficiency, even when stimuli with strong acoustic cues are provided. However, perception of epenthetic stops is closely related to listeners' English proficiency, which suggests that perception can improve. The results further show that perception of epenthetic stops is asymmetric between coronal and non-coronal consonants.

Word class information in perception of prosodic prominence by Korean learners of English

  • Im, Suyeon
    • Phonetics and Speech Sciences / v.11 no.4 / pp.1-8 / 2019
  • This study aims to investigate how prosodic prominence is perceived in relation to word class information (or parts-of-speech) by Korean learners of English compared with native English speakers in public speech. Two groups, Korean learners of English and native English speakers, were asked to mark the words they perceived as prominent while listening to a speech. Parts-of-speech and three acoustic cues (i.e., max F0, mean phone duration, and mean intensity) were analyzed for each word in the speech. The results showed that content words tended to be higher in pitch and longer in duration than function words. Both groups of listeners rated prominence on content words more frequently than on function words. This tendency, however, was significantly greater for Korean learners of English than for native English speakers. Among the parts-of-speech of the content words, Korean learners of English were more likely than native English speakers to judge nouns and verbs as prominent. This study presents evidence that Korean learners of English consider most, if not all, content words as landing locations of prosodic prominence, in alignment with the previous study on the production of prominence.

Individual differences in categorical perception: L1 English learners' L2 perception of Korean stops

  • Kong, Eun Jong
    • Phonetics and Speech Sciences / v.11 no.4 / pp.63-70 / 2019
  • This study investigated individual variability in L2 learners' categorical judgments of L2 stops by exploring English learners' perceptual processing of two acoustic cues (voice onset time [VOT] and f0) and working memory capacity as sources of variation. As prior research has reported that English speakers' greater use of the redundant cue f0 was responsible for gradient processing of native stops, we examined whether the same processing characteristics would be observed in L2 learners' perception of Korean stops (/t/-/tʰ/). Twenty-two English learners of L2 Korean with a range of L2 proficiency participated in a visual analogue scaling task and showed variable manners of judging the L2 Korean stops: some were more gradient than others in performing the task. Correlation analysis revealed that L2 learners' categorical responses were modestly related to individuals' utilization of a primary cue for the stop contrast (VOT for L1 English stops and f0 for L2 Korean stops), and were also related to better working memory capacity. Together, the current experimental evidence demonstrates adult L2 learners' top-down processing of stop consonants, where linguistic and cognitive resources are devoted to determining abstract phonemic identity.