• Title/Summary/Keyword: L2 speech perception

A 3D Audio-Visual Animated Agent for Expressive Conversational Question Answering

  • Martin, J.C.;Jacquemin, C.;Pointal, L.;Katz, B.
    • Korea Information Convergence Society: Conference Proceedings
    • /
    • 2008.06a
    • /
    • pp.53-56
    • /
    • 2008
  • This paper reports on the ACQA (Animated agent for Conversational Question Answering) project conducted at LIMSI. The aim is to design an expressive animated conversational agent (ACA) for conducting research along two main lines: 1/ perceptual experiments (e.g. perception of expressivity and 3D movements in both the audio and visual channels); 2/ design of human-computer interfaces requiring head models at different resolutions and the integration of the talking head in virtual scenes. The target application of this expressive ACA is a real-time question-and-answer speech-based system developed at LIMSI (RITEL). The architecture of the system is based on distributed modules exchanging messages through a network protocol. The main components of the system are: RITEL, a question-and-answer system searching raw text, which is able to produce a text (the answer) and attitudinal information; this attitudinal information is then processed to deliver expressive tags; the text is converted into phoneme, viseme, and prosodic descriptions. Audio speech is generated by the LIMSI selection-concatenation text-to-speech engine. Visual speech uses MPEG-4 keypoint-based animation and is rendered in real time by Virtual Choreographer (VirChor), a GPU-based 3D engine. Finally, visual and audio speech is played in a 3D audio and visual scene. The project also devotes considerable effort to realistic visual and audio 3D rendering. A new model of phoneme-dependent human radiation patterns is included in the speech synthesis system, so that the ACA can move in the virtual scene with realistic 3D visual and audio rendering.
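
The architecture named in this abstract is a set of distributed modules exchanging messages over a network protocol. The sketch below illustrates that kind of message passing only in outline; the module names, message fields, and length-prefixed JSON framing are assumptions for illustration, not the actual ACQA/RITEL protocol.

```python
# Illustrative only: one way pipeline modules (QA -> expressive-tag processor
# -> TTS/renderer) could exchange small JSON messages over TCP.
import json
import socket

def send_message(host: str, port: int, message: dict) -> None:
    """Serialize a message and push it to the next module in the pipeline."""
    payload = json.dumps(message).encode("utf-8")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(len(payload).to_bytes(4, "big") + payload)

# Hypothetical message from the QA module to the expressive-tag processor:
answer = {
    "module": "qa",                      # hypothetical sender id
    "text": "The answer text goes here.",
    "attitude": {"confidence": 0.85},    # attitudinal information to be tagged
}
# send_message("tag-processor.local", 9000, answer)  # no server in this sketch
```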

The identification of Korean vowels /o/ and /u/ by native English speakers

  • Oh, Eunhae
    • Phonetics and Speech Sciences
    • /
    • v.8 no.1
    • /
    • pp.19-24
    • /
    • 2016
  • The Korean high back vowels /o/ and /u/ have been reported to be in a state of near-merger, especially among young female speakers. Along with cross-generational changes, the vowel's position within a word has been reported to render different phonetic realizations. The current study examines native English speakers' ability to attend to the phonetic cues that distinguish the two merging vowels and the positional effects (word-initial vs. word-final) on identification accuracy. Twenty-eight two-syllable words containing /o/ or /u/ in either initial or final position were produced by native female Korean speakers. The CV part of each target word was excised and presented to six native English speakers. The results showed that although identification accuracy was lowest for /o/ in word-final position (41%), it increased to 80% in word-initial position. The acoustic analyses of the target vowels showed that /o/ and /u/ were differentiated on the height dimension only in word-initial position, suggesting that English speakers may have perceived the distinctive F1 difference retained in the prominent position.
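
As a rough illustration of the identification analysis described above, the sketch below tabulates percent-correct identification by vowel and word position from a table of listener responses; the trial data are invented, not the study's.

```python
# Minimal sketch (invented data): percent-correct identification of /o/ and /u/
# by word position, tabulated from (intended vowel, position, response) trials.
from collections import defaultdict

trials = [
    ("o", "final", "u"), ("o", "final", "o"), ("o", "initial", "o"),
    ("u", "initial", "u"), ("u", "final", "u"), ("o", "initial", "o"),
]

counts = defaultdict(lambda: [0, 0])   # (vowel, position) -> [n_correct, n_total]
for vowel, position, response in trials:
    counts[(vowel, position)][0] += response == vowel
    counts[(vowel, position)][1] += 1

for (vowel, position), (n_correct, n_total) in sorted(counts.items()):
    print(f"/{vowel}/ word-{position}: {100 * n_correct / n_total:.0f}% correct")
```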

Perception of Spanish /l/ - /r/ distinction by native Japanese

  • Mignelina Guirao;Jorge A. Gurlekian;Maria A. Garcia Jurado
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.337-342
    • /
    • 1996
  • In previous works we have reported phonetic similarities between Japanese and Spanish vowels and syllabic sounds (1)(2)(3)(4). In the present communication we explore the relative importance of the duration of the consonantal segment for eliciting the Spanish /l/ - /r/ distinction by native Japanese talkers. Three Argentine and three trained native Japanese talkers recorded /l-r/ combined with /a/ in VCV sequences. Modifications of consonant duration and vowel context with transitions were made by editing natural /ala/ sounds. Mixed VCV stimuli were produced by combining sounds of both languages. Perceptual tests were performed by presenting the speech material to native trained and non-trained Japanese listeners. In a first session a discrimination procedure was applied: the items were arranged in pairs and listeners were told to indicate the pair that sounded different. In the following session they were asked to identify and type the letter corresponding to each one of the items. Responses are examined in terms of the critical duration of the interval between vowels. Preliminary results indicate that the duration of intervocalic intervals was a relevant cue for the identification of /l/ and /r/. It seems that, to differentiate the two sounds, Japanese listeners required relatively longer interval steps than the Argentine subjects. There was a tendency to confuse /l/ for /r/ more frequently than vice versa.
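
The duration manipulation described above can be sketched roughly as follows: given assumed segment boundaries in a natural /ala/ token, the intervocalic interval is truncated or padded to a target duration to build a continuum. The boundary indices, sampling rate, and zero-padding are placeholders, not the authors' editing procedure.

```python
# Rough sketch (assumed boundaries, invented values) of building a duration
# continuum by resplicing the intervocalic interval of a natural /ala/ token.
import numpy as np

def set_interval_duration(signal, v1_end, v2_start, new_ms, sr=16000):
    """Return a copy of `signal` whose intervocalic interval lasts `new_ms` ms."""
    interval = signal[v1_end:v2_start]
    target_len = int(sr * new_ms / 1000)
    if len(interval) >= target_len:
        interval = interval[:target_len]                 # shorten
    else:
        pad = np.zeros(target_len - len(interval))
        interval = np.concatenate([interval, pad])       # crude lengthening
    return np.concatenate([signal[:v1_end], interval, signal[v2_start:]])

# e.g. a 20-100 ms continuum in 20 ms steps from an /ala/ waveform `ala`:
# continuum = [set_interval_duration(ala, 4800, 5600, ms) for ms in range(20, 101, 20)]
```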

A study on the Suprasegmental Parameters Exerting an Effect on the Judgment of Goodness or Badness on Korean-spoken English (한국인 영어 발음의 좋음과 나쁨 인지 평가에 영향을 미치는 초분절 매개변수 연구)

  • Kang, Seok-Han;Rhee, Seok-Chae
    • Phonetics and Speech Sciences
    • /
    • v.3 no.2
    • /
    • pp.3-10
    • /
    • 2011
  • This study investigates the role of suprasegmental features with respect to the intelligibility of Korean-spoken English judged as good or bad by Korean and English raters. It was hypothesized that Korean raters would evaluate the speech differently from English native raters and that the effect might vary depending on the type of suprasegmental factor. Four Korean and four English native raters took part in the evaluation of 14 Korean subjects' English speech; each subject read a given paragraph. The results show that the evaluation of 'intelligibility' differs between the two groups and that the difference comes from their perception of L2 English suprasegmentals.

Learning acoustic cue weights for Korean stops through L2 perception training (지각 훈련을 통한 한국어 폐쇄음 음향 신호 가중치의 L2 학습)

  • Oh, Eunjin
    • Phonetics and Speech Sciences
    • /
    • v.13 no.4
    • /
    • pp.9-21
    • /
    • 2021
  • This study investigated whether Korean learners improve their acoustic cue weights for identifying Korean lenis and aspirated stops in the direction of native values through perception training that focused on contrasting the stops in various phonetic contexts. Nineteen native Chinese learners of Korean and two native Korean instructors for the perception training participated in the experiment. Participants were divided into a training group and a non-training group according to pretest results, and only the training group took part in the training for 5 days. To estimate the perceptual weights of the stop cues, a pretest and a posttest were conducted with stimuli whose stop cues (F0 and VOT) were systematically manipulated. Binary logistic regression analyses were performed on each learner's test results to calculate perceptual β coefficients, which estimate the perceptual weights of the acoustic cues used in identifying the stop contrast. In the posttest, the training group showed a statistically significant increase of 0.451 on average in the coefficient values of F0, the primary cue for the stop contrast, whereas the non-training group showed an insignificant increase of 0.246. The patterns of change in F0 use after training varied considerably among individual learners.
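
A minimal sketch of the kind of analysis the abstract names: an unpenalized binary logistic regression fitted to one learner's identification responses, whose standardized coefficients serve as estimates of the perceptual weights of F0 and VOT. The data layout, z-scoring step, and response values are assumptions, not the study's script.

```python
# Sketch only: estimate perceptual beta coefficients for F0 and VOT from one
# learner's lenis/aspirated identification responses (statsmodels Logit).
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({                       # invented responses: 1 = "aspirated"
    "f0":   [110, 110, 110, 150, 150, 150, 190, 190, 190],   # Hz at vowel onset
    "vot":  [ 30,  60,  90,  30,  60,  90,  30,  60,  90],   # ms
    "resp": [  0,   0,   1,   0,   1,   0,   1,   1,   1],
})

# Standardize the cues so the coefficients are comparable across dimensions.
X = (df[["f0", "vot"]] - df[["f0", "vot"]].mean()) / df[["f0", "vot"]].std()
X = sm.add_constant(X)

fit = sm.Logit(df["resp"], X).fit(disp=0)
print(fit.params)    # larger |beta| = heavier perceptual weight on that cue
```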

Perception and production of Korean and English stops by bilinguals with extensive experience residing in the U.S.: Individual patterns

  • Oh, Eunjin
    • Phonetics and Speech Sciences
    • /
    • v.9 no.3
    • /
    • pp.33-40
    • /
    • 2017
  • This study aimed to examine how Korean-English bilinguals make use of VOT and F0 cues in the perception and production of Korean (lenis vs. aspirated) and English (voiced vs. voiceless) stops. It was explored whether bilinguals with extensive experience living in the U.S. exhibit native-like or interactive patterns of cue use in both languages. Participants produced monosyllabic word-initial stops within a carrier sentence in each language, and performed forced-choice identification tasks with synthesized stimuli varying in 7 VOT steps and 7 F0 steps, with base tokens of /tʰan/ for Korean and /tæn/ for English. Listeners were required to select either /tan/ or /tʰan/ for Korean and either /dæn/ or /tæn/ for English. The results from binary logistic regression analyses for each listener indicated that all bilinguals placed greater weight on F0 than VOT when distinguishing between the Korean lenis and aspirated stops, and greater weight on VOT than F0 when distinguishing between the English voiced and voiceless stops. In terms of production, all participants showed remarkably overlapping ranges in the VOT dimension and separate ranges in the F0 dimension for the Korean stop contrast, while forming overlapping ranges in the F0 dimension and separate ranges in the VOT dimension for the English stop contrast. These results indicate that bilinguals with extensive exposure to the L2 manage the stop systems of the two languages independently, both in perception and production, employing opposite cue use for the stops of the two languages. It was also found that the absolute beta-coefficient values of the perceptual cues for Korean stops were generally smaller than those for English and those reported for late bilinguals in a previous study, which may have resulted from Korean not being their dominant language.
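
As a small illustration of the stimulus design mentioned above, the sketch below lays out a 7-step VOT by 7-step F0 identification grid; the endpoint values are placeholders, not the study's synthesis settings.

```python
# Sketch only: enumerate a 7 x 7 VOT-by-F0 grid of identification stimuli.
import numpy as np
from itertools import product

vot_steps = np.linspace(10, 90, 7)     # ms, hypothetical continuum endpoints
f0_steps = np.linspace(100, 220, 7)    # Hz, hypothetical continuum endpoints

stimuli = [{"vot_ms": float(v), "f0_hz": float(f)}
           for v, f in product(vot_steps, f0_steps)]
print(len(stimuli))   # 49 stimuli per language, each judged in a forced choice
```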

V-to-C Coarticulation Effects in Non-native Speakers of English and Russian: A Locus-equation Analysis

  • Oh, Eun-Jin
    • MALSORI
    • /
    • no.63
    • /
    • pp.1-21
    • /
    • 2007
  • Locus equation scatterplots for [bilabial stop + vowel] syllables were obtained from 16 non-native speakers of English and Russian. The results indicated that both Russian speakers of English and English speakers of Russian exhibited modifications of slopes and y-intercepts towards the respective L2 norms. All non-native locus equations exhibited linearity. Accordingly, the basic results reported in [17] were confirmed with a larger subject base. More experienced speakers displayed better approximations to the L2 norms than less experienced speakers, indicating the necessity of perception- and articulation-related learning of the allophonic variations due to adjacent phonetic environments.
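
A locus equation is a linear regression of F2 at vowel onset on F2 at vowel midpoint across CV tokens; the sketch below (with invented formant values) shows how a slope and y-intercept could be obtained for comparison against L1 and L2 norms.

```python
# Sketch only (invented values): fit a locus equation for [bilabial stop + V]
# tokens, i.e. regress F2 at vowel onset on F2 at vowel midpoint.
import numpy as np

f2_midpoint = np.array([800, 1200, 1600, 2000, 2400], dtype=float)   # Hz
f2_onset    = np.array([950, 1250, 1550, 1900, 2200], dtype=float)   # Hz

slope, intercept = np.polyfit(f2_midpoint, f2_onset, 1)
print(f"slope = {slope:.2f}, y-intercept = {intercept:.0f} Hz")
# The slope indexes the degree of V-to-C coarticulation; L2 speakers' slopes
# and intercepts can then be compared with native norms.
```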

The Effect of Acoustic Correlates of Domain-initial Strengthening in Lexical Segmentation of English by Native Korean Listeners

  • Kim, Sa-Hyang;Cho, Tae-Hong
    • Phonetics and Speech Sciences
    • /
    • v.2 no.3
    • /
    • pp.115-124
    • /
    • 2010
  • The current study investigated the role of acoustic correlates of domain-initial strengthening in the lexical segmentation of a non-native language. In a series of cross-modal identity-priming experiments, native Korean listeners heard English auditory stimuli and made lexical decisions on visual targets (i.e., written words). The auditory stimuli contained critical two-word sequences which created temporary lexical ambiguity (e.g., 'mill#company', with the competitor 'milk'). There was either an IP boundary or a word boundary between the two words in the critical sequences. The initial CV of the second word (e.g., [kʌ] in 'company') was spliced from another token of the sequence in IP- or Wd-initial position. The prime words were postboundary words (e.g., company) in Experiment 1, and preboundary words (e.g., mill) in Experiment 2. In both experiments, Korean listeners showed priming effects only in IP contexts, indicating that they can make use of IP boundary cues of English in the lexical segmentation of English. The acoustic correlates of domain-initial strengthening were also exploited by Korean listeners, but significant effects were found only for the segmentation of postboundary words. The results therefore indicate that L2 listeners can make use of prosodically driven phonetic detail in the lexical segmentation of the L2, as long as the direction of those cues is similar in their L1 and L2. The exact use of the cues by Korean listeners was, however, different from that found with native English listeners in Cho, McQueen, and Cox (2007). The differential use of the prosodically driven phonetic cues by the native and non-native listeners is thus discussed.
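
The priming effect reported above is essentially a response-time difference between identity-primed and control trials, split by boundary condition; the sketch below (with invented response times) shows one way such an effect could be summarized.

```python
# Sketch only (invented response times): summarize cross-modal identity priming
# as control minus identity-prime mean lexical-decision RT, per boundary type.
import pandas as pd

rt = pd.DataFrame({
    "boundary": ["IP", "IP", "IP", "Wd", "Wd", "Wd"],
    "prime":    ["identity", "control", "identity", "identity", "control", "control"],
    "rt_ms":    [612, 665, 598, 640, 648, 655],
})

means = rt.groupby(["boundary", "prime"])["rt_ms"].mean().unstack()
print(means["control"] - means["identity"])   # positive = facilitatory priming
```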

Korean ESL Learners' Perception of English Segments: a Cochlear Implant Simulation Study (인공와우 시뮬레이션에서 나타난 건청인 영어학습자의 영어 말소리 지각)

  • Yim, Ae-Ri;Kim, Dahee;Rhee, Seok-Chae
    • Phonetics and Speech Sciences
    • /
    • v.6 no.3
    • /
    • pp.91-99
    • /
    • 2014
  • Although it is well documented that patients with cochlear implants experience hearing difficulties when processing their first language, very little is known about whether, and to what extent, cochlear implant patients recognize segments in a second language. This preliminary study examines how Korean learners of English identify English segments in normal-hearing and cochlear-implant simulation conditions. Participants heard English vowels and consonants in the following three conditions: a normal-hearing condition, 12-channel noise vocoding with 0 mm spectral shift, and 12-channel noise vocoding with 3 mm spectral shift. The results confirmed that non-native listeners can also retrieve spectral information from a vocoded speech signal, as they recognized vowel features fairly accurately despite the vocoding. In contrast, the intelligibility of the manner and place features of consonants was significantly decreased by vocoding. In addition, we found that the spectral shift affected listeners' vowel recognition, probably because information regarding F1 is diminished by spectral shifting. The results suggest that patients with cochlear implants and normal-hearing second language learners would experience different patterns of listening errors when processing their second language(s).
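
The cochlear-implant simulation described above is a channel vocoder; the sketch below is a simplified 12-channel noise vocoder in which the output band edges can be raised to approximate the spectral-shift condition. The corner frequencies, filter orders, and envelope cutoff are assumptions, not the study's stimulus parameters.

```python
# Simplified sketch of noise vocoding: band-pass the speech, extract each
# band's amplitude envelope, and use it to modulate band-limited noise.
# Shifted output band edges roughly mimic a basalward (e.g. 3 mm) shift.
import numpy as np
from scipy.signal import butter, filtfilt

def vocode(signal, sr, edges_in, edges_out, env_cutoff=160.0):
    """edges_in/edges_out: band edges in Hz, length = number of channels + 1."""
    rng = np.random.default_rng(0)
    out = np.zeros(len(signal))
    for lo_i, hi_i, lo_o, hi_o in zip(edges_in[:-1], edges_in[1:],
                                      edges_out[:-1], edges_out[1:]):
        b, a = butter(3, [lo_i, hi_i], btype="band", fs=sr)
        band = filtfilt(b, a, signal)
        be, ae = butter(2, env_cutoff, btype="low", fs=sr)
        envelope = filtfilt(be, ae, np.abs(band))             # rectify + low-pass
        bo, ao = butter(3, [lo_o, hi_o], btype="band", fs=sr)
        out += envelope * filtfilt(bo, ao, rng.standard_normal(len(signal)))
    return out

sr = 16000
edges_in = np.geomspace(100, 7000, 13)     # 12 analysis channels
edges_out = np.geomspace(180, 7900, 13)    # illustrative upward shift
# vocoded = vocode(speech, sr, edges_in, edges_out)   # `speech`: 1-D waveform
```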

Egyptian learners' learnability of Korean phonemes (이집트 한국어 학습자들의 한국어 음소 학습용이성)

  • Benjamin, Sarah;Lee, Ho-Young;Hwang, Hyosung
    • Phonetics and Speech Sciences
    • /
    • v.11 no.4
    • /
    • pp.19-33
    • /
    • 2019
  • This paper examines the perception of Korean phonemes by Egyptian learners of Korean and presents a learnability gradient for Korean consonants and vowels derived through High Variability Phonetic Training (HVPT). Fifty Egyptian learners of Korean (27 low-proficiency learners and 23 high-proficiency learners) participated in 10 sessions of HVPT for Korean vowels and word-initial and word-final consonants. Participants were tested on their identification of Korean vowels, word-initial consonants, and syllable codas before and after the training. The results showed that both the low- and high-proficiency groups benefited from the training, with the low-proficiency learners showing a higher improvement rate than the high-proficiency learners. Based on the HVPT results, a learnability gradient was established to give insights into priorities in teaching Korean sounds to Egyptian learners.
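
As a toy illustration of how a learnability gradient could be derived from the pre- and post-training identification scores mentioned above, the sketch below (with invented accuracies and sound labels) ranks sounds by their improvement after training.

```python
# Sketch only (invented accuracies): rank Korean sounds by pre-to-post
# identification gain to form a rough learnability gradient.
pre  = {"vowel ㅓ": 0.42, "vowel ㅡ": 0.55, "coda ㄹ": 0.38, "initial ㅊ": 0.61}
post = {"vowel ㅓ": 0.70, "vowel ㅡ": 0.64, "coda ㄹ": 0.47, "initial ㅊ": 0.83}

gradient = sorted(pre, key=lambda s: post[s] - pre[s], reverse=True)
for sound in gradient:
    print(f"{sound}: +{100 * (post[sound] - pre[sound]):.0f} points")
```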