• Title/Abstract/Keywords: speech cues

117 search results (processing time: 0.02 seconds)

Effects of Listener's Experience, Severity of Speaker's Articulation, and Linguistic Cues on Speech Intelligibility in Congenitally Deafened Adults with Cochlear Implants

  • 이영미;성지은;박정미;심현섭
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 3, No. 1 / pp. 125-134 / 2011
  • The current study investigated the effects of listeners' experience with deaf speech, the severity of speakers' articulation, and linguistic cues on the speech intelligibility of congenitally deafened adults with cochlear implants. Speech intelligibility was judged by 28 experienced listeners and 40 inexperienced listeners using a word transcription task. A three-way (2 $\times$ 2 $\times$ 4) mixed design was used, with experience with deaf speech (experienced/inexperienced listener) as a between-subject factor, and the severity of the speaker's articulation (mild to moderate/moderate to severe) and linguistic cues (no/phonetic/semantic/combined) as within-subject factors. The dependent measure was the number of correctly transcribed words. All three main effects were statistically significant. Experienced listeners transcribed more accurately than inexperienced listeners, and listeners transcribed mild-to-moderate speakers more accurately than moderate-to-severe speakers. Speech intelligibility differed significantly among the four cue types, with the combined cues providing the greatest enhancement of the intelligibility scores (combined > semantic > phonological > no). Three two-way interactions were statistically significant, indicating that the type of cues and the severity of speakers differentiated experienced listeners from inexperienced listeners. The results suggest that a combination of linguistic cues increased the speech intelligibility of congenitally deafened adults with cochlear implants, and that experience with deaf speech was critical especially when evaluating the speech intelligibility of severe speakers compared with that of mild speakers.


Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition

  • 신민화;박지훈;김홍국;이연우;이성로
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 2, No. 3 / pp. 141-148 / 2010
  • This paper proposes a voice activity detection (VAD) method for dual-channel noisy speech recognition that employs spatial cues. In the proposed method, a probability model for speech presence/absence is constructed using spatial cues obtained from the dual-channel input signal, and speech activity intervals are detected through this probability model. In particular, the spatial cues are composed of the interaural time differences and interaural level differences of the dual-channel speech signals, and the probability model for speech presence/absence is based on a Gaussian kernel density. To evaluate the performance of the proposed VAD method, speech recognition is performed on segments that include only the speech intervals detected by the proposed method. Its performance is compared with that of several methods: an SNR-based method, a direction of arrival (DOA) based method, and a phase vector based method. The speech recognition experiments show that the proposed method outperforms the conventional methods, providing relative word error rate reductions of 11.68%, 41.92%, and 10.15% compared with the SNR-based, DOA-based, and phase vector based methods, respectively.
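The spatial-cue idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the framing, the cue estimators, the kernel bandwidths, or the decision threshold. The function names (`ild_db`, `itd_samples`, `gaussian_kde`, `is_speech`), the cross-correlation ITD estimate, the product Gaussian kernel, the likelihood-ratio rule, and all numbers below are assumptions chosen for illustration.

```python
import math


def ild_db(left, right, eps=1e-12):
    """Interaural level difference of one frame in dB (left energy over right energy)."""
    e_left = sum(x * x for x in left) + eps
    e_right = sum(x * x for x in right) + eps
    return 10.0 * math.log10(e_left / e_right)


def itd_samples(left, right, max_lag):
    """Lag in samples that maximizes the cross-correlation of two equal-length frames.

    A positive result means the right channel is delayed relative to the left.
    """
    n = len(left)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(n, n - lag)
        val = sum(left[i] * right[i + lag] for i in range(start, stop))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def gaussian_kde(samples, bandwidths):
    """Build a 2-D density over (ITD, ILD) cues with a product Gaussian kernel."""
    h_t, h_l = bandwidths
    norm = 1.0 / (len(samples) * 2.0 * math.pi * h_t * h_l)

    def density(point):
        t, l = point
        return norm * sum(
            math.exp(-0.5 * (((t - st) / h_t) ** 2 + ((l - sl) / h_l) ** 2))
            for st, sl in samples
        )

    return density


def is_speech(point, p_speech, p_noise, threshold=1.0):
    """Assumed decision rule: likelihood ratio of speech to noise above a threshold."""
    return p_speech(point) > threshold * p_noise(point)


# Hypothetical training cues (ITD in samples, ILD in dB), invented for illustration.
speech_cues = [(2.0, 6.0), (2.0, 5.0), (3.0, 6.5), (2.0, 7.0)]
noise_cues = [(0.0, 0.0), (-1.0, 0.5), (1.0, -0.5), (0.0, -1.0)]
p_speech = gaussian_kde(speech_cues, bandwidths=(1.0, 1.0))
p_noise = gaussian_kde(noise_cues, bandwidths=(1.0, 1.0))

# Toy frame: the right channel is the left channel delayed by 2 samples at half amplitude.
left = [0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.5, 0.25, 0.0, 0.0, 0.0]
frame_cue = (itd_samples(left, right, max_lag=4), ild_db(left, right))
print(frame_cue, is_speech(frame_cue, p_speech, p_noise))
```

In this toy setup a frame whose cues fall near the assumed speech cluster is labeled as speech, while a frame with cues near zero delay and zero level difference is not. A real system would estimate both densities from labeled data and tune the threshold.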


The Role of Prosodic Boundary Cues in Word Segmentation in Korean

  • Kim, Sa-Hyang
    • 음성과학 (Speech Sciences) / Vol. 13, No. 1 / pp. 29-41 / 2006
  • This study investigates the degree to which various prosodic cues at the boundaries of prosodic phrases in Korean contribute to word segmentation. Since most phonological words in Korean are produced as one Accentual Phrase (AP), it was hypothesized that the detection of acoustic cues at AP boundaries would facilitate word segmentation. The prosodic characteristics of Korean APs include initial strengthening at the beginning of the phrase and pitch rise and final lengthening at the end. A perception experiment utilizing an artificial language learning paradigm revealed that cues conforming to the aforementioned prosodic characteristics of Korean facilitated listeners' word segmentation. Results also indicated that duration and amplitude cues were more helpful in segmentation than pitch. Nevertheless, results did show that a pitch cue that did not conform to the Korean AP interfered with segmentation.


The effects of repeated speech training using speech cues on the percentage of correct consonants and speech intelligibility in children with cerebral palsy: A single-subject design research

  • 서새희;정필연;심현섭
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 13, No. 3 / pp. 79-90 / 2021
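  • This single-subject study examined the effects of repeated training using speech cues on the percentage of correct consonants and speech intelligibility in children with cerebral palsy. Three children with cerebral palsy, aged 5 to 8 years, participated. The intervention was delivered four times a week for one month, for a total of 16 sessions of 30 minutes each. In the training tasks, one- to two-syllable words and two-eojeol sentences containing the target phonemes were practiced using the two speech-cue utterance types, 'Big mouth' and 'Strong voice'. The results were as follows. First, the mean percentage of correct consonants and the mean speech intelligibility increased during the intervention phase for all three children, although effect sizes differed among the children. The effect was larger for speech intelligibility than for the percentage of correct consonants. All three children also showed maintenance effects on the trained items. Second, all three children showed generalization effects on untrained words and sentences. Repeated training using speech cues thus increased the percentage of correct consonants and the speech intelligibility of children with cerebral palsy, and the study confirmed its usefulness as an easier and simpler intervention in clinical practice.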

The Effect of Visual Cues in the Identification of the English Consonants /b/ and /v/ by Native Korean Speakers

  • 김윤현;고성룡
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 4, No. 3 / pp. 25-30 / 2012
  • This study investigated whether native Korean listeners could use visual cues to identify the English consonants /b/ and /v/. Both auditory and audiovisual tokens of minimal word pairs were used, with the target phonemes located in word-initial or word-medial position. Participants were instructed to decide which consonant they heard in $2 \times 2$ conditions: cue (audio-only, audiovisual) and location (word-initial, word-medial). Mean identification scores were significantly higher for the audiovisual than for the audio-only condition, and for the word-initial than for the word-medial condition. In addition, following signal detection theory, sensitivity (d') and response bias (c) were calculated from the hit rates and false alarm rates. These measures showed that the higher identification rate in the audiovisual condition was related to an increase in sensitivity. There were no significant differences in response bias across conditions. This result suggests that native Korean speakers can use visual cues when identifying confusable non-native phonemic contrasts. Visual cues can enhance non-native speech perception.
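The two signal detection measures named in this abstract have standard definitions: d' = z(H) - z(F) and c = -(z(H) + z(F)) / 2, where H is the hit rate, F is the false alarm rate, and z is the inverse of the standard normal cumulative distribution. The sketch below computes them in Python. The function name `sdt_measures` and the example rates are hypothetical and are not taken from the study.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d', c) for one condition under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    need a correction before calling this function.
    """
    z = NormalDist().inv_cdf  # inverse standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa             # sensitivity
    criterion = -0.5 * (z_hit + z_fa)  # response bias
    return d_prime, criterion


# Hypothetical rates for illustration only.
for label, hit, fa in [("audio-only", 0.70, 0.40), ("audiovisual", 0.90, 0.30)]:
    d_prime, criterion = sdt_measures(hit, fa)
    print(f"{label}: d' = {d_prime:.2f}, c = {criterion:.2f}")
```

A larger d' with a similar c across conditions is the pattern the abstract describes: better discrimination without a shift in the tendency to favor one response.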

Perceptual weighting on English lexical stress by Korean learners of English

  • Goun Lee
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 14, No. 4 / pp. 19-24 / 2022
  • This study examined which acoustic cue(s) Korean learners of English give weight to in perceiving English lexical stress. We manipulated segmental and suprasegmental cues in 5 steps in the first and second syllables of the English stress minimal pair "object". A total of 27 subjects (14 native speakers of English and 13 Korean L2 learners) participated in the English stress judgment task. The results revealed that native Korean listeners used the F0 and intensity cues in identifying English stress and weighted vowel quality most strongly, as native English listeners did. These results indicate that Korean learners' experience with these cues in L1 prosody can help them attend to these cues in their L2 perception. However, L2 learners' perceptual attention is not entirely predicted by their linguistic experience with specific acoustic cues in their native language.

How to Express Emotion: Role of Prosody and Voice Quality Parameters

  • 이상민;이호준
    • 한국컴퓨터정보학회논문지 (Journal of the Korea Society of Computer and Information) / Vol. 19, No. 11 / pp. 159-166 / 2014
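  • This paper analyzes the role of the acoustic features expressed through prosody and voice quality when emotion changes the meaning of a word. To this end, we examine how prosody and voice quality change with emotion, using 60 data items produced by 6 speakers in 5 emotional states. Eight acoustic features were analyzed to identify these changes, and the main features expressing each emotional state were analyzed statistically through discriminant analysis. The results showed that anger is closely related to intensity and the bandwidth of the second formant; that happiness is related to the values of the second and third formants and to intensity; that sadness is influenced mainly by intensity and pitch information rather than by voice quality; and that fear is closely related to pitch, the value of the second formant, and its bandwidth. These results are expected to be useful not only for emotional speech recognition systems but also for emotional speech synthesis systems.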

Effects of base token for stimuli manipulation on the perception of Korean stops among native and non-native listeners

  • Oh, Eunjin
    • 말소리와 음성과학 (Phonetics and Speech Sciences) / Vol. 12, No. 1 / pp. 43-50 / 2020
  • This study investigated whether listeners' perceptual patterns varied according to the base token selected for stimulus manipulation. Voice onset time (VOT) and fundamental frequency (F0) values were orthogonally manipulated, each in seven steps, using naturally produced words that contained a lenis (/kan/) and an aspirated (/kʰan/) stop in Seoul Korean. Both native and non-native groups gave significantly more aspirated responses to the stimuli constructed from /kʰan/, evidencing the use of minor cues left in the stimuli after manipulation. For the native group, the use of the VOT and F0 cues in stop categorization did not differ depending on whether the base token included the lenis or the aspirated stop. This indicates that the results of previous studies, which investigated the relative importance of the acoustic cues in native listeners' perception of the Korean stop contrasts using a single base token for stimulus manipulation, remain tenable. For the non-native group, the patterns of F0 cue use differed as a function of the base token selected. Some findings indicated that listeners used alternative cues to identify the stop contrast when the major cues sounded ambiguous. The use of the manipulated VOT and F0 cues by the non-native group was not native-like, suggesting that non-native listeners may have perceived the minor cues as stable in the context of the manipulated cue combinations.

Korean Sentence Comprehension of Korean/English Bilingual Children: Focusing on the Use of Case-Marker, Semantic, and Word-Order Cues

  • 황민아
    • 음성과학 (Speech Sciences) / Vol. 10, No. 4 / pp. 241-254 / 2003
  • The purpose of the present study was to investigate the sentence comprehension strategies used by Korean/English bilingual children when they listened to sentences in their first language, Korean. The framework of the competition model was employed to analyze the influence of the second language, English, during the comprehension of Korean sentences. The participants included 10 bilingual children (ages 7;4-13;0) and 20 Korean-speaking monolingual children (ages 5;7-6;10) whose level of Korean language development was similar to that of the bilingual children. In an act-out procedure, the children were asked to determine the agent in sentences composed of two nouns and a verb, under varying conditions of three cues (case marker, animacy, and word order). The results revealed that both groups of children used the case-marker cue as the strongest of the three. The bilingual children relied on case-marker cues even more than the monolingual children. However, the bilingual children used animacy cues significantly less than the monolingual children. There were no significant differences between the groups in the use of word-order cues. The bilingual children appeared less effective in utilizing animacy cues in Korean sentence comprehension because of backward transfer from English, where the cue strength of animacy is very weak. The influence of the second language on the development of the first language in bilingual children is discussed.


A Study on the Automatic Recognition of Korean Plosives: A Study on the Automatic Discrimination of the Korean Alveolar Stops

  • 최윤석;김기석;황희융
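    • 대한전기학회 (Korean Institute of Electrical Engineers): Conference Proceedings / 1987 Regular General Meeting and 40th Anniversary Conference of the Korean Institute of Electrical Engineers, Society Headquarters / pp. 330-333 / 1987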
  • This paper studies the automatic discrimination of the Korean alveolar stops. In Korean, an automatic speech recognition system needs to discriminate the aspirated/tense plosives, because Korean speakers distinguish aspirated/tense plosive allophones from tense and lax plosives. To find acoustic cues for the automatic recognition of [ㄲ, ㄸ, ㅃ], we carried out experiments on the discrimination of [ㄷ, ㄸ, ㅌ]. As acoustic parameters, we used temporal cues such as VOT and silence duration, and energy cues such as the ratio of high-frequency energy to low-frequency energy. The experiments used VCV speech data, where V is one of the 8 simple vowels and C is one of the 3 alveolar stops. A total of 192 speech tokens were tested, and the recognition rate was about 82%-95%.
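The figure of 192 tokens is consistent with every combination of a first vowel, a stop, and a second vowel (8 × 3 × 8 = 192). The abstract does not state that the set was built this way, and it does not list the vowels, so the short Python sketch below is only one plausible reading: the vowel inventory in `VOWELS` and the assumption that the two vowel positions vary independently are both hypothetical.

```python
from itertools import product

# Assumed inventory of 8 simple vowels; the abstract does not list them.
VOWELS = ["ㅏ", "ㅓ", "ㅗ", "ㅜ", "ㅡ", "ㅣ", "ㅐ", "ㅔ"]
# The three alveolar stops named in the abstract.
STOPS = ["ㄷ", "ㄸ", "ㅌ"]

# Every V1-C-V2 combination, assuming both vowel positions vary independently.
vcv_items = list(product(VOWELS, STOPS, VOWELS))

print(len(vcv_items))  # 8 * 3 * 8 = 192
per_stop = {stop: sum(1 for _, c, _ in vcv_items if c == stop) for stop in STOPS}
print(per_stop)  # 64 items per stop under this assumption
```

Other designs, such as identical vowels on both sides with repeated recordings, could also yield 192 tokens, so the count alone does not settle how the data were constructed.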
