• Title/Summary/Keyword: fundamental frequency of speech


Building a Sentential Model for Automatic Prosody Evaluation

  • 윤규철
    • 말소리와 음성과학 / Vol. 1, No. 4 / pp. 47-59 / 2009
  • The purpose of this paper is to propose an automatic evaluation technique for the prosodic aspect of an English sentence uttered by Korean speakers learning English. The underlying hypothesis is that the consistency of manual prosody scoring is reflected in an abstract space, the prosody evaluation model, constructed from the three physical properties of prosody considered in this paper: the fundamental frequency (F0) contour, the intensity contour, and the segmental durations. The evaluation proceeds first by building a prosody evaluation model for the sentence. To create the model, utterances of the target sentence from native speakers of English and from Korean learners are manually scored for prosody by either native teachers of English or Korean phoneticians. Multiple native utterances are selected on the basis of the manual scoring as the "model" native utterances against which all the other Korean learners' utterances, as well as the model utterances themselves, can be semi-automatically evaluated by comparison in terms of the three prosodic aspects [7]. Each learner utterance, when compared to the multiple model native utterances, produces multiple coordinates in a three-dimensional space of prosody evaluation, each axis of which corresponds to one of the three prosodic aspects. The 3D coordinates from all the comparisons form a prosody evaluation model for the particular sentence, and the associated manual scores can display regions of particular scores. The model can then be used as a predictive model against which other Korean utterances of the target sentence can be evaluated. The model from a Korean phonetician appears to support the hypothesis.


Individual differences in categorical perception: L1 English learners' L2 perception of Korean stops

  • Kong, Eun Jong
    • 말소리와 음성과학 / Vol. 11, No. 4 / pp. 63-70 / 2019
  • This study investigated individual variability in L2 learners' categorical judgments of L2 stops by exploring English learners' perceptual processing of two acoustic cues (voice onset time [VOT] and f0) and working memory capacity as sources of variation. As prior research has reported that English speakers' greater use of the redundant cue f0 was responsible for gradient processing of native stops, we examined whether the same processing characteristics would be observed in L2 learners' perception of Korean stops (/t/-/tʰ/). Twenty-two English learners of L2 Korean with a range of L2 proficiency participated in a visual analogue scaling task and judged the L2 Korean stops in variable ways: some were more gradient than others in performing the task. Correlation analysis revealed that L2 learners' categorical responses were modestly related to individuals' utilization of a primary cue for the stop contrast (VOT for L1 English stops and f0 for L2 Korean stops), and were also related to better working memory capacity. Together, the current experimental evidence demonstrates adult L2 learners' top-down processing of stop consonants, in which linguistic and cognitive resources are devoted to determining abstract phonemic identity.

갑상선 수술범위에 따른 음성의 음향적 분석 (Acoustic Analysis of Voice Change According to Extent of Thyroidectomy)

  • 강영애;구본석
    • 말소리와 음성과학 / Vol. 7, No. 4 / pp. 77-83 / 2015
  • Voice complications can occur after thyroidectomy even without laryngeal nerve injury. The purpose of this study is to investigate voice changes according to the extent of thyroidectomy using acoustic analysis. Thirty-five female patients with papillary thyroid carcinoma underwent voice evaluation before thyroidectomy and at 1 month and 3 months after it. The acoustic analysis parameters were speaking fundamental frequency (SFF), min $F_0$, max $F_0$, dynamic range $F_0$, jitter, shimmer, noise-to-harmonic ratio (NHR), and cepstral peak prominence (CPP). Repeated-measures analysis of variance was applied. Time-related voice changes showed significant differences in all parameters except NHR. At 1 month after surgery, voice quality was worse and pitch had decreased, but voice quality and pitch were improving at the 3-month follow-up. Voice changes according to the extent of surgery were found in SFF, max $F_0$, and dynamic range $F_0$. A time-by-surgery interaction in voice change existed only in min $F_0$. The results showed that the severity of voice complications depended on the extent of thyroidectomy, which had a negative impact on $F_0$-related parameters. The deterioration of voice quality at 1 month after thyroidectomy may be affected by the loss of thyroid hormone in the blood. The decrease in $F_0$-related parameters may be caused by laryngeal fixation due to adhesion at the surgical site.
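
Several of the parameters named in this abstract have simple textbook definitions. The following is a minimal sketch, not the authors' analysis pipeline: it assumes the glottal periods (in seconds) and per-cycle peak amplitudes have already been extracted, and the function names `local_jitter`, `local_shimmer`, and `f0_summary` are hypothetical. It uses the common "local" definitions (mean absolute cycle-to-cycle difference divided by the mean).

```python
from statistics import mean


def _local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def local_jitter(periods_s):
    """Cycle-to-cycle period perturbation (a ratio; multiply by 100 for %)."""
    return _local_perturbation(periods_s)


def local_shimmer(amplitudes):
    """Cycle-to-cycle amplitude perturbation (a ratio; multiply by 100 for %)."""
    return _local_perturbation(amplitudes)


def f0_summary(periods_s):
    """Mean, min, max, and range of F0 in Hz from per-cycle periods."""
    f0 = [1.0 / p for p in periods_s]
    return {
        "mean_f0": mean(f0),
        "min_f0": min(f0),
        "max_f0": max(f0),
        "range_f0": max(f0) - min(f0),
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only, not data from the study.
    periods = [0.0050, 0.0051, 0.0049, 0.0050]
    amps = [0.80, 0.82, 0.79, 0.81]
    print(local_jitter(periods), local_shimmer(amps), f0_summary(periods))
```

Clinical software may apply additional voicing and outlier criteria before computing these ratios, so values from this sketch should not be expected to match a particular tool.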

Effects of base token for stimuli manipulation on the perception of Korean stops among native and non-native listeners

  • Oh, Eunjin
    • 말소리와 음성과학 / Vol. 12, No. 1 / pp. 43-50 / 2020
  • This study investigated whether listeners' perceptual patterns varied according to the base token selected for stimulus manipulation. Voice onset time (VOT) and fundamental frequency (F0) values were orthogonally manipulated, each in seven steps, using naturally produced words that contained a lenis (/kan/) and an aspirated (/kʰan/) stop in Seoul Korean. Both native and non-native groups gave significantly more aspirated responses to the stimuli constructed from /kʰan/, evidencing the use of minor cues left in the stimuli after manipulation. For the native group, the use of the VOT and F0 cues in stop categorization did not differ depending on whether the base token included the lenis or the aspirated stop. This indicates that the results of previous studies, which investigated the relative importance of the acoustic cues in native listeners' perception of the Korean stop contrasts by using a single base token for manipulating perceptual stimuli, remain tenable. For the non-native group, the patterns of use of the F0 cue differed as a function of the base token selected. Some findings indicated that listeners used alternative cues to identify the stop contrast when the major cues sounded ambiguous. The use of the manipulated VOT and F0 cues by the non-native group was not native-like, suggesting that non-native listeners may have perceived the minor cues as stable in the context of the manipulated cue combinations.

Closure Duration and Pitch as Phonetic Cues to Korean Stop Identity in AP-medial Position: Perception Test

  • Kang, Hyun-Sook;Dilley, Laura
    • 음성과학 / Vol. 14, No. 4 / pp. 25-39 / 2007
  • The present study investigated some perceptual phonetic attributes of two Korean stop types, aspirated and lax, in medial position of an accentual phrase. The intonational pattern across syllables (Jun, 1993) is argued to depend on the type of stop (aspirated vs. lax) only in the initial position of an accentual phrase. In Kang & Dilley (2007), we showed that significant differences between aspirated and lax stops in medial position of an accentual phrase exist in closure duration, voice-onset time, and fundamental frequency (F0) values for post-stop vowels. In the present perception experiment, we investigated whether these phonetic attributes contribute to the perception of these two types of stops: The closure durations and/or F0's of post-stop vowels on accentual-phrase medial words were altered and twenty native Korean speakers then judged these words as beginning with an aspirated or lax stop. Both closure duration and F0 significantly affected judgments of stop identity. These results indicate that a wider range of acoustic cues that distinguish aspirated and lax Korean stops in production also plays a role in perception. To account for these results we suggest some phonetic and phonological models of consonant-tone interactions for Korean.


Mieko Han의 한국어 음성학 연구 (Mieko Han and her Works on Korean Phonetics)

  • 고도흥
    • 음성과학 / Vol. 1 / pp. 213-223 / 1997
  • This paper presents a general review of Mieko S. Han, who made a significant contribution to the study of Korean phonetics during the 1960s and early 1970s. As both a single and joint author, Dr. Han published papers notable in both quantity and quality, which Korean phoneticians have continued to cite until today. Before the work of Dr. Han, a professor at USC in the Department of East Asian Languages & Cultures, there were only a few phonetics-related publications in Korea, most of which were papers or books based on a non-experimental traditional approach. Traditionalism and structuralism are known to have coexisted in the field of Korean linguistics. It was, however, fortunate that we had two important phoneticians (M. Han and Chin-W Kim) abroad at that time. Mieko Han's concern was to investigate experimental characteristics of the system of Korean vowels and consonants using a spectrograph, which was the single most important tool for analysing phonetic data at that time. Dr. Han conducted her experimental studies on Korean phonetics, mostly funded by the Office of Naval Research, in terms of duration, fundamental frequency, voice onset time (VOT), intensity, and so on. This paper aims to re-appreciate Dr. Han's specific contribution to the study of Korean phonetics, since she played an important role as a pioneer of early Korean phonetics. Further, Dr. Han's works are highly recommended as a first step for graduate students who seriously wish to specialize in Korean phonetics.


Phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary

  • Yang, Byunggon
    • 말소리와 음성과학 / Vol. 8, No. 2 / pp. 11-16 / 2016
  • This study explores the phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary to provide phoneticians and linguists with fundamental phonetic data on English word components. Entry words in the dictionary file were syllabified using an R script and examined to obtain the following results: First, English words preferred consonants to vowels in their word components. In addition, monophthongs occurred much more frequently than diphthongs. When all consonants were categorized by manner and place, the distribution indicated the frequency order of stops, fricatives, and nasals according to manner and that of alveolars, bilabials and velars according to place. These results were comparable to the results obtained from the Buckeye Corpus (Yang, 2012). Second, from the analysis of syllable structure, two-syllable words were most favored, followed by three- and one-syllable words. Of the words in the dictionary, 92.7% consisted of one, two or three syllables. This result may be related to human memory or decoding time. Third, the English words tended to exhibit discord between onset and coda consonants and between adjacent vowels. Dissimilarity between the last onset and the first coda was found in 93.3% of the syllables, while 91.6% of the adjacent vowels were different. From the results above, the author concludes that an analysis of the phonetic symbols in a dictionary may lead to a deeper understanding of English word structures and components.
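
The kind of tally described in this abstract can be sketched in a few lines. The study itself used an R script that is not shown here; the Python below is only an illustration under stated assumptions. It assumes the usual CMUdict layout (a headword followed by ARPAbet phones, with a stress digit 0, 1, or 2 on every vowel), so the number of stress-marked phones gives the syllable count. The three sample lines and the function names `parse`, `syllable_count`, and `summarize` are hypothetical stand-ins, not the full dictionary or the author's code.

```python
from collections import Counter

# Illustrative lines in CMUdict layout (not the full dictionary file).
SAMPLE = """\
;;; comment lines start with three semicolons
SPEECH  S P IY1 CH
PHONETICS  F AH0 N EH1 T IH0 K S
FREQUENCY  F R IY1 K W AH0 N S IY0
"""


def parse(text):
    """Map each headword to its list of phones, skipping comments and blanks."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(";;;"):
            continue
        word, *phones = line.split()
        entries[word] = phones
    return entries


def is_vowel(phone):
    """Vowels carry a trailing stress digit in this layout."""
    return phone[-1].isdigit()


def syllable_count(phones):
    """One syllable per vowel phone."""
    return sum(is_vowel(p) for p in phones)


def summarize(entries):
    """Count vowel vs. consonant phones and words per syllable count."""
    phone_kinds = Counter()
    syllables = Counter()
    for phones in entries.values():
        for p in phones:
            phone_kinds["vowel" if is_vowel(p) else "consonant"] += 1
        syllables[syllable_count(phones)] += 1
    return phone_kinds, syllables


if __name__ == "__main__":
    kinds, syll = summarize(parse(SAMPLE))
    print(dict(kinds), dict(syll))
```

Splitting words into onset, nucleus, and coda, as the study's syllabification step would require, needs additional rules that this sketch does not attempt.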

성대 기능 훈련이 성대결절 환자의 음성개선에 미치는 효과 (The Effect of Vocal Function Exercise on Voice Improvement in Patients with Vocal Nodules)

  • 임혜진;김정규;권도하;박준영
    • 말소리와 음성과학 / Vol. 1, No. 2 / pp. 37-42 / 2009
  • The purpose of the present study was to determine the effect of the management program known as vocal function exercise (VFE) on voice quality. Typical VFE was modified and applied to patients with vocal nodules by controlling voice intensity and relaxing the vocal folds in order to address hyperfunctional problems in VFE. Eight female subjects aged between 28 and 54 who had been diagnosed with vocal nodules took part in the study. The patients performed VFEs once a week for eight weeks. The vocal function exercises consisted of voice hygiene, respiratory training, phonation training, and glide training. The subjects' voices were analyzed before and after therapy in terms of acoustics, maximum phonation time (MPT), GRBAS, and the voice handicap index (VHI). Among the acoustic parameters, fundamental frequency ($F_0$) increased significantly, shimmer decreased markedly, and noise-to-harmonic ratio (NHR) was clearly lowered. In addition, MPT increased significantly. The GRBAS scale indicated significant improvement in grade, roughness, and strained voice. The VHI indicated significant improvement in the emotional subscale. In conclusion, VFE was effective in improving voice quality for patients with vocal nodules.


내전형 경련성발성장애의 호흡압력과 공기역학적 특징 (The Aerodynamic & Respiratory Muscle Pressure Aspects of Patients with Adductor Spasmodic Dysphonia)

  • 남도현;최성희;최재남;최홍식
    • 음성과학 / Vol. 12, No. 4 / pp. 203-213 / 2005
  • This study was conducted to investigate the respiratory and aerodynamic function of adductor spasmodic dysphonia (ADSD) patients. The participants were (1) 18 female SD patients who had not received Botulinum toxin injection, (2) 14 female SD patients who had been treated with Botulinum toxin injection, and (3) 14 age- and sex-matched normal female controls. A spirometer and a phonatory function analyzer were used to measure respiratory muscle pressure (MIP: maximum inspiratory pressure; MEP: maximum expiratory pressure), MPT (maximum phonation time), and aerodynamic parameters (F0: fundamental frequency; intensity; MFR: mean flow rate; Psub: subglottal pressure). The results were as follows: (1) the normal group was significantly higher in MIP, MEP, and MPT than the two SD groups (p < .05); (2) MPT was significantly lower in the SD group without Botulinum toxin injection than in the SD group with Botulinum toxin treatment experience (p < .05); (3) none of the aerodynamic parameters (F0, intensity, MFR, Psub) differed significantly among the three groups (p > .05). The short MPT in ADSD may reflect a strategy of using lower respiratory pressure than the normal group in order to reduce tremulous voice quality. Moreover, respiratory muscle pressure was lower than in the normal group regardless of Botulinum toxin injection treatment.


L2 Proficiency Effect on the Acoustic Cue-Weighting Pattern by Korean L2 Learners of English: Production and Perception of English Stops

  • Kong, Eun Jong;Yoon, In Hee
    • 말소리와 음성과학 / Vol. 5, No. 4 / pp. 81-90 / 2013
  • This study explored how Korean L2 learners of English utilize multiple acoustic cues (VOT and F0) in perceiving and producing the English alveolar stops with a voicing contrast. Thirty-four 18-year-old high-school students participated in the study. Their English proficiency level was classified as either 'high' (HEP) or 'low' (LEP) according to high-school English level standardization. Thirty different synthesized syllables, created by combining six VOT steps with five F0 steps, were presented as audio stimuli. The listeners judged how close each audio stimulus was to /t/ or /d/ in L2 using a visual analogue scale. The L2 /d/ and /t/ productions collected from 22 of the learners (12 HEP, 10 LEP) were acoustically analyzed by measuring VOT and F0 at the vowel onset. Results showed that LEP listeners attended to the F0 in the stimuli more sensitively than HEP listeners, suggesting that HEP listeners could inhibit less important acoustic dimensions better than LEP listeners in their L2 perception. The L2 production patterns also exhibited a group difference between HEP and LEP in that HEP speakers utilized their VOT dimension (the primary cue in L2) more effectively than LEP speakers. Taken together, the study showed that the relative cue-weighting strategies in L2 perception and production are closely related to the learner's L2 proficiency level, in that more proficient learners had better control over inhibiting and enhancing the relevant acoustic parameters.
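
The stimulus design in this abstract (six VOT steps crossed with five F0 steps, giving thirty syllables) is a full factorial grid. A minimal sketch of building such a grid follows. The abstract does not report the step values, so the endpoints below (10 to 60 ms, 180 to 220 Hz) are hypothetical placeholders, as are the helper names `steps` and `stimulus_grid`.

```python
from itertools import product


def steps(start, stop, n):
    """Return n equally spaced values from start to stop, inclusive."""
    if n < 2:
        raise ValueError("need at least two steps")
    width = (stop - start) / (n - 1)
    return [start + i * width for i in range(n)]


def stimulus_grid(vot_ms, f0_hz):
    """Cross every VOT value with every F0 value (orthogonal manipulation)."""
    return [{"vot_ms": v, "f0_hz": f} for v, f in product(vot_ms, f0_hz)]


# Hypothetical endpoints, chosen only to make the example concrete.
vot_steps = steps(10.0, 60.0, 6)
f0_steps = steps(180.0, 220.0, 5)
grid = stimulus_grid(vot_steps, f0_steps)

if __name__ == "__main__":
    print(len(grid))  # 6 * 5 combinations
    print(grid[0], grid[-1])
```

Because every VOT value is paired with every F0 value, listeners' responses can later be modeled with both cues as independent predictors, which is what makes cue-weighting comparisons between groups possible.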