• Title/Summary/Keyword: Phonetics

A Study of Acoustic Analysis in the Chinese' Korean Language Learners (중국인 한국어 학습자 음성의 음향학적 특성 연구)

  • Kim, Hyun-Ji;You, Jae-Yeon
    • Phonetics and Speech Sciences / v.2 no.3 / pp.75-80 / 2010
  • The present research compared voice characteristics across genders and nationalities by measuring acoustic parameters in Korean and Chinese students. Sound Forge was used to collect voice samples, and Praat was used to measure and analyze jitter, shimmer, NHR, $sF_0$, and pitch range. The results are as follows. First, during vowel prolongation, there was no significant difference in $F_0$ between Korean and Chinese males or between Korean and Chinese females, although Korean males and females had higher $F_0$ values than their Chinese counterparts. Second, during sentence reading, there was no significant difference in $sF_0$ between Korean and Chinese males, but there was a significant difference between the female groups. Third, during sentence reading, the pitch range of Korean males was significantly narrower than that of Korean and Chinese females. Fourth, jitter in the five vowels /a, i, u, e, o/ was higher in Chinese than in Korean subjects, and in the vowels /a, e, u/ it was significantly higher in females than in males. Fifth, shimmer in the vowels /a, e, u/ was significantly higher in Chinese than in Korean subjects. Finally, NHR in the vowels /a, u, o/ was significantly higher in Chinese than in Korean subjects.
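For readers unfamiliar with the perturbation measures named in this abstract, the sketch below illustrates the commonly used "local" definitions of jitter and shimmer: the mean absolute difference between consecutive cycles, divided by the mean value, expressed as a percentage. It is only an illustration under stated assumptions. The function name and the sample values are hypothetical, the cycle periods and amplitudes are assumed to have been extracted already, and this is not the study's procedure or Praat's implementation.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive cycles,
    relative to the mean value, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical per-cycle measurements from a sustained vowel.
periods_ms = [5.00, 5.10, 4.95, 5.05]   # glottal cycle durations -> jitter
amplitudes = [0.80, 0.78, 0.82, 0.79]   # cycle peak amplitudes -> shimmer

jitter_percent = local_perturbation(periods_ms)
shimmer_percent = local_perturbation(amplitudes)
print(f"jitter: {jitter_percent:.2f}%  shimmer: {shimmer_percent:.2f}%")
```

The same function serves both measures because local jitter and local shimmer differ only in what is measured per cycle (period versus amplitude).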

Building a Sentential Model for Automatic Prosody Evaluation

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.1 no.4 / pp.47-59 / 2009
  • The purpose of this paper is to propose an automatic evaluation technique for the prosodic aspect of an English sentence uttered by Korean speakers learning English. The underlying hypothesis is that the consistency of manual prosody scoring is reflected in an imaginary space of a prosody evaluation model constructed from the three physical properties of prosody considered in this paper, namely the fundamental frequency (F0) contour, the intensity contour, and the segmental durations. The evaluation proceeds first by building a prosody evaluation model for the sentence. To create the model, utterances of the target sentence from native speakers of English and from Korean learners are manually scored for prosody by either native teachers of English or Korean phoneticians. Multiple native utterances from the manual scoring are selected as the "model" native utterances against which all the other Korean learners' utterances, as well as the model utterances themselves, can be semi-automatically evaluated by comparison in terms of the three prosodic aspects [7]. Each learner utterance, when compared to the multiple model native utterances, produces multiple coordinates in a three-dimensional space of prosody evaluation, each axis of which corresponds to one of the three prosodic aspects. The 3D coordinates from all the comparisons form a prosody evaluation model for the particular sentence, and the associated manual scores can display regions of particular scores. The model can then be used as a predictive model against which other Korean utterances of the target sentence can be evaluated. The model from a Korean phonetician appears to support the hypothesis.

Glottal Characteristics of Word-initial Vowels in the Prosodic Boundary: Acoustic Correlates (운율경계에 위치한 어두 모음의 성문 특성: 음향적 상관성을 중심으로)

  • Sohn, Hyang-Sook
    • Phonetics and Speech Sciences / v.2 no.3 / pp.47-63 / 2010
  • This study provides a description of the glottal characteristics of the word-initial low vowels /a, $\ae$/ in terms of a set of acoustic parameters and discusses glottal configuration as their acoustic correlates. Furthermore, it examines the effect of prosodic boundary on the glottal properties of the vowels, seeking an account of the possible role of prosodic structure based on prosodic theory. Acoustic parameters reported to indicate glottal characteristics were obtained from the measurements made directly from the speech spectrum on recordings of Korean and English collected from 45 speakers. They consist of two separate groups of native Korean and native English speakers, each including both male and female speakers. Based on the three acoustic parameters of open quotient (OQ), first-formant bandwidth (B1), and spectral tilt (ST), comparisons were made between the speech of males and females, between the speech of native Korean and native English speakers, and between Korean and English produced by native Korean speakers. Acoustic analysis of the experimental data indicates that some or all glottal parameters play a crucial role in differentiating the speech groups, despite substantial interspeaker variations. Statistical analysis of the Korean data indicates prosodic strengthening with respect to the acoustic parameters B1 and OQ, suggesting acoustic enhancement in terms of the degree of glottal abduction and the glottal closure during a vibratory cycle.

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook;Lee, Yeon-Woo;Lee, Seong-Ro
    • Phonetics and Speech Sciences / v.2 no.3 / pp.141-148 / 2010
  • This paper proposes a voice activity detection (VAD) method for dual-channel noisy speech recognition that employs spatial cues. In the proposed method, a probability model for speech presence/absence is constructed using spatial cues obtained from the dual-channel input signal, and speech activity intervals are detected through this probability model. In particular, the spatial cues are composed of interaural time differences and interaural level differences of the dual-channel speech signals, and the probability model for speech presence/absence is based on a Gaussian kernel density. To evaluate the performance of the proposed VAD method, speech recognition is performed on speech segments that include only the speech intervals detected by the proposed method. The performance of the proposed method is compared with that of several methods: an SNR-based method, a direction of arrival (DOA) based method, and a phase vector based method. The speech recognition experiments show that the proposed method outperforms the conventional methods, providing relative word error rate reductions of 11.68%, 41.92%, and 10.15% compared with the SNR-based, DOA-based, and phase vector based methods, respectively.
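The two spatial cues named in this abstract can be illustrated with a minimal sketch, assuming one short frame per channel given as a list of samples. The interaural level difference is taken here as the energy ratio in decibels, and the interaural time difference as the lag that maximizes the cross-correlation between channels. The function names, the sign convention, and the toy signals are assumptions for illustration; the abstract does not specify how the authors computed these cues, and the Gaussian kernel density model is not reproduced here.

```python
import math


def level_difference_db(left, right):
    """Interaural level difference: left/right energy ratio in dB.
    Assumes both frames have non-zero energy."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)


def time_difference_samples(left, right, max_lag):
    """Interaural time difference as the cross-correlation peak lag.
    A positive result means the right channel lags the left channel."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(
            left[i] * right[i + lag]
            for i in range(len(left))
            if 0 <= i + lag < len(right)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy frame: the right channel is a delayed, attenuated copy of the left.
left = [0.0] * 20
right = [0.0] * 20
left[5] = 1.0
right[8] = 0.5

ild_db = level_difference_db(left, right)
itd = time_difference_samples(left, right, max_lag=4)
print(f"ILD: {ild_db:.2f} dB  ITD: {itd} samples")
```

In a statistical VAD of the kind described, cue pairs like these would be the observations scored by the speech presence/absence model, frame by frame.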

A Phonetic Investigation of Korean Monophthongs in the Early Twentieth Century (20세기 초 한국어 단모음의 음향음성학적 연구)

  • Han, Jeong-Im;Kim, Joo-Yeon
    • Phonetics and Speech Sciences / v.6 no.1 / pp.31-38 / 2014
  • The current study presents an instrumental phonetic analysis of Korean monophthongs in early twentieth-century Seoul Korean, based on audio recordings of the elementary school textbook Botonghakgyo Joseoneodokbon (Korean Reading Textbook for Elementary School). The data examined in this study were a list of Korean monosyllables (Banjeol) and a short passage, recorded by one 41-year-old male speaker in 1935, as well as a short passage recorded by one 11-year-old male speaker in 1935. The Korean monophthongs were examined through acoustic analysis of the vowel formants (F1, F2) and compared to those recorded by 18 male speakers of Seoul Korean in 2013. The results show that in 1935, 1) /e/ and /ɛ/ were clearly separated in the vowel space; 2) /o/ and /u/ were also clearly separated without any overlapping values; and 3) some tokens of /y/ and /ø/ were produced as monophthongs, not as diphthongs. Based on these results, we can observe historical changes in the Korean vowels over 80-90 years: 1) /e/ and /ɛ/ have merged; and 2) /o/ has been raised and now overlaps with /u/.

Perceptual Characteristics of Korean Consonants Distorted by the Frequency Band Limitation (주파수 대역 제한에 의한 한국어 자음의 지각 특성 분석)

  • Kim, YeonWhoa;Choi, DaeLim;Lee, Sook-Hyang;Lee, YongJu
    • Phonetics and Speech Sciences / v.6 no.1 / pp.95-101 / 2014
  • This paper investigated the effects of frequency band limitation on the perceptual characteristics of Korean consonants. Monosyllabic speech (144 syllables of CV type, 56 syllables of VC type, 8 syllables of V type) produced by two announcers was low- and high-pass filtered with cutoff frequencies ranging from 300 to 5000 Hz. Six listeners with normal hearing performed a perception test for each filter type and cutoff frequency. We report phoneme recognition rates and types of perception error for band-limited Korean consonants to examine how frequency distortion in the process of speech transmission affects listeners' perception. The results showed that recognition rates varied with the following factors: position in a syllable, manner of articulation, place of articulation, and phonation type. Consonants in the final position were more robust to frequency band limitation than those in the initial position. Fricatives and affricates were more robust than stops. Fortis consonants were less robust than their lenis or aspirated counterparts. Types of perception error also varied depending on factors such as the consonant's place of articulation: bilabial stops were perceived as alveolar stops, while alveolar and velar stops changed in phonation type without any change in place of articulation.

Sensitive Period of Auditory Perception and Linguistic Discrimination

  • Cha, Kyung-Whan;Jo, Hannah
    • Phonetics and Speech Sciences / v.6 no.1 / pp.59-67 / 2014
  • The purpose of this study is to scientifically examine Kuhl's (2011), originally Johnson and Newport's (1989) critical period graph, from a perspective of auditory perception and linguistic discrimination. This study utilizes two types of experiments (auditory perception and linguistic phoneme discrimination) with five different age groups (5 years, 6-8 years, 9-13 years, 15-17 years, and 20-26 years) of Korean English learners. Auditory perception is examined via ultrasonic sounds that are commonly used in the medical field. In addition, each group is measured in terms of their ability to discriminate minimal pairs in Chinese. Since almost all Korean students already have some amount of English exposure, the researchers selected phonemes in Chinese, an unexposed foreign language for all of the subject groups. The results are almost completely in accordance with Kuhl's critical period graph for auditory perception and linguistic discrimination; a sensitive age is found at 8. The results show that the auditory capability of kindergarten children is significantly better than that of other students, measured by their ability to perceive ultrasonic sounds and to distinguish ten minimal pairs in Chinese. This finding strongly implies that human auditory ability is a key factor for the sensitive period of language acquisition.

Korean Intonation Patterns from the Viewpoint of F0 Percentage Change (F0 변화율로 본 한국어 억양 패턴의 음향 특성)

  • Lee, Ji Yeon;Lee, Ho-Young
    • Phonetics and Speech Sciences / v.5 no.1 / pp.123-130 / 2013
  • Previous research on Korean intonation has mainly focused on $F_0$ target frequencies, $F_0$ slope, and the duration of intonation patterns. This study investigated Korean intonation patterns, both boundary and phrasal tones, in relation to the $F_0$ percentage change between pitch targets. We measured the percentage change between the pitch targets of both boundary and phrasal tones. Additionally, we measured the $F_0$ change between the preceding pitch target and the first pitch target of the boundary tone, and the $F_0$ targets of the sequence of two LH phrasal tones ('LH + LH'). Two phrasal tones, LHLH and HLH, were compared with 'LH + LH' and the 'HLH' in the LHLH pattern, respectively. We found that the percentage change between pitch targets in the phrasal tone is fixed to some extent. This helps explain why the slope of the phrasal tone is closely related to the number of syllables and the duration of the phrasal tone, as discussed in previous studies. Since we analyzed the intonation patterns using utterances from a large speech corpus, the results of this paper are expected to be useful in building a larger annotated corpus of Korean.
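The measure at the center of this abstract can be illustrated with a short sketch, assuming the ordinary definition of percentage change relative to the earlier pitch target. The abstract does not give the authors' exact formula, so the function below and the target values in hertz are hypothetical and for illustration only.

```python
def f0_percentage_change(f0_from_hz, f0_to_hz):
    """Percentage change in F0 from one pitch target to the next,
    relative to the earlier target (an assumed definition)."""
    return 100.0 * (f0_to_hz - f0_from_hz) / f0_from_hz


# Hypothetical pitch targets for an L-H-L sequence, in Hz.
targets_hz = [180.0, 225.0, 198.0]

changes = [
    f0_percentage_change(a, b)
    for a, b in zip(targets_hz, targets_hz[1:])
]
for (a, b), change in zip(zip(targets_hz, targets_hz[1:]), changes):
    print(f"{a:.0f} Hz -> {b:.0f} Hz: {change:+.1f}%")
```

A relative measure of this kind is independent of a speaker's overall pitch level, which is one reason it can be compared across speakers where raw target frequencies cannot.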

Prosodic Modifications of the Internal Phonetic Structure of Monosyllabic CVC Words in Conversational Speech

  • Mo, Yoonsook
    • Phonetics and Speech Sciences / v.5 no.1 / pp.99-108 / 2013
  • Previous laboratory studies have shown that prosodic structures are encoded in the modulations of phonetic patterns of speech including suprasegmental as well as segmental features. In particular, effects of prosodic context on duration and intensity of syllables and words have been widely reported. Drawing on prosodically annotated large-scale speech data from the Buckeye corpus of conversational speech of American English, the current study attempted to examine whether and how prosodic prominence and phrase boundary of everyday conversational speech, as determined by a large group of ordinary listeners, are related to the phonetic realization of duration and intensity. The results showed that the patterns of word durations and intensities are influenced by prosodic structure. Closer examinations revealed, however, that the effects of prosodic prominence are not the same as those of prosodic phrase boundary. With regard to intensity measures, the results revealed the systematic changes in the patterns of overall RMS intensity near prosodic phrase boundary but the prominence effects are restricted to the nucleus. In terms of duration measures, both prosodic prominence and phrase boundary are the most closely related to the lengthening of the nucleus. Yet, prosodic prominence is more closely related to the lengthening of the onset while phrase boundary lengthens the coda duration more. The findings from the current study suggest that the phonetic realizations of prosodic prominence are different from those of prosodic phrase boundary, and speakers signal different prosodic structures through deliberate modulations of the internal phonetic structure of words and listeners attend to such phonetic variations.

The Internal Structure of an Identification Function in Korean Lexical Pitch Accent in North Kyungsang Dialect

  • Kim, Jungsun
    • Phonetics and Speech Sciences / v.5 no.1 / pp.91-98 / 2013
  • This paper investigated Korean prosody as it relates to graded internal structure in an identification function. Within Korean prosody, variants regarded as dialectal variations can appear as different prosodic scales, which contain the range of within-category variations. The current experiment was intended to show how the prosodic scale corresponding to the range of within-category differences relates to f0 contours for speakers of two Korean dialects, North Kyungsang and South Cholla. In an identification task, participants responded by selecting one of two answer choices. The probability of choosing the correct response was computed by a logistic regression analysis using intercepts and slopes. That is, responses to the two choices were used to fit an S-shaped response curve. To investigate the graded internal structure of labeling, the 25%, 50%, and 75% points of predicted probability were assessed. Listeners from North Kyungsang showed progressive variations, whereas listeners from South Cholla revealed random patterns in the internal structure of the identification function. The results were plotted as scatterplots, applying the range of within-category variation and the predicted probability obtained from the logistic regression analyses. The scatterplots showed different degrees of response across f0 scales (i.e., variations within categories). The results demonstrate that the gradient structures of native pitch accent users become more progressive in response to f0 scales.
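The relation between a logistic regression's intercept and slope and the 25%, 50%, and 75% points mentioned in this abstract can be sketched as follows. This assumes a single-predictor logistic model, p = 1 / (1 + exp(-(intercept + slope * x))), where x is a step on a stimulus scale. The coefficient values and function names are hypothetical and are not taken from the study.

```python
import math


def predicted_probability(x, intercept, slope):
    """Predicted probability from a single-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def stimulus_at_probability(p, intercept, slope):
    """Invert the logistic model: the stimulus value at which
    the predicted probability equals p (0 < p < 1)."""
    return (math.log(p / (1.0 - p)) - intercept) / slope


# Hypothetical fitted coefficients for one listener.
intercept, slope = -4.0, 2.0

for p in (0.25, 0.50, 0.75):
    x = stimulus_at_probability(p, intercept, slope)
    print(f"p = {p:.2f} at stimulus step {x:.3f}")
```

The distance between the 25% and 75% points shrinks as the slope grows, so that spacing gives one way to describe how sharply or gradually a listener's responses change along the scale.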