• Title/Summary/Keyword: Vowel modification


An Utterance Verification using Vowel String (모음 열을 이용한 발화 검증)

  • 유일수;노용완;홍광석
    • Proceedings of the Korea Institute of Convergence Signal Processing / 2003.06a / pp.46-49 / 2003
  • The use of confidence measures for word/utterance verification has become an essential component of any speech input application. Confidence measures have applications to a number of problems such as rejection of incorrect hypotheses, speaker adaptation, or adaptive modification of the hypothesis score during search in continuous speech recognition. In this paper, we present a new utterance verification method using vowel strings. Using subword HMMs of VCCV units, we create anti-models that include the vowel string of the hypothesis words. The experimental results show that the utterance verification rate of the proposed method is about 79.5%.

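A common way to turn a target model and an anti-model into a confidence measure is a frame-normalized log-likelihood ratio compared against a threshold. The sketch below illustrates that general idea only; the abstract does not state the scoring rule the authors used, and the function name, the normalization by frame count, and the default threshold are all assumptions made for illustration.

```python
def verify_utterance(target_loglik, anti_loglik, num_frames, threshold=0.0):
    """Accept or reject a hypothesis using a normalized log-likelihood ratio.

    target_loglik: log P(O | hypothesis model) for the whole utterance
    anti_loglik:   log P(O | anti-model) for the same observations
    num_frames:    number of acoustic frames, used to normalize for duration
    threshold:     hypothetical decision threshold (would be tuned on held-out data)
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    # Positive values mean the hypothesis model explains the audio better
    # than the anti-model does.
    llr = (target_loglik - anti_loglik) / num_frames
    return llr, llr >= threshold


if __name__ == "__main__":
    score, accepted = verify_utterance(-1200.0, -1260.0, 120)
    print(f"LLR per frame = {score:.3f}, accepted = {accepted}")
```

Normalizing by the number of frames keeps long and short utterances on a comparable scale, which is why a single threshold can be applied to both.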

A Study on Vowel Formant Variation by Vocal Tract Modification (성도 변형에 따른 모음 포먼트의 변화 고찰)

  • Yang, Byung-Gon
    • Speech Sciences / v.3 / pp.83-92 / 1998
  • Vowels are classified by vocal tract shapes. These shapes form constriction points along the tract, which influence vocal tract resonances such as $F_1$, $F_2$, $F_3$, and so on. This study reviews the perturbation theory of the tract and determines the corresponding formant frequencies from modified vocal tracts using the vocal tract area function. Then, formant variation is observed from the theory. Finally, each set of $F_1$, $F_2$, and $F_3$ frequencies is input to speech synthesis software to make a vowel sound. The auditory impression of each sound without any modification of its vocal tract shape is almost the same as the corresponding phonetic symbol. The formant frequencies $F_1$, $F_2$, and $F_3$ vary according to the perturbation theory. Generally, constriction along the node causes formant values to decrease, while constriction along the anti-node causes them to increase. Vocal tracts modified by more than $3\;\mathrm{cm}^2$ change the vowel qualities of /a/ and /i/ into those of /v/ and /$\varepsilon$/, respectively. This study will be helpful in simulating sounds from modified vocal tracts before any operation. Further studies are desirable to compare the vocal tract shapes of various languages together with their sounds.

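Perturbation theory describes how formants shift away from those of an unconstricted tract, so a useful reference point is the uniform tube closed at the glottis and open at the lips, whose resonances are $F_n = (2n-1)c/4L$. The sketch below computes that baseline only; it does not model constrictions or the area function used in the study. The tube length of 17.5 cm and the sound speed of 35,000 cm/s are textbook assumptions, not values taken from the abstract.

```python
def uniform_tube_formants(length_cm=17.5, speed_of_sound_cm_s=35000.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    The default length and sound speed are illustrative assumptions.
    """
    if length_cm <= 0:
        raise ValueError("length_cm must be positive")
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for n in range(1, count + 1)
    ]


if __name__ == "__main__":
    for index, freq in enumerate(uniform_tube_formants(), start=1):
        print(f"F{index} = {freq:.0f} Hz")
```

Shortening the tube raises every resonance proportionally, which gives a quick sanity check before examining the local effects of a constriction.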

English vowel production conditioned by probabilistic accessibility of words: A comparison between L1 and L2 speakers

  • Jonny Jungyun Kim;Mijung Lee
    • Phonetics and Speech Sciences / v.15 no.1 / pp.1-7 / 2023
  • This study investigated the influences of probabilistic accessibility of the word being produced - as determined by its usage frequency and neighborhood density - on native and high-proficiency L2 speakers' realization of six English monophthong vowels. The native group hyperarticulated the vowels over an expanded acoustic space when the vowel occurred in words with low frequency and high density, supporting the claim that vowel forms are modified in accordance with the probabilistic accessibility of words. However, temporal expansion occurred in words with greater accessibility (i.e., with high frequency and low density) as an effect of low phonotactic probability in low-density words, particularly in attended speech. This suggests that temporal modification in the opposite direction may be part of the phonetic characteristics that are enhanced in communicatively driven focus realization. Conversely, none of these spectral and temporal patterns were found in the L2 group, thereby indicating that even the high-proficiency L2 speakers may not have developed experience-based sensitivity to the modulation of sub-categorical phonetic details indexed with word-level probabilistic information. The results are discussed with respect to how phonological representations are shaped in a word-specific manner for the sake of communicatively driven lexical intelligibility, and what factors may contribute to the lack of native-like sensitivity in L2 speech.

AM-FM Decomposition and Estimation of Instantaneous Frequency and Instantaneous Amplitude of Speech Signals for Natural Human-robot Interaction (자연스런 인간-로봇 상호작용을 위한 음성 신호의 AM-FM 성분 분해 및 순간 주파수와 순간 진폭의 추정에 관한 연구)

  • Lee, He-Young
    • Speech Sciences / v.12 no.4 / pp.53-70 / 2005
  • Vowels in speech signals are multicomponent signals composed of AM-FM components whose instantaneous frequency and instantaneous amplitude are time-varying. Changes in emotional state cause variation in the instantaneous frequencies and instantaneous amplitudes of the AM-FM components. Therefore, it is important to estimate the instantaneous frequencies and instantaneous amplitudes of the AM-FM components accurately in order to extract key information representing emotional states and changes in speech signals. In this paper, firstly, a method for decomposing speech signals into AM-FM components is addressed. Secondly, the fundamental frequency of a vowel sound is estimated by a simple method based on the spectrogram. The estimate of the fundamental frequency is used for decomposing speech signals into AM-FM components. Thirdly, an estimation method is suggested for separating the instantaneous frequencies and instantaneous amplitudes of the decomposed AM-FM components, based on the Hilbert transform and the demodulation property of the extended Fourier transform. The estimates of the instantaneous frequencies and instantaneous amplitudes can be used for modification of the spectral distribution and for smooth connection of two words in corpus-based speech synthesis systems.

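For a single AM-FM component that has already been isolated, the Hilbert-transform step can be sketched with an analytic signal: its magnitude gives the instantaneous amplitude and the derivative of its phase gives the instantaneous frequency. The code below is a minimal illustration under that assumption. It does not reproduce the paper's decomposition or its extended Fourier transform demodulation, all function names are hypothetical, and the direct O(N²) DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def dft(values, inverse=False):
    """Direct O(N^2) discrete Fourier transform (fine for short test signals)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(samples):
    """Build the analytic signal by zeroing negative-frequency bins."""
    n = len(samples)
    spectrum = dft(samples)
    weights = [0.0] * n
    weights[0] = 1.0                      # keep DC as is
    upper = n // 2 if n % 2 == 0 else (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0                  # double positive frequencies
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin as is
    return dft([s * w for s, w in zip(spectrum, weights)], inverse=True)


def instantaneous_amplitude_and_frequency(samples, sample_rate):
    """Return (amplitudes, frequencies_hz) for one AM-FM component."""
    z = analytic_signal(samples)
    amplitudes = [abs(v) for v in z]
    frequencies = []
    for current, following in zip(z, z[1:]):
        # Phase advance between neighbouring samples, wrapped to (-pi, pi].
        dphi = cmath.phase(following * current.conjugate())
        frequencies.append(dphi * sample_rate / (2.0 * math.pi))
    return amplitudes, frequencies


if __name__ == "__main__":
    rate = 1000
    tone = [0.8 * math.cos(2 * math.pi * 50 * t / rate) for t in range(200)]
    amp, freq = instantaneous_amplitude_and_frequency(tone, rate)
    print(f"amplitude ~ {amp[100]:.3f}, frequency ~ {freq[100]:.1f} Hz")
```

The estimate is only meaningful for one component at a time, which is why a decomposition step has to come before it.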

Distinct Segmental Implementations in English and Spanish Prosody

  • Lee, Joo-Kyeong
    • Speech Sciences / v.11 no.4 / pp.199-206 / 2004
  • This paper attempts to provide a substantial explanation of different prosodic implementations on segments in English and Spanish, arguing that the phonetic modification invoked by prosody may effectively reflect phonological structure. In English, a high front vowel in accented syllables is acoustically realized with higher F1 and F2 frequencies than in unaccented syllables, due to its more peripheral and sonorous articulation (Harrington et al. 1999). In this paper, an acoustic experiment was conducted to see whether such segmental modification invoked by prosody in English extends to other languages such as Spanish. Results show that relatively more prominent syllables entail higher F1 values as a result of their more sonorous articulation in Spanish, but neither front nor back vowels show a higher or lower F2 frequency. This is interpreted as an indication that a prosodically prominent syllable entails vocalic enhancement in both the horizontal and vertical dimensions of articulation in English. In Spanish, however, only the vertical dimension of articulation is maximized, resulting in a higher F1. I suggest that this difference may be attributed to the different phonological structures of vowels in English and Spanish, and that sonority expansion alone would be sufficient in the articulation of prosodic prominence as long as the phonological distinction of vowels is well retained.


The First Formant Characteristics in Vocalize of One Soprano (소프라노 1인의 모음곡 발성 시 제 1 포먼트의 변화양상)

  • Song, Yun-Kyung;Jin, Sung-Min
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.16 no.1 / pp.10-14 / 2005
  • Background and Objectives: Vowels are characterized on the basis of formant patterns. The first formant (F1) is determined by the high-low placement of the tongue, and the second formant (F2) by its front-back placement. The fundamental frequency (F0) of a soprano often exceeds the normal frequency of the first formant. Vocal intensity is boosted when F0 is high and a harmonic coincides with a formant; this is called formant tuning. Experienced singers have thus learned to tune their formants over a reasonable range by lowering the tongue to maximize their vocal intensity. The current study therefore aimed to identify formant tuning in one experienced soprano by comparing the first formants of the vowel [i] in three types of voice production: speech, ascending scale, and vocalize. Materials and Method: All voice recordings of the vowel [i] in speech, ascending scale (from the F4 note to the A4 note), and vocalize ("Ridente la calma") were made with a digital audio tape recorder in a sound-treated room. The captured data were analyzed by long-term average (LTA) power spectrum using the FFT algorithm of the Computerized Speech Lab (CSL, Kay Elemetrics, Model 4300B). Results: Although the first formant of the vowel [i] in speech was 238 Hz, those of the ascending-scale [i] were 377 Hz, 405 Hz, and 453 Hz at the F4 (349 Hz), G4 (392 Hz), and A4 (440 Hz) notes, respectively, and 722 Hz, 820 Hz, and 918 Hz at the F5 (698 Hz), G5 (784 Hz), and A5 (880 Hz) notes. In vocalize, the first formants of [i] were 380 Hz, 398 Hz, and 453 Hz at the F4, G4, and A4 notes, respectively, and 720 Hz, 821 Hz, and 890 Hz at the F5, G5, and A5 notes. Conclusion: These results showed that the first formant in the ascending scale and in vocalize remained higher than the fundamental frequency at high pitch. This finding implies that the formant tuning of the vowel [i] observed in the ascending scale was also present in vocalize.

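The reported values can be checked with simple arithmetic: for each note, subtract the fundamental frequency from the measured first formant and see whether F1 stays above F0. The sketch below uses only the numbers quoted in the abstract, with the F4 fundamental taken as 349 Hz; the dictionary and function names are hypothetical.

```python
# Fundamental frequencies of the sung notes, as given in the abstract (Hz).
NOTE_F0_HZ = {"F4": 349, "G4": 392, "A4": 440, "F5": 698, "G5": 784, "A5": 880}

# First-formant values of [i] reported for each condition (Hz).
F1_ASCENDING_SCALE_HZ = {"F4": 377, "G4": 405, "A4": 453, "F5": 722, "G5": 820, "A5": 918}
F1_VOCALIZE_HZ = {"F4": 380, "G4": 398, "A4": 453, "F5": 720, "G5": 821, "A5": 890}
F1_SPEECH_HZ = 238


def tuning_margins(f1_by_note):
    """Return F1 minus F0 for every note; positive means F1 sits above F0."""
    return {note: f1_by_note[note] - f0 for note, f0 in NOTE_F0_HZ.items()}


if __name__ == "__main__":
    for label, table in (("scale", F1_ASCENDING_SCALE_HZ), ("vocalize", F1_VOCALIZE_HZ)):
        for note, margin in tuning_margins(table).items():
            print(f"{label:8s} {note}: F1 - F0 = {margin:+d} Hz")
```

Every sung fundamental already exceeds the 238 Hz speech F1, so a positive margin on each note is consistent with the singer raising F1 to track F0.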

C-to-V coarticulation in horizontal and vertical dimensions and its implications for phonology

  • Lee, Joo-Kyeong
    • Speech Sciences / v.7 no.4 / pp.107-121 / 2000
  • In this paper, I investigate the acoustic correlates of a vowel's coarticulatory dynamics manifested in preceding and following consonants along two dimensions of the vocal tract: place of articulation and degree of constriction. Coarticulation along the two dimensions is not necessarily executed either concomitantly or proportionally, and the modification induced by coarticulation with a vowel in CVC structures is restricted to the CV portion; that is, the prevocalic consonant is modified solely in its constriction location. This is consistent with the observation that C-to-V place assimilation does not accompany consonant lenition in phonology, which suggests that phonetic nature is effectively reflected in phonological patterns.


Articulatory modification of /m/ in the coda and the onset as a function of prosodic boundary strength and focus in Korean

  • Kim, Sahyang;Cho, Taehong
    • Phonetics and Speech Sciences / v.6 no.4 / pp.3-15 / 2014
  • An articulatory study (using Electromagnetic Articulography, EMA) was conducted to explore the effects of prosodic boundary strength (Intonational Phrase/IP versus Word/Wd) and focus (Focused/accented, Neutral, Unfocused/unaccented) on the kinematic realization of /m/ in the coda (...am#i...) and the onset (...a#mi...) conditions in Korean. (Here # refers to a prosodic boundary such as an IP or a Wd boundary.) Several important points have emerged. First, the boundary effect on /m/s was most robustly observed in the temporal dimension in both the coda (IP-final) and the onset (IP-initial) conditions, generally in line with cross-linguistically observable boundary-related lengthening patterns. Crucially, however, in contrast with the boundary-related slowing-down effects that have been observed in English, neither the IP-final nor the IP-initial temporal expansion of Korean /m/s was accompanied by an articulatory slowing down. They were, if anything, associated with a faster movement in the lip opening (release) phase (into the vowel). This suggests that the mechanisms underlying boundary-related temporal expansions may differ between languages. Second, the observed boundary-induced strengthening effects (both spatial and temporal expansions, especially on the IP-initial /m/s) were remarkably similar to prominence (focus)-induced strengthening effects, which again runs counter to the phrase-initial strengthening patterns observed in English, in which boundary effects are dissociated from prominence effects. This suggests that initial syllables in Korean may be a common locus for both boundary and prominence marking. These results, taken together, imply that boundary-induced strengthening in Korean is different in nature from that in English, each being modulated by the individual language's prosodic system. Third, the coda and the onset /m/s were found to be produced in a subtly but significantly different way even in a Wd boundary condition, a potentially neutralizing (resyllabification) context. This suggests that although the coda may be phonologically 'resyllabified' into the following syllable in a phrase-medial position, its underlying syllable affiliation is kinematically distinguished from the onset.