• Title/Abstract/Keywords: Articulatory

Search results: 153 items (processing time: 0.025 s)

Patterns of consonant deletion in the word-internal onset position: Evidence from spontaneous Seoul Korean speech

  • Kim, Jungsun; Yun, Weonhee; Kang, Ducksoo
    • 말소리와 음성과학 / Vol. 8, No. 1 / pp. 45-51 / 2016
  • This study examined the deletion of onset consonants in word-internal position in spontaneous Seoul Korean speech. It used a dataset of speakers in their 20s extracted from the Korean Corpus of Spontaneous Speech (Yun et al., 2015). The proportion of deleted word-internal onset consonants was analyzed using a linear mixed-effects regression model. The factors that promoted onset deletion were primarily the type of consonant and its phonetic context. The results showed that onset deletion was more likely to occur for the lenis velar stop [k] than for the other consonants, and, among phonetic contexts, when the preceding vowel was the low central vowel [a]. Moreover, some speakers tended to delete onset consonants (e.g., [k] and [n]) more frequently than other speakers, reflecting individual differences. This study implies that word-internal onsets undergo a process of gradient reduction within individuals' articulatory strategies.
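
A sketch of how such a model could be fit: the paper names a linear mixed-effects regression over deletion data with speaker-level variation, so the minimal analogue below uses statsmodels with toy data and made-up column names (deleted, consonant, prev_vowel, speaker are illustrative, not the corpus's actual fields).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for per-token corpus data.
rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "consonant":  rng.choice(["k", "n", "p"], size=n),
    "prev_vowel": rng.choice(["a", "i", "u"], size=n),
    "speaker":    rng.choice([f"s{i:02d}" for i in range(10)], size=n),
})
# Simulate the reported pattern: [k] and a preceding [a] favor deletion.
p_del = 0.1 + 0.3 * (df.consonant == "k") + 0.2 * (df.prev_vowel == "a")
df["deleted"] = (rng.random(n) < p_del).astype(float)

# Fixed effects for consonant type and preceding vowel; random intercepts
# per speaker capture the individual differences the paper describes.
model = smf.mixedlm("deleted ~ consonant + prev_vowel", df, groups=df["speaker"])
print(model.fit().summary())
```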

청각장애자를 위한 원격조음훈련시스템의 개발 (Development of a Remote Articulation Training System for the Deaf)

  • 이재혁; 유선국; 박상희
    • 대한후두음성언어의학회지 / Vol. 7, No. 1 / pp. 43-49 / 1996
  • In this study, a remote articulation training system that connects a hearing-impaired trainee and a speech therapist via B-ISDN is introduced. A hearing-impaired person has no auditory feedback on his own pronunciation, so the chance to watch the movement trajectories of his speech organs enables self-training of articulation. The system thus serves two purposes: self-directed articulation training and the trainer's on-line checking from a remote location. We estimate vocal tract articulatory movements from the speech signal using inverse modeling and graphically display the movement trajectory on a side view of the human face. The trainee's articulation trajectories are displayed along with reference trajectories, so the trainee can adjust his articulation to make the two overlap. For on-line communication and checking of training records, the system provides video conferencing and transfer of articulatory data.
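
The abstract's feedback loop (estimate articulator trajectories by inverse modeling, then let the trainee align them with a reference) suggests a simple numeric core. A minimal sketch of the comparison step only, assuming trajectories are already estimated as (x, y) sequences on the face side view (function and variable names are hypothetical):

```python
import numpy as np

def trajectory_deviation(trainee, reference):
    """Mean Euclidean distance between trainee and reference articulator
    trajectories, both of shape (n_frames, 2); 0 means perfect overlap."""
    return float(np.mean(np.linalg.norm(trainee - reference, axis=1)))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
reference = np.stack([t, np.sin(2 * np.pi * t)], axis=1)  # therapist's model
trainee = reference + rng.normal(scale=0.05, size=reference.shape)
print(f"deviation: {trajectory_deviation(trainee, reference):.3f}")
```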

영어학습자의 영어 치찰음 지각과 발성에 관한 연구 (A Study of Perception and Production of English Sibilants by Korean Learners of English)

  • 구희산
    • 음성과학 / Vol. 13, No. 4 / pp. 43-50 / 2006
  • The aim of this study was to identify the pronunciation difficulties of Korean learners of English in their articulation of the English sibilants /dʒ, ʒ, z/. Forty-five syllables were produced five times each by twelve college students. Test scores were measured from the scoreboard of FluSpeak, a speech-training software program designed for English pronunciation practice and improvement. Results show that 1) the subjects had lower scores in producing /ʒ/ than /dʒ/ and /z/ in all positions, and 2) the subjects had lower scores in intervocalic position than in pre-vocalic and post-vocalic positions when producing /dʒ/, /ʒ/, and /z/. The results suggest that, on the whole, Korean learners have considerable difficulty producing /ʒ/, and that they have more auditory and articulatory problems in intervocalic position than in the other positions when producing these sibilants.
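
The reported comparison is mean scores broken down by sibilant and syllable position; a minimal sketch with pandas, using invented numbers that merely mimic the reported pattern (lowest for /ʒ/ and in intervocalic position):

```python
import pandas as pd

scores = pd.DataFrame({
    "phoneme":  ["dʒ", "ʒ", "z"] * 3,
    "position": (["pre-vocalic"] * 3 + ["inter-vocalic"] * 3
                 + ["post-vocalic"] * 3),
    "score":    [82, 64, 78, 71, 55, 69, 80, 61, 75],  # made-up means
})

# Mean score per phoneme and position, as a phoneme x position table.
print(scores.pivot_table(index="phoneme", columns="position", values="score"))
```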

조음 음성 합성기에서 버퍼 재정렬을 이용한 연속음 구현 (Implementation of Continuous Utterance Using Buffer Rearrangement for an Articulatory Synthesizer)

  • 이희승; 정명진
    • 대한전기학회:학술대회논문집 / 대한전기학회 2002년도 하계학술대회 논문집 D / pp. 2454-2456 / 2002
  • Since articulatory synthesis models the human vocal organs as precisely as possible, it is potentially the most desirable method for producing various words and languages. This paper proposes a new type of articulatory synthesizer using the Mermelstein vocal tract model and a Kelly-Lochbaum digital filter. Previous research has assumed that the length of the vocal tract, and hence the number of its cross sections, does not vary during an utterance. Under this assumption, however, continuous utterances cannot easily be implemented. In this paper, that limitation is overcome by "buffer rearrangement" for a dynamic vocal tract.
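
The technical crux is keeping the synthesis loop alive while the number of tube sections changes. Below is a minimal sketch of Kelly-Lochbaum reflection coefficients plus one plausible reading of "buffer rearrangement" as resampling the traveling-wave buffers onto the new section count; this is my interpretation of the term, not the paper's code.

```python
import numpy as np

def reflection_coeffs(areas):
    """Kelly-Lochbaum reflection coefficients between adjacent tube sections."""
    return (areas[1:] - areas[:-1]) / (areas[1:] + areas[:-1])

def rearrange(buf, new_len):
    """Resample a traveling-wave buffer when the section count changes,
    so synthesis can continue across a change in vocal-tract geometry."""
    old_x = np.linspace(0.0, 1.0, len(buf))
    new_x = np.linspace(0.0, 1.0, new_len)
    return np.interp(new_x, old_x, buf)

areas = np.array([2.0, 1.5, 1.0, 1.8, 2.5])  # toy area function (cm^2)
fwd = np.zeros(len(areas))                    # forward traveling wave
fwd = rearrange(fwd, 7)                       # tract re-modeled with 7 sections
print(reflection_coeffs(areas), len(fwd))
```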

Explaining Phonetic Variation of Consonants in Vocalic Context

  • Oh, Eu-Jin
    • 음성과학 / Vol. 8, No. 3 / pp. 31-41 / 2001
  • This paper aims to provide preliminary evidence that (at least part of) phonetic phenomena are neither simply automatic nor arbitrary, but are explained by two functional guidelines: ease of articulation and maintenance of contrasts. The first study shows that languages with more high vowels (e.g., French) allow larger consonantal deviation from the target than languages with fewer high vowels (e.g., English). This is interpreted as achieving a certain economy of articulation, avoiding the otherwise extreme articulatory movements that the strict demand to maintain vocalic contrasts would force in CV syllables. The second study shows that the Russian plain bilabial consonant allows less undershoot from neighboring vowels than does the English bilabial consonant, probably because of the stricter demand to maintain the plain vs. palatalized consonantal contrast, which exists only in Russian.
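
"Deviation from target" and "undershoot" are measurable quantities; one common operationalization, sketched below, is the mean distance between a consonant's observed F2 at release and a context-free target across vowel contexts (the target value and all numbers are invented, not the paper's measurements):

```python
import numpy as np

def mean_undershoot(observed_f2, target_f2):
    """Mean absolute deviation (Hz) of observed consonantal F2 from a
    context-free target, across vowel contexts."""
    return float(np.mean(np.abs(np.asarray(observed_f2) - target_f2)))

# Toy F2 values (Hz) at bilabial release before /i, e, a, o, u/.
english_b = [1800, 1650, 1400, 1150, 1000]
russian_b = [1650, 1580, 1450, 1300, 1250]
target = 1500  # hypothetical context-neutral F2 target

print("English:", mean_undershoot(english_b, target))  # larger deviation
print("Russian:", mean_undershoot(russian_b, target))  # smaller deviation
```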

Algorithm for Concatenating Multiple Phonemic Units for Small Size Korean TTS Using RE-PSOLA Method

  • Bak, Il-Suh; Jo, Cheol-Woo
    • 음성과학 / Vol. 10, No. 1 / pp. 85-94 / 2003
  • In this paper, an algorithm to reduce the size of a text-to-speech database is proposed. The algorithm is based on the characteristics of Korean phonemic units. From the initial database, a reduced phoneme-unit set is induced according to the articulatory similarity of the concatenating phonemes. Speech data were recorded by one female announcer reading 1,000 phonetically balanced sentences, and all of the recorded speech was then segmented by phoneticians. The total size of the original speech data is about 640 MB, including the laryngograph signal. Waveforms were synthesized using RE-PSOLA (Residual-Excited Pitch-Synchronous Overlap-and-Add). The voice quality of the synthesized speech was compared with the original speech in terms of spectrographic information and objective tests. The quality of the synthesized speech was not much degraded when the synthesis DB was reduced from 320 MB to 82 MB.
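
The reduction step, collapsing concatenation units whose phonemes are articulatorily similar so one stored unit can represent a group, might look like the following sketch; the feature table and grouping rule are illustrative, not the paper's actual unit set.

```python
# Map each phoneme to coarse articulatory features; units whose boundary
# phonemes share a feature key are merged into one stored representative.
FEATURES = {
    "p": ("bilabial", "stop"),      "b": ("bilabial", "stop"),
    "t": ("alveolar", "stop"),      "d": ("alveolar", "stop"),
    "s": ("alveolar", "fricative"), "z": ("alveolar", "fricative"),
}

def reduce_units(units):
    """Keep one unit per articulatory-feature key of its right-context phoneme."""
    kept = {}
    for unit, right_phoneme in units:
        key = FEATURES[right_phoneme]
        kept.setdefault(key, unit)  # first unit seen represents the group
    return list(kept.values())

units = [("a-p", "p"), ("a-b", "b"), ("a-t", "t"), ("a-s", "s"), ("a-z", "z")]
print(reduce_units(units))          # ['a-p', 'a-t', 'a-s']
```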

독일어 감정음성에서 추출한 포먼트의 분석 및 감정인식 시스템과 음성인식 시스템에 대한 음향적 의미 (An Analysis of Formants Extracted from Emotional Speech and Acoustical Implications for the Emotion Recognition System and Speech Recognition System)

  • 이서배
    • 말소리와 음성과학 / Vol. 3, No. 1 / pp. 45-50 / 2011
  • The formant structure of speech associated with five emotions (anger, fear, happiness, neutral, sadness) was analyzed. The acoustic separability of the vowels (or emotions) associated with a specific emotion (or vowel) was estimated using the F-ratio. According to the results, neutral showed the highest separability of vowels, followed by anger, happiness, fear, and sadness in descending order. The vowel /A/ showed the highest separability of emotions, followed by /U/, /O/, /I/, and /E/ in descending order. The acoustic results were interpreted in the context of previous articulatory and perceptual studies, and suggestions were made for improving the performance of automatic emotion recognition and speech recognition systems.
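
The F-ratio used as the separability measure is the one-way ANOVA ratio of between-class to within-class variance; a minimal sketch over invented formant values grouped by emotion:

```python
import numpy as np

def f_ratio(groups):
    """One-way ANOVA F-ratio: between-class variance over within-class
    variance. `groups` is a list of 1-D arrays, one per class."""
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), len(all_values)
    between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum(((g - g.mean()) ** 2).sum() for g in groups) / (n - k)
    return between / within

# Toy F1 values (Hz) of vowel /A/ under three emotions; a higher F-ratio
# means the emotions are more acoustically separable on this formant.
anger   = np.array([780, 800, 820, 810])
neutral = np.array([700, 710, 695, 705])
sadness = np.array([660, 655, 670, 650])
print(f"F-ratio: {f_ratio([anger, neutral, sadness]):.1f}")
```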

Effects of gender, age, and individual speakers on articulation rate in Seoul Korean spontaneous speech

  • Kim, Jungsun
    • 말소리와 음성과학 / Vol. 10, No. 4 / pp. 19-29 / 2018
  • The present study investigated whether articulation rate differs by gender, age, and individual speaker in a spontaneous speech corpus produced by 40 Seoul Korean speakers. Articulation rates were measured with a seconds-per-syllable metric and a syllables-per-second metric. The findings are as follows. First, in spontaneous Seoul Korean speech, there was a gender difference in articulation rate only in the 10-19 age group, in which men tended to speak faster than women. Second, individual speakers varied in their articulation rates, and the tendency for particular speakers to speak faster than others was itself variable. Finally, the two metrics behaved differently: the coefficients of variation of the seconds-per-syllable metric were much higher than those of the syllables-per-second metric, and individual speakers were more clearly distinguished by the syllables-per-second metric. These results imply that a corpus of spontaneous Seoul Korean speech reflects speaker-specific differences in articulatory movements.
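
The two metrics and the coefficient-of-variation comparison are easy to state exactly; a minimal sketch with invented utterance durations and syllable counts:

```python
import numpy as np

durations = np.array([1.8, 2.4, 0.9, 3.1])  # utterance durations (s), toy data
syllables = np.array([12, 15, 5, 22])       # syllable counts per utterance

sec_per_syl = durations / syllables         # seconds-per-syllable metric
syl_per_sec = syllables / durations         # syllables-per-second metric

def cv(x):
    """Coefficient of variation: standard deviation over mean."""
    return x.std() / x.mean()

# The paper reports higher CVs for seconds-per-syllable than syllables-per-second.
print(f"CV sec/syl: {cv(sec_per_syl):.3f}, CV syl/sec: {cv(syl_per_sec):.3f}")
```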

Diagnosing Vocal Disorders using Cobweb Clustering of the Jitter, Shimmer, and Harmonics-to-Noise Ratio

  • Lee, Keonsoo; Moon, Chanki; Nam, Yunyoung
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 11 / pp. 5541-5554 / 2018
  • The voice is one of the most significant non-verbal elements of communication. Disorders of the vocal organs, or habitual muscular settings for articulation, cause vocal disorders; therefore, by analyzing vocal disorders it is possible to predict vocal diseases. In this paper, a method for predicting vocal disorders using the jitter, shimmer, and harmonics-to-noise ratio (HNR) extracted from voice recordings is proposed. To extract jitter, shimmer, and HNR, one-second voice signals were recorded at 44.1 kHz. In the experiment, 151 voice recordings were collected and clustered using the Cobweb clustering method, yielding 21 classes with 12 leaves. Following the semantics of jitter, shimmer, and HNR, the class whose centroid has the lowest jitter and shimmer and the highest HNR becomes the normal voice group. The risk of vocal disorders can then be predicted by measuring the distance and direction between the centroids.
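
Jitter, shimmer, and HNR have standard textbook formulations; the sketch below assumes pitch periods and cycle peak amplitudes have already been extracted from the one-second recordings, which is not necessarily the paper's exact extraction procedure.

```python
import numpy as np

def jitter(periods):
    """Local jitter: mean absolute difference of consecutive pitch periods,
    relative to the mean period."""
    periods = np.asarray(periods, dtype=float)
    return np.mean(np.abs(np.diff(periods))) / periods.mean()

def shimmer(amplitudes):
    """Local shimmer: same form as jitter, over cycle peak amplitudes."""
    amps = np.asarray(amplitudes, dtype=float)
    return np.mean(np.abs(np.diff(amps))) / amps.mean()

def hnr_db(r_max):
    """Harmonics-to-noise ratio (dB) from the normalized autocorrelation
    peak r_max at the pitch period (Boersma-style formulation)."""
    return 10.0 * np.log10(r_max / (1.0 - r_max))

periods = [0.0102, 0.0100, 0.0101, 0.0103, 0.0099]  # toy period values (s)
amps    = [0.82, 0.80, 0.83, 0.79, 0.81]            # toy peak amplitudes
print(jitter(periods), shimmer(amps), hnr_db(0.95))
```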

Feature Extraction Based on DBN-SVM for Tone Recognition

  • Chao, Hao; Song, Cheng; Lu, Bao-Yun; Liu, Yong-Li
    • Journal of Information Processing Systems / Vol. 15, No. 1 / pp. 91-99 / 2019
  • In this paper, an innovative tone-modeling framework based on deep neural networks is proposed for tone recognition. In the framework, prosodic features and articulatory features are first extracted as the raw input data. A five-layer deep belief network then derives high-level tone features, and finally a support vector machine is trained to recognize tones. The 863 corpus was used in the experiments, and the results show that the proposed method significantly improves recognition accuracy for all tone patterns. The average tone recognition rate reached 83.03%, which is 8.61% higher than that of the original method.
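
A DBN front end feeding an SVM can be approximated in scikit-learn by stacking RBMs in a pipeline (fitting them layer by layer is exactly greedy unsupervised pretraining); this is a simplified two-layer stand-in for the paper's five-layer network, with placeholder feature dimensions and random toy data.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy stand-in for prosodic + articulatory feature vectors, scaled to [0, 1]
# as BernoulliRBM expects; labels stand in for the four Mandarin tones.
rng = np.random.default_rng(0)
X = rng.random((200, 30))
y = rng.integers(1, 5, size=200)

model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20,
                          random_state=0)),
    ("rbm2", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20,
                          random_state=0)),
    ("svm", SVC(kernel="rbf")),  # SVM classifies the learned features
])
model.fit(X, y)
print("train accuracy:", model.score(X, y))
```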