• Title/Summary/Keyword: vowel quality


An Algorithm on Predicting Syllable Numbers of English Disyllabic Loanwords in Korean (영어 2음절 차용어의 음절수 예측 알고리즘)

  • Cho, Mi-Hui
    • The Journal of the Korea Contents Association / v.8 no.3 / pp.264-269 / 2008
  • When English disyllabic words are borrowed into the Korean language, the loanwords tend to have extra syllables. The purpose of this paper is to find the syllable increase conditions in loanword adaptation and further to provide an algorithm to predict the syllable numbers of English disyllabic loanwords. There are three syllable augmentation conditions. The presence of diphthongs and the existence of consonant clusters guarantee an increase in the number of syllables in the English loanwords. Further, the quality of the final consonant (and the preceding vowel) sometimes triggers an increase in the number of syllables. Based on these conditions, an algorithm composed of 4 rules is proposed in order to predict the number of syllables in English disyllabic loanwords.

Multiple Average Ratings of Auditory Perceptual Analysis for Dysphonia

  • Choi, Seong-Hee;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.1 no.4 / pp.165-170 / 2009
  • This study compared single ratings with average ratings from multiple presentations of the same stimulus for measuring the voice quality of dysphonia using a 7-point equal-appearing interval (EAI) rating scale. Overall severity of voice quality for 46 /a/ vowel stimuli (23 from speakers with dysphonia, 23 from controls) was rated by 3 experienced speech-language pathologists (averaged 19 years; range = 7 to 40 years). For average ratings, each stimulus was rated five times in random order, and the ratings were averaged over two to five presentations. Although inter-rater reliability was higher for average ratings than for single ratings, there were no significant differences in rating scores between single and multiple average ratings judged by experienced listeners, suggesting that auditory perceptual ratings by well-trained listeners agree relatively well across repeated judgments of the same stimulus. Larger variations in perceptual ratings were observed for moderate voices than for mild or severe voices, even in the average ratings.

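The averaging of repeated ratings described in the abstract above can be illustrated with a minimal sketch. It assumes that an "average rating over k presentations" means the mean of the first k ratings of one stimulus; the abstract does not state how presentations were selected, and the rating values below are hypothetical.

```python
from statistics import mean


def average_rating(ratings, k):
    """Mean of the first k ratings of one stimulus (assumed selection rule)."""
    if not 1 <= k <= len(ratings):
        raise ValueError("k must be between 1 and the number of ratings")
    return mean(ratings[:k])


# Hypothetical 7-point EAI ratings of one stimulus across five presentations.
ratings = [4, 5, 4, 3, 4]

for k in range(1, 6):
    # k = 1 is the single rating; k = 2..5 are the multiple average ratings.
    print(k, average_rating(ratings, k))
```

Comparing the k = 1 value with the k = 2 to 5 values for every stimulus corresponds to the single-versus-average comparison reported in the abstract.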

Voice quality distinctions of the three-way stop contrast under prosodic strengthening in Korean

  • Jiyoung Jang;Sahyang Kim;Taehong Cho
    • Phonetics and Speech Sciences / v.16 no.1 / pp.17-24 / 2024
  • The Korean three-way stop contrast (lenis, aspirated, fortis) is currently undergoing a sound change, such that the primary cue distinguishing lenis and aspirated stops is shifting from voice onset time (VOT) to F0. Despite recent discussions of this shift, research on voice quality, traditionally considered an additional cue signaling the contrast, remains sparse. This study investigated the extent to which the associated voice quality [as reflected in the acoustic measurements of H1*-H2*, H1*-A1*, and cepstral peak prominence (CPP)] contributes to the three-way stop contrast, and how the realization is conditioned by prominence- vs. boundary-induced prosodic strengthening amid the ongoing sound change. Results for 12 native Korean speakers indicate that there was a substantial distinction in voice quality among the three stop categories with the breathiness of the vowel being the greatest after the lenis, intermediate after the aspirated, and least after the fortis stops, indicating the role of voice quality in the maintenance of the three-way stop contrast. Furthermore, prosodic strengthening has different effects on the contrast and contributes to the enhancement of the phonological contrast contingent on whether it is induced by prominence or boundary.

The effect of voice quality on speech intelligibility in children with spastic cerebral palsy (경직형 뇌성마비 아동의 음질이 말명료도에 미치는 영향)

  • Jeong, Pil Yeon;Sim, Hyun Sub
    • Phonetics and Speech Sciences / v.9 no.4 / pp.129-136 / 2017
  • This study investigates the effect of voice quality on speech intelligibility and the relationship between voice quality and intelligibility for children with spastic CP. We recruited 36 children with spastic CP (mean age 10.43 years, 17 girls, 19 boys, spastic type 34, mixed 2) from a special school and a rehabilitation hospital. Voice samples for the perceptual analysis of voice quality were extracted from a sustained vowel /a/ and were rated on the GRBAS scales by two experienced speech-language pathologists. Ten adult subjects with no hearing problems evaluated speech intelligibility for the 37 words listed in the Assessment of Phonology and Articulation for Children on a 7-point interval scale. The children with spastic CP were divided into three groups according to the rated G scores on the GRBAS scales (G1(n)=10, G2(n)=13, G3(n)=13). ANCOVA and Pearson correlation analyses showed that there was a significant difference in speech intelligibility among the three groups. There were also significant correlations between speech intelligibility and the G scale (grade), A scale (asthenia), and B scale (breathy) scores. These findings suggest that poor speech intelligibility in spastic CP might be related to asthenia and breathiness. In speech therapy, vocal intensity should be increased and vocal functioning should be improved in order to improve the speech intelligibility of children with spastic CP.

Spectral and Cepstral Analyses of Esophageal Speakers (식도발성화자 음성의 spectral & cepstral 분석)

  • Shim, Hee-Jeong;Jang, Hyo-Ryung;Shin, Hee-Baek;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.6 no.2 / pp.47-54 / 2014
  • The purpose of this study was to analyze spectral versus cepstral measurements in esophageal speakers. Measurements from thirteen male esophageal speakers were compared with those from a control group of thirteen normal speakers using the sustained vowel /a/. The main results can be summarized as follows: (a) the CPP and L/H ratio of the esophageal group were significantly lower than those of the control group; (b) the CPP was significantly correlated with the spectral parameters such as jitter, shimmer, NHR and VTI; and (c) the ROC analysis showed that a threshold of 10.25 dB for the CPP achieved a good classification for esophageal speakers, with 100% sensitivity and specificity. Thus, cepstral-based acoustic measures such as CPP may be more reliable predictors than other spectral-based acoustic measures such as jitter and shimmer. Cepstral-based acoustic measures were also found to be effective in distinguishing esophageal voice quality from normal voice quality. This research will contribute to establishing a baseline related to speech characteristics in voice rehabilitation with laryngectomees.
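The threshold-based classification reported in result (c) can be sketched as follows. Only the 10.25 dB cutoff comes from the abstract. The decision rule (a CPP below the cutoff is labeled esophageal, consistent with the lower CPP reported for that group), the function name, and the CPP values are assumptions made for illustration.

```python
def sensitivity_specificity(cpp_esophageal, cpp_control, threshold_db=10.25):
    """Sensitivity and specificity of the rule 'CPP < threshold => esophageal'.

    The strict '<' comparison is an assumption; the abstract gives only the cutoff.
    """
    true_pos = sum(1 for v in cpp_esophageal if v < threshold_db)
    true_neg = sum(1 for v in cpp_control if v >= threshold_db)
    return true_pos / len(cpp_esophageal), true_neg / len(cpp_control)


# Hypothetical CPP values in dB, not data from the study.
esophageal = [6.1, 7.8, 9.9]
control = [11.2, 12.5, 14.0]

print(sensitivity_specificity(esophageal, control))
```

With fully separated groups, as in these made-up values, both figures come out as 1.0, which is the situation the abstract describes as 100% sensitivity and specificity.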

Two Cases Using the Praat-Based Automatic Voice Analysis Program as an Alternative to CSL (사례 적용 Praat 기반 CSL 대체 자동화 음성분석 프로그램)

  • Kang, Young Ae;Chang, Jae Won;Koo, Bon Seok
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.32 no.2 / pp.87-93 / 2021
  • There are a number of voice analysis programs around the world. Domestic voice analysis relies heavily on a specific commercial program. We intend to develop code for voice analysis using Praat and apply it to clinical practice. This study consisted of Experiment 1 and Experiment 2. Experiment 1 was the development of automated voice analysis code based on Praat. The code was largely divided into recording, analysis, and storage sections. In Experiment 2, this code was applied to the voice analysis of 2 male patients before and after surgery. The code provided 26 parameters for vowel /a/, nine parameters for sentence analysis, and a total of 4 parameters for voice range profile analysis. In the two male patients, the pitch and the intensity increased, the voice quality improved, and the sentence length decreased after surgery. The code worked well and produced its output in real time. It is automated as much as possible to prevent manual errors and increases convenience and efficiency by generating the result sheet in real time.

The Error Pattern Analysis of the HMM-Based Automatic Phoneme Segmentation (HMM기반 자동음소분할기의 음소분할 오류 유형 분석)

  • Kim Min-Je;Lee Jung-Chul;Kim Jong-Jin
    • The Journal of the Acoustical Society of Korea / v.25 no.5 / pp.213-221 / 2006
  • Phone segmentation of the speech waveform is especially important for concatenative text-to-speech synthesis, which uses segmented corpora for the construction of synthetic units, because the quality of synthesized speech depends critically on the accuracy of the segmentation. In the beginning, phone segmentation was performed manually, but this requires great effort and causes long delays. HMM-based approaches adopted from automatic speech recognition are most widely used for automatic segmentation in speech synthesis, providing a consistent and accurate phone labeling scheme. Even though the HMM-based approach has been successful, it may locate a phone boundary at a different position than expected. In this paper, we categorized adjacent phoneme pairs and analyzed the mismatches between hand-labeled transcriptions and HMM-based labels. Then we described the dominant error patterns that must be improved for speech synthesis. For the experiment, a hand-labeled standard Korean speech DB from ETRI was used as a reference DB. A time difference larger than 20 ms between the hand-labeled phoneme boundary and the auto-aligned boundary is treated as an automatic segmentation error. Our experimental results for a female speaker revealed that plosive-vowel, affricate-vowel and vowel-liquid pairs showed high accuracies of 99%, 99.5% and 99%, respectively, but stop-nasal, stop-liquid and nasal-liquid pairs showed very low accuracies of 45%, 50% and 55%. The results for a male speaker revealed a similar tendency.
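The evaluation criterion in the abstract above (a boundary counts as an error when it differs from the hand label by more than 20 ms, with accuracy reported per phoneme-pair class) can be sketched as follows. The 20 ms tolerance is from the abstract. The function name, the tuple layout, and the sample boundaries are hypothetical and are not taken from the ETRI database.

```python
from collections import defaultdict

TOLERANCE_MS = 20  # from the abstract: differences larger than 20 ms are errors


def accuracy_by_pair(boundaries, tolerance_ms=TOLERANCE_MS):
    """Per-class boundary accuracy.

    boundaries: iterable of (pair_class, hand_label_ms, auto_label_ms).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for pair_class, hand_ms, auto_ms in boundaries:
        totals[pair_class] += 1
        # A boundary is correct when the deviation does not exceed the tolerance.
        if abs(hand_ms - auto_ms) <= tolerance_ms:
            correct[pair_class] += 1
    return {pair: correct[pair] / totals[pair] for pair in totals}


# Hypothetical boundaries in milliseconds.
sample = [
    ("plosive-vowel", 120, 125),
    ("plosive-vowel", 340, 338),
    ("stop-nasal", 510, 545),
    ("stop-nasal", 760, 770),
]

print(accuracy_by_pair(sample))
```

Times are kept as integer milliseconds so that a deviation of exactly 20 ms is compared without floating-point rounding.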

Prevalence of Voice Disorders and Characteristics of Korean Voice Handicap Index in the Elderly (노인 음성장애 출현율 및 음성장애지수 특성)

  • Song, Yun-Kyung
    • Phonetics and Speech Sciences / v.4 no.3 / pp.151-159 / 2012
  • The purpose of this study is to evaluate the prevalence of voice disorders and the Korean voice handicap index in the elderly. For this study, 169 elderly participants completed two types of questionnaires and performed vowel /a/ prolongation. Self-reported voice symptoms and the Korean voice handicap index were analyzed, and acoustic voice evaluation was performed with MDVP. The results showed that the self-reported prevalence of voice disorders in the elderly is significantly higher than that in adults. In the acoustic evaluation, 32.2% of the male elderly and 40.9% of the female elderly exceeded the thresholds of Jitter (%), Shimmer (%) and NHR. In addition, the Korean voice handicap index scores of the female elderly are significantly higher than those of female adults. These findings indicate the high frequency of voice disorders in the elderly and the need to focus on this group. Additional studies on the voice-related quality of life of the elderly are needed.

Implementation of Korean TTS System based on Natural Language Processing (자연어 처리 기반 한국어 TTS 시스템 구현)

  • Kim Byeongchang;Lee Gary Geunbae
    • MALSORI / no.46 / pp.51-64 / 2003
  • In order to produce high-quality synthesized speech, it is very important to obtain an accurate grapheme-to-phoneme conversion and prosody model from texts using natural language processing. Robust preprocessing for non-Korean characters is also required. In this paper, we analyzed Korean texts using a morphological analyzer, part-of-speech tagger and syntactic chunker. We present a new grapheme-to-phoneme conversion method for Korean using a hybrid method with a phonetic pattern dictionary and CCV (consonant vowel) LTS (letter to sound) rules, for unlimited vocabulary Korean TTS. We constructed a prosody model using a probabilistic method and a decision tree-based method. The probabilistic method alone usually suffers from performance degradation due to inherent data sparseness problems, so we adopted tree-based error correction to overcome these training data limitations.


Effect of Glottal Wave Shape on the Vowel Phoneme Synthesis (성문파형이 모음음소합성에 미치는 영향)

  • 안점영;김명기
    • The Journal of Korean Institute of Communications and Information Sciences / v.10 no.4 / pp.159-167 / 1985
  • It was demonstrated that the glottal waves differ depending on the vowel by deriving the glottal waves directly from the Korean vowels /a, e, i, o, u/, which were recorded by a male speaker. After resynthesizing the vowels with five simulated glottal waves, the effects of glottal wave shape on speech synthesis were compared in terms of waveform. Some changes could be seen in the waveforms of the synthetic vowels with variation of the shape, opening time and closing time; therefore, it was confirmed that in speech synthesis the glottal wave shape is an important factor in the improvement of speech quality.
