• Title/Summary/Keyword: Low Vowel

Search Result 105, Processing Time 0.026 seconds

Acoustic Analysis and Auditory-Perceptual Assessment for Diagnosis of Functional Dysphonia (기능성 음성장애의 진단을 위한 음향학적, 청지각적 평가)

  • Kim, Geun-Hyo;Lee, Yeon-Yoo;Bae, In-Ho;Lee, Jae-Seok;Lee, Chang-Yoon;Park, Hee-June;Lee, Byung-Joo;Kwon, Soon-Bok
    • Journal of Clinical Otolaryngology Head and Neck Surgery / v.29 no.2 / pp.212-222 / 2018
  • Background and Objectives: The purpose of this study was to compare acoustic and auditory-perceptual measures between normal and functional dysphonia (FD) groups. Materials and Methods: A total of 102 subjects with FD and 59 subjects with normal voices participated in this study. The mid-vowel portion of the sustained vowel /a/ and two sentences of 'Sanchaek' were edited, concatenated, and analyzed with a Praat script. Auditory-perceptual (AP) ratings were then completed by three listeners. Results: The FD group showed higher values than the normal group for the acoustic voice quality index versions 2.02 and 3.01 (AVQIv2 and AVQIv3), slope, Hammarberg index (HAM), grade (G), and overall severity (OS). In addition, smoothed cepstral peak prominence in Praat (PraatCPPS), tilt, the ratio of low-to-high spectral band energies (L/H ratio), and long-term average spectrum (LTAS) were lower in the FD group than in the normal voice group. Correlations among the measured values ranged from -0.250 to 0.960. In ROC curve analysis, the cutoff values of AVQIv2, AVQIv3, PraatCPPS, slope, tilt, L/H ratio, HAM, and LTAS were 3.270, 2.013, 13.838, -22.286, -9.754, 369.043, 27.912, and 34.523, respectively. The AUC was over 0.890 for AVQIv2, AVQIv3, and PraatCPPS; over 0.731 for HAM, tilt, and slope; and over 0.605 for LTAS and L/H ratio. Conclusions: AVQI and CPPS showed the highest predictive power for distinguishing between the normal and FD groups. Acoustic analyses and AP ratings, as noninvasive examinations, can strengthen screening for FD and help establish an efficient diagnosis and treatment plan for FD.
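The abstract reports ROC-derived cutoff values but does not say how the cutoffs were chosen. A common choice is the threshold that maximizes Youden's J (sensitivity + specificity - 1). The sketch below illustrates that approach under this assumption; the function name `youden_cutoff` and the sample values are hypothetical and are not data from the study. It also assumes that higher scores indicate dysphonia, which would need to be reversed for measures such as CPPS that are lower in the FD group.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    scores: measured values (assumption: higher = more likely dysphonic)
    labels: 1 for the disorder group, 0 for the normal group
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for cutoff in sorted(set(scores)):
        # Classify as positive when the score is at or above the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Strict comparison keeps the lowest cutoff when several tie.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical index values for illustration only.
scores = [1.2, 1.9, 2.4, 2.8, 3.1, 3.6, 4.0, 4.7, 5.3, 6.1]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(scores, labels)
print(cutoff, sens, spec)
```

With real data, every candidate threshold trades sensitivity against specificity, and the AUC summarizes that trade-off over all thresholds.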

A study on the clinical utility of voiced sentences in acoustic analysis for pathological voice evaluation (장애음성의 음향학적 분석에서 유성음 문장의 임상적 유용성에 관한 연구)

  • Ji-sung Kim
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.298-303 / 2023
  • This study aimed to investigate the clinical utility of voiced sentence tasks for voice evaluation. To this end, we analyzed the correlation between perturbation-based acoustic measurements [jitter percent (jitter), shimmer percent (shimmer), noise-to-harmonic ratio (NHR)] obtained from sustained vowel phonation and cepstrum-based acoustic measurements [cepstral peak prominence (CPP), low/high spectral ratio (L/H ratio)] obtained from voiced sentences. In data collected from 65 patients with voice disorders, CPP correlated significantly with jitter (r = -.624, p = .000), shimmer (r = -.530, p = .000), and NHR (r = -.469, p = .000). This suggests that cepstral measurement of voiced sentences can serve as an alternative that addresses limitations in analyzing pathological voices, such as cases in which perturbation-based acoustic measurement is not possible and results that differ depending on the analyzed segment.
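The correlations reported above are of the kind produced by a Pearson correlation coefficient, although the abstract does not name the statistic. As a minimal sketch under that assumption, the coefficient can be computed as follows; the helper `pearson_r` and the sample values are hypothetical and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measurements: CPP (dB) from sentences, jitter (%) from vowels.
cpp = [12.1, 10.4, 9.8, 8.7, 7.9, 6.5]
jitter = [0.6, 0.9, 1.1, 1.8, 2.0, 3.1]
r = pearson_r(cpp, jitter)
print(round(r, 3))
```

A negative coefficient, as in the abstract, means that higher CPP goes with lower perturbation values.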

Speech Stimuli on the Diagnostic Evaluation of Speech with Cleft Lip and Palate : Clinical Use and Literature Review (구개열 환자 말 평가 시 검사어에 대한 고찰 : 임상현장의 말 평가 어음자료와 문헌적 고찰을 중심으로)

  • Choi, Seong-Hee;Choi, Jae-Nam;Nam, Do-Hyun;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.16 no.1 / pp.33-48 / 2005
  • Differential diagnosis of articulation and resonance problems in cleft lip and palate speech requires evaluating the various factors that contribute to speech problems, such as VPI, dental occlusion, palatal fistulae, and learning. However, the validity of speech stimuli for accurately evaluating each problem in cleft speech remains an open issue. This study investigated the speech stimuli used in clinical settings and reviewed the literature and articles published from 1990 to 2005 to help develop standardized speech samples. The results yielded the following recommendations for properly evaluating velopharyngeal function in a diagnostic evaluation: 1) To identify hypernasality, the speech stimuli should include low-pressure consonants to eliminate the effects of nasal emission and compensatory articulation. 2) Speech stimuli should consist of visible, front sounds to eliminate compensatory articulation and to be easy to elicit. 3) For early diagnosis and treatment, speech stimuli need to be developed for infants and preschoolers. 4) Stimulus length for nasalance scores should be at least 6 syllables. 5) Regarding the phonetic context for nasalance scores, the vowel /i/ should be taken into consideration, excluding paragraphs. 6) Connected speech stimuli should be developed for evaluating intelligibility and VP function.

Sentence design for speech recognition database

  • Zu Yiqing
    • Proceedings of the KSPS conference / 1996.10a / pp.472-472 / 1996
  • The material of a database for speech recognition should include as many phonetic phenomena as possible. At the same time, such material should be phonetically compact with low redundancy [1, 2]. Phonetic phenomena in continuous speech are the key problem in speech recognition. This paper describes the processing of a set of sentences collected from the 1993 and 1994 database of "People's Daily" (a Chinese newspaper), which consists of news, politics, economics, arts, sports, etc. Those sentences include both phonetic phenomena and sentence patterns. In continuous speech, phonemes always appear in the form of allophones, which result from coarticulatory effects. The task of designing a speech database should be concerned with both intra-syllabic and inter-syllabic allophone structures. In our experiments, there are 404 syllables, 415 inter-syllabic diphones, 3050 merged inter-syllabic triphones and 2161 merged final-initial structures in read speech. Statistics on the "People's Daily" database give an evaluation of all of the possible phonetic structures. In this sentence set, we first consider the phonetic balance among syllables, inter-syllabic diphones, inter-syllabic triphones and semi-syllables with their junctures. The syllabic balance ensures coverage of intra-syllabic phenomena such as phonemes, initial/final and consonant/vowel; the rest describes the inter-syllabic juncture. The 1560 sentences cover 96% of syllables without tones (the absent syllables are only used in spoken language), 100% of inter-syllabic diphones, and 67% of inter-syllabic triphones (87% of which appear in "People's Daily"). There are roughly 17 kinds of sentence patterns in our sentence set. By taking the transitions between syllables into account, Chinese speech recognition systems have achieved significantly high recognition rates [3, 4]. The following figure shows the process of collecting sentences.
[People's Daily database] -> [segmentation of sentences] -> [segmentation of word groups] -> [translation of the text into Pinyin] -> [phonetic statistics and selection of useful paragraphs] -> [manual modification of the selected sentences] -> [phonetically compact sentence set]
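The abstract does not state how the compact sentence set was selected beyond the pipeline above. One common way to build a phonetically compact set is greedy set cover, sketched below purely as an illustration. The functions `diphones` and `select_sentences` are hypothetical, and inter-syllabic diphones are approximated here as pairs of adjacent syllables rather than final-initial phoneme pairs.

```python
def diphones(syllables):
    """Approximate inter-syllabic units as pairs of adjacent syllables."""
    return {(a, b) for a, b in zip(syllables, syllables[1:])}


def select_sentences(sentences):
    """Greedy set cover: repeatedly pick the sentence adding the most new units."""
    target = set()
    for sentence in sentences:
        target |= diphones(sentence)
    covered, chosen = set(), []
    remaining = list(range(len(sentences)))
    while covered != target:
        # max() returns the first index with the largest gain.
        best = max(remaining, key=lambda i: len(diphones(sentences[i]) - covered))
        chosen.append(best)
        covered |= diphones(sentences[best])
        remaining.remove(best)
    return chosen


# Hypothetical Pinyin-transcribed sentences (tones omitted).
sentences = [
    ["ren", "min", "ri", "bao"],
    ["ren", "min"],
    ["ri", "bao", "xin", "wen"],
    ["min", "ri"],
]
chosen = select_sentences(sentences)
print(chosen)
```

Here two of the four sentences already cover every adjacent-syllable pair, which is the sense in which a selected set is compact with low redundancy.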

The Acquisition and Description of Voiceless Stops of Spanish and English

  • Marie Fellbaum
    • Proceedings of the KSPS conference / 1996.10a / pp.274-274 / 1996
  • This paper presents preliminary results from work in progress on a paired study of the acquisition of voiceless stops by Spanish speakers learning English and American English speakers learning Spanish. The hypothesis, following Eckman's Markedness Differential Hypothesis, was that the American speakers would have no difficulty suppressing aspiration in Spanish unaspirated stops, whereas the Spanish speakers would have difficulty acquiring the aspiration necessary for English voiceless stops. The null hypothesis was proved. All subjects were given the same set of disyllabic real words of English and Spanish in carrier phrases. The tokens analyzed in this report are limited to word-initial voiceless stops followed by a low back vowel in stressed syllables. Tokens were randomized and then arranged in a list with the words appearing three separate times. Aspiration was measured from the burst to the onset of voicing (VOT). The first language (L1) tokens and second language (L2) tokens were compared for each speaker and between the two groups of language speakers. Results indicate that the Spanish speakers, as a group, were able to reach the accepted target language VOT of English, but the English speakers were not able to reach the accepted range for Spanish, in spite of statistically significant changes (p<.001) by speakers in both groups of learners. A closer analysis of the speech samples revealed wide variability within the speech of native speakers of English. Variability in English is due not only to the wide range of VOT (120 msec for English labials, for example); individual speakers also showed different patterns. These results are revealing for the demands placed on experimental designs and for the number of speakers and tokens required for an adequate description of different languages. In addition, a simple report of means will not distinguish the speakers and their respective language learning situations; measurements must also include the range of acceptability of VOT for phonetic segments. This has immediate consequences for the learning and teaching of foreign languages involving aspirated stops. In addition, the labelling of spoken language in speech technology is shown to be inadequate without a fuller mathematical description.
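The point that means alone cannot distinguish speakers can be illustrated with a small sketch. The helper `vot_summary` and the VOT values below are hypothetical and are not taken from the study; they show two speakers with the same mean VOT but very different ranges.

```python
from statistics import mean


def vot_summary(vot_ms):
    """Summarize voice onset times (ms) by mean, extremes, and range."""
    return {
        "mean": mean(vot_ms),
        "min": min(vot_ms),
        "max": max(vot_ms),
        "range": max(vot_ms) - min(vot_ms),
    }


# Hypothetical VOT measurements in milliseconds for two speakers.
speakers = {
    "speaker_a": [55, 60, 65, 60],
    "speaker_b": [20, 40, 80, 100],
}
summaries = {name: vot_summary(values) for name, values in speakers.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Reporting the range next to the mean makes the difference between the two speakers visible, which a mean-only table would hide.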

A Robust Pattern-based Feature Extraction Method for Sentiment Categorization of Korean Customer Reviews (강건한 한국어 상품평의 감정 분류를 위한 패턴 기반 자질 추출 방법)

  • Shin, Jun-Soo;Kim, Hark-Soo
    • Journal of KIISE:Software and Applications / v.37 no.12 / pp.946-950 / 2010
  • Many sentiment categorization systems based on machine learning methods use morphological analyzers to extract linguistic features from sentences. However, morphological analyzers generally do not perform well in the customer review domain because online customer reviews include many spacing and spelling errors. The low performance of these underlying analyzers degrades the performance of the sentiment categorization systems. To resolve this problem, we propose a feature extraction method based on simple longest matching of Eojeol (a Korean spacing unit) and phoneme patterns. The two kinds of patterns are automatically constructed from a large POS (part-of-speech) tagged corpus. Eojeol patterns consist of Eojeols that include content words such as nouns and verbs. Phoneme patterns consist of leading consonant and vowel pairs of predicate words such as verbs and adjectives, because spelling errors seldom occur in leading consonants and vowels. To evaluate the proposed method, we implemented a sentiment categorization system using an SVM (Support Vector Machine) as a machine learner. In the experiment with Korean customer reviews, the sentiment categorization system using the proposed method outperformed one using a morphological analyzer as a feature extractor.
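The longest-matching idea can be sketched as a left-to-right scan that, at each position, emits the longest entry found in a pattern dictionary. This is a simplified illustration, not the authors' implementation: the function `longest_match_features` is hypothetical, the pattern set is invented, and matching is done on raw characters rather than on the Eojeol and phoneme patterns the abstract describes.

```python
def longest_match_features(text, patterns):
    """Scan left to right, emitting the longest known pattern at each position."""
    max_len = max(len(p) for p in patterns)
    features = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in patterns:
                features.append(candidate)
                i += length
                break
        else:
            i += 1  # no pattern starts here; skip one character
    return features


# Hypothetical pattern dictionary and a review written without spaces.
patterns = {"배송", "빠르", "빠르다", "좋"}
features = longest_match_features("배송빠르다좋아요", patterns)
print(features)
```

Because the scan does not rely on spaces, it still finds patterns in text with spacing errors, which is the robustness property the abstract emphasizes.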

$F_2$ Formant Frequency Characteristics of the Aging Male and Female Speakers (한국어 모음에서 연령증가에 따른 제2음형대의 변화양상)

  • 김찬우;차흥억;장일환;김선태;오승철;석윤식;이영숙
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.10 no.2 / pp.119-123 / 1999
  • Background and Objectives: Conditions such as muscle atrophy, stretching of strap muscles, and continued craniofacial growth have been cited as contributing to the changes observed in vocal tract structure and function in elderly speakers. The purpose of the present study is to compare $F_1$ and $F_2$ frequency levels in elderly and young adult male and female speakers producing a series of vowels ranging from high-front to low-back placement. Material and Methods: The subjects were two groups of young adults (10 males, 10 females; mean age 21 years, range 19-24 years) and two groups of elderly speakers (10 males, 10 females; mean age 67 years, range 60-84 years). Each subject was judged by a speech pathologist to be a speaker of unimpaired standard Korean. The headphone was positioned 2 cm from the speaker's lips. Each speaker sustained the five vowels for 5 s. Formant frequency measures were obtained from linear predictive coding analysis in CSL model 4300B (Kay co). Results: Repeated-measures ANOVA procedures were completed on the $F_1$ and $F_2$ data for the male and female speakers. $F_2$ formant frequency levels were significantly lower for elderly speakers. Conclusions: We presume that the $F_2$ vocal cavity (from the point of tongue constriction to the lips) lengthens in elderly speakers. Research designed to observe dynamic speech production more directly will be needed.

Normalized gestural overlap measures and spatial properties of lingual movements in Korean non-assimilating contexts

  • Son, Minjung
    • Phonetics and Speech Sciences / v.11 no.3 / pp.31-38 / 2019
  • The current electromagnetic articulography study analyzes several articulatory measures and examines whether, and if so, how they are interconnected, with a focus on cluster types and an additional consideration of speech rates and morphosyntactic contexts. Using articulatory data on non-assimilating contexts from three Seoul-Korean speakers, we examine how speaker-dependent gestural overlap between C1 and C2 in a low vowel context (/a/-to-/a/) and their resulting intergestural coordination are realized. Examining three C1C2 sequences (/k(#)t/, /k(#)p/, and /p(#)t/), we found that three normalized gestural overlap measures (movement onset lag, constriction onset lag, and constriction plateau lag) were correlated with one another for all speakers. Limiting the scope of analysis to C1 velar stop (/k(#)t/ and /k(#)p/), the results are recapitulated as follows. First, for two speakers (K1 and K3), i) longer normalized constriction plateau lags (i.e., less gestural overlap) were observed in the pre-/t/ context, compared to the pre-/p/ (/k(#)t/>/k(#)p/), ii) the tongue dorsum at the constriction offset of C1 in the pre-/t/ contexts was more anterior, and iii) these two variables are correlated. Second, the three speakers consistently showed greater horizontal distance between the vertical tongue dorsum and the vertical tongue tip position in /k(#)t/ sequences when it was measured at the time of constriction onset of C2 (/k(#)t/>/k(#)p/): the tongue tip completed its constriction onset by extending further forward in the pre-/t/ contexts than the uncontrolled tongue tip articulator in the pre-/p/ contexts (/k(#)t/>/k(#)p/). Finally, most speakers demonstrated less variability in the horizontal distance of the lingual-lingual sequences, which were taken as the active articulators (/k(#)t/=/k(#)p/ for K1; /k(#)t/

Acoustic characteristics of speech-language pathologists related to their subjective vocal fatigue (언어재활사의 주관적 음성피로도와 관련된 음향적 특성)

  • Jeon, Hyewon;Kim, Jiyoun;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.14 no.3 / pp.87-101 / 2022
  • In addition to administering a questionnaire (J-survey) on subjective vocal fatigue, voice samples were collected before and after speech-language pathology sessions from 50 female speech-language pathologists in their 20s and 30s in the Daejeon and Chungnam areas. We identified significant differences in Korean Vocal Fatigue Index scores between the fatigue and non-fatigue groups, with the most prominent differences in sections one and two. Regarding acoustic phonetic characteristics, both groups showed a pattern in which low-frequency band energy was relatively low and high-frequency band energy increased after the treatment sessions. This trend was well reflected in the low-to-high ratio of vowels, slope LTAS, energy in the third formant, and energy in the 4,000-8,000 Hz range. A difference between the groups was observed only in the vowel energy of the low-frequency band (0-4,000 Hz) before treatment, with the non-fatigue group having a higher value than the fatigue group. This characteristic could be interpreted as a result of voice abuse and higher muscle tonus caused by long-term voice work. The perturbation parameter shimmer local decreased in the non-fatigue group after treatment, and the noise-to-harmonics ratio (NHR) decreased in both groups following treatment. The decrease in NHR and the fall in shimmer local could be attributed to vocal cord hypertension, but the effective voice use of speech-language pathologists may also have contributed to this effect, especially in the non-fatigue group. In the non-fatigue group, the rahmonics-to-noise ratio increased significantly after treatment, indicating that the harmonic structure was more stable after treatment.
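A low-to-high spectral energy ratio of the kind mentioned above can be sketched as the energy below a split frequency divided by the energy above it, expressed in dB. The code below is an illustration only: the function `low_high_ratio_db`, the 4,000 Hz split (chosen to mirror the bands named in the abstract), the 16 kHz sampling rate, and the synthetic two-tone signal are all assumptions, and a naive DFT is used instead of the analysis software the authors used.

```python
import cmath
import math


def low_high_ratio_db(samples, sample_rate, split_hz=4000.0):
    """Energy below split_hz over energy at or above it, in dB (naive DFT)."""
    n = len(samples)
    low = high = 0.0
    for k in range(n // 2 + 1):
        # DFT coefficient for bin k, whose center frequency is k * sample_rate / n.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        energy = abs(coeff) ** 2
        if k * sample_rate / n < split_hz:
            low += energy
        else:
            high += energy
    return 10 * math.log10(low / high)


# Synthetic signal: a strong 500 Hz tone plus a weak 6,000 Hz tone.
sample_rate = 16000
n = 256
signal = [
    math.sin(2 * math.pi * 500 * t / sample_rate)
    + 0.1 * math.sin(2 * math.pi * 6000 * t / sample_rate)
    for t in range(n)
]
ratio_db = low_high_ratio_db(signal, sample_rate)
print(round(ratio_db, 2))
```

A shift of energy from the low band to the high band, as described for the post-session recordings, would lower this ratio. The sketch assumes the high band contains some energy; a silent high band would need separate handling.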

Effects of low-dose topiramate on language function in children with migraine

  • Han, Seung-A;Yang, Eu Jeen;Kong, Younghwa;Joo, Chan-Uhng;Kim, Sun Jun
    • Clinical and Experimental Pediatrics / v.60 no.7 / pp.227-231 / 2017
  • Purpose: This study aimed to verify the safety of low-dose topiramate on language development in pediatric patients with migraine. Methods: Thirty newly diagnosed pediatric patients with migraine who needed topiramate were enrolled and assessed twice with standard language tests, including the Test of Language Problem Solving Abilities (TOPs), Receptive and Expressive Vocabulary Test, Urimal Test of Articulation and Phonology, and computerized speech laboratory analysis. Data were collected before treatment, and topiramate as monotherapy was sustained for at least 3 months. The mean follow-up period was $4.3{\pm}2.7$ months. The mean topiramate dosage was 0.9 mg/kg/day. Results: The patients' mean age was $144.1{\pm}42.3$ months (male-to-female ratio, 9:21). None of the language parameters of the TOPs changed significantly after the topiramate treatment: determining cause, from $15.0{\pm}4.4$ to $15.4{\pm}4.8$ (P>0.05); making inference, from $17.6{\pm}5.6$ to $17.5{\pm}6.6$ (P>0.05); predicting, from $11.5{\pm}4.5$ to $12.3{\pm}4.0$ (P>0.05); and total TOPs score, from $44.1{\pm}13.4$ to $45.3{\pm}13.6$ (P>0.05). The total mean length of utterance in words during the test decreased from $44.1{\pm}13.4$ to $45.3{\pm}13.6$ (P<0.05). The Receptive and Expressive Vocabulary Test results changed from $97.7{\pm}22.1$ to $96.3{\pm}19.9$ months and from $81.8{\pm}23.4$ to $82.3{\pm}25.4$ months, respectively (P>0.05). In the articulation and phonology validation in both groups, speech pitch and energy were not significant, and all the vowel test results showed no other significant values. Conclusion: No significant difference was found in the language-speaking ability of the patients; however, the number of words used decreased. Therefore, topiramate should be used cautiously for children with migraine.