• Title/Summary/Keyword: Clear speech

Differences in GRBAS scales and shimmer according to vocal sample types in people with vocal disorders (음성장애와 샘플유형에 따른 GRBAS 측정치 및 shimmer 비교)

  • Shin, Yu-Jeong;Hong, Ki-Hwan;Sim, Hyun-Sub
    • Phonetics and Speech Sciences / v.3 no.3 / pp.149-155 / 2011
  • The purpose of the present study was to identify differences in GRBAS scales between vocal sample types (sustained vowels and connected speech) for specific laryngeal conditions (vocal nodules, vocal polyps, and vocal paralysis), and the relations between the GRBAS scale and the Shimmer value in each vocal sample type. A total of 60 voice samples from 30 patients (10 with vocal nodules, 10 with vocal polyps, 10 with vocal paralysis) were examined, and MDVP (Multi-Dimensional Voice Program) was used to analyze the Shimmer value. Three listeners rated the two types of samples, presented in random order, on the GRBAS scale. Three-way ANOVA, one-way ANOVA, and paired t-tests were used. The outcomes of the study were as follows. 1) GRBAS scales varied across vocal sample types: listeners tended to rate voice quality as better when they listened to connected speech rather than sustained vowels. 2) The G score of GRBAS and Shimmer were positively correlated, with statistical significance. These results show that 1) voice specialists should consider the sample type when evaluating the severity of a voice problem and 2) the G score could be a simple and clear method.
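
For readers unfamiliar with the Shimmer measure mentioned above, the following is a minimal sketch of one common definition, local shimmer: the mean absolute difference between the peak amplitudes of consecutive glottal cycles, divided by the mean amplitude. It is an illustration only and is not taken from the paper; MDVP's exact computation may differ, and the function name and the amplitude values are hypothetical.

```python
def local_shimmer_percent(peak_amplitudes):
    """Local shimmer in percent.

    Mean absolute difference between consecutive cycle peak amplitudes,
    divided by the mean peak amplitude. This is a common textbook
    definition, assumed here; it is not MDVP's documented algorithm.
    """
    if len(peak_amplitudes) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(peak_amplitudes, peak_amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(peak_amplitudes) / len(peak_amplitudes)
    return 100.0 * mean_diff / mean_amp


# Hypothetical per-cycle peak amplitudes, for illustration only.
steady_voice = [1.00, 1.00, 1.00, 1.00]
unsteady_voice = [1.00, 0.80, 1.10, 0.90]

print(local_shimmer_percent(steady_voice))
print(local_shimmer_percent(unsteady_voice))
```

A perfectly steady amplitude sequence gives a shimmer of zero, and larger cycle-to-cycle amplitude variation gives a larger value.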

Classification of Sasang Constitution Taeumin by Comparative of Speech Signals Analysis (음성 분석 정보값 비교를 통한 사상체질 태음인의 분류)

  • Kim, Bong-Hyun;Lee, Se-Hwan;Cho, Dong-Uk
    • The KIPS Transactions:PartB / v.15B no.1 / pp.17-24 / 2008
  • This paper proposes a method for classifying Sasang constitution by comparing values obtained from speech signal analysis. As the first step in an overall system intended to provide an objective index of Sasang constitution, we propose a method for classifying Taeumin from the output values of speech signal analysis, to be connected with the classification of Soeumin through skin diagnosis. First, we extract the phonetic elements of the voice that show clear features for each Sasang constitution group. We then classify Taeumin based on the differences and similarities between constitution groups in the resulting values. Finally, the effectiveness of the method is verified through experiments.

Gender difference in the sound change of lexical pitch accents of South Kyungsang Korean

  • Lee, Hyunjung
    • Phonetics and Speech Sciences / v.7 no.4 / pp.123-130 / 2015
  • Given a recent finding that female speakers of South Kyungsang Korean are undergoing a sound change in the lexical pitch accent, this study tested whether the change is also reflected in male speech. The study compared F0 scaling and timing properties of accent words produced by younger female and male speakers of South Kyungsang Korean. The results indicated clear gender-related differences, with male productions showing more distinct acoustic properties across the accent words than female productions. Despite the better distinction, however, younger male speakers showed peak delay, with F0 peaks located further to the right than in conservative speakers' production. It might therefore be suggested that younger male speakers' accent productions lie between the conservative and innovative phonetic forms.

A Study of Morphological Errors in Aphasic Language

  • Kim, Heui-Beom
    • Speech Sciences / v.1 / pp.227-236 / 1997
  • How do aphasics deal with the inflectional marking that occurs in agglutinative languages like Korean? Korean speech repetition, comprehension, and production were studied in 3 Broca's aphasic speakers of Korean. As experimental materials, 100 easy sentences were chosen from 1st grade Korean elementary school textbooks on reading, writing, and listening, and two pictures were made from each sentence. This study examines the use of three kinds of inflectional markings: past tense, nominative case, and accusative case. The analysis focuses on whether each inflectional marking was performed well in tasks such as repetition, comprehension, and production. In addition, morphological errors related to each inflectional marking were analyzed in terms of markedness. In general, the aphasic subjects showed a clear preservation of the morphological aspects of their native language, so the view of Broca's aphasics as agrammatic could not be strongly supported. It can be suggested that nominative case and accusative case are marked elements in Korean.

Design of Speech Enhancement U-Net for Embedded Computing (임베디드 연산을 위한 잡음에서 음성추출 U-Net 설계)

  • Kim, Hyun-Don
    • IEMEK Journal of Embedded Systems and Applications / v.15 no.5 / pp.227-234 / 2020
  • In this paper, we propose a wav-U-Net to improve speech enhancement in heavily noisy environments, implementing three principal techniques. First, as input data, we use 128 modified Mel-scale filter banks instead of 512 frequency bins, which reduces the computational burden. The Mel scale aims to mimic the non-linear perception of sound by the human ear, being more discriminative at lower frequencies and less discriminative at higher frequencies. The Mel scale is therefore a suitable feature in terms of both performance and computing power, because our proposed network focuses on speech signals. Second, we add a simple ResNet as pre-processing, which helps the proposed network make the estimated speech signals clear and suppress high-frequency noise. Finally, the proposed U-Net model shows significant performance regardless of the kind of noise. In particular, despite using a single channel, we confirmed that it can deal well with non-stationary noise whose frequency properties change dynamically, and that it can estimate speech signals from noisy speech even in extremely noisy environments where the noise is much louder than the speech (SNR below 0 dB). The performance of our proposed wav-U-Net improved by about 200% on SDR and 460% on NSDR compared to the conventional Jansson's wav-U-Net. It was also confirmed that the processing time of our wav-U-Net with 128 modified Mel-scale filter banks was about 2.7 times faster than that of the common wav-U-Net with 512 frequency bins as input values.
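
The Mel-scale property described above (finer resolution at low frequencies, coarser at high frequencies) can be illustrated with a short sketch. It assumes the widely used HTK-style conversion, mel = 2595 · log10(1 + f/700), and an 8 kHz upper band edge; neither is stated in the abstract, and the paper's "modified" Mel-scale filter banks may be defined differently. The function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """HTK-style Hz-to-mel conversion (an assumed, common formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_bands, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of n_bands filters spaced evenly in mel."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]


# 128 bands as in the abstract; the 0-8000 Hz range is an assumption.
centers = mel_center_frequencies(128, 0.0, 8000.0)
low_spacing = centers[1] - centers[0]      # spacing between the two lowest bands
high_spacing = centers[-1] - centers[-2]   # spacing between the two highest bands

print(f"{len(centers)} bands")
print(f"spacing at low end:  {low_spacing:.1f} Hz")
print(f"spacing at high end: {high_spacing:.1f} Hz")
```

Because the bands are evenly spaced in mel rather than in Hz, neighbouring bands sit much closer together at the low end than at the high end, which is how 128 bands can stand in for a larger number of linear frequency bins when the signal of interest is speech.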

Speaker age estimation and acoustic characteristics: According to pitch and speech rate (화자 연령 지각과 음성적 특성: 음높이와 발화 속도를 중심으로)

  • Seo, YoonJeong;Shin, Jiyoung
    • Phonetics and Speech Sciences / v.11 no.4 / pp.9-18 / 2019
  • This study aimed to investigate the correlation between speakers' chronological age (CA) and perceived age (PA), and to specify the effect of pitch and speech rate as acoustic cues in judging age, using perceptual testing and acoustic analysis. Three perception tasks were conducted to measure the accuracy of age estimation by 80 Korean listeners presented with different types of speech. In all the tasks, participants listened to speech samples and gave a numerical estimate of the speaker's age. Korean listeners were found to gauge the age of a speaker fairly precisely. CA and mean PA were positively correlated in all three tasks. The amount and type of information included in the voice samples clearly affected the accuracy of listeners' judgements. Moreover, the results revealed that listeners make use of acoustic information such as pitch and speech rate to estimate a speaker's age.
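
As an illustration of the kind of CA-PA correlation reported above, the sketch below computes a Pearson correlation coefficient in plain Python. The abstract does not say which correlation statistic was used, so Pearson's r is an assumption, and the age values are hypothetical numbers made up for the example, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only; NOT data from the study.
chronological_age = [22, 31, 40, 52, 63, 71]
mean_perceived_age = [25, 29, 43, 49, 60, 66]

r = pearson_r(chronological_age, mean_perceived_age)
print(f"r = {r:.3f}")
```

A value of r close to +1 corresponds to the positive CA-PA relationship the abstract describes: speakers who are older tend to be perceived as older.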

Phonetic meaning of clarity and turbidity (청탁의 음성학적 의미)

  • Park, Hansang
    • Phonetics and Speech Sciences / v.9 no.4 / pp.77-89 / 2017
  • This study investigates the phonetic meaning of clarity and turbidity (淸濁), terms that have been used in psychoacoustics, musicology, and linguistics in both the East and the West. To clarify their phonetic meaning, the study conducts three perception tests. First, 34 subjects were asked to choose between Clear and Turbid in a forced-choice task for 5 pure and 5 complex tones ranging from A2 to A6 in octave steps. Second, they were asked to choose between the two for 25 pure and 25 complex tones ranging from A2 to A4 in semitone steps. Third, they were asked to choose between the two for 8 vowels differing in formant and fundamental frequencies. The results showed that there is a certain range of tones perceived as clear, that clarity level increases as fundamental frequency increases, and that pure tones have a higher level of clarity than complex ones at the same fundamental frequency. The results also showed that vocal tract resonance enhances clarity level on the whole, and that lower vowels have a higher level of clarity than higher ones. This study is significant in that it demonstrates that clarity level is proportional to fundamental frequency and first formant frequency, all else being equal.
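
The tone ranges in the first two tests can be made concrete with a small sketch that computes the fundamental frequencies involved. It assumes twelve-tone equal temperament with A4 = 440 Hz, which the abstract does not state; the function names are hypothetical.

```python
A4_HZ = 440.0  # assumed concert pitch; not stated in the abstract


def a_note_hz(octave):
    """Frequency of the note A in the given octave (A4 = 440 Hz assumed)."""
    return A4_HZ * 2.0 ** (octave - 4)


def semitone_series(start_hz, n_steps):
    """Frequencies from start_hz upward in equal-tempered semitone steps."""
    return [start_hz * 2.0 ** (k / 12.0) for k in range(n_steps + 1)]


# Test 1: A2 to A6 in octave steps.
octave_tones = [a_note_hz(octave) for octave in range(2, 7)]

# Test 2: A2 to A4 in semitone steps (two octaves = 24 semitone steps).
semitone_tones = semitone_series(a_note_hz(2), 24)

print(len(octave_tones), octave_tones)
print(len(semitone_tones), round(semitone_tones[0], 1), round(semitone_tones[-1], 1))
```

Under these assumptions the octave series has 5 tones and the semitone series has 25 tones, matching the counts given in the abstract.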

Statistical Patterns in Consonant Cluster Simplification in Seoul Korean: Within-dialect Interspeaker and Intraspeaker Variation

  • Cho, Tae-Hong;Kim, Sa-Hyang
    • Phonetics and Speech Sciences / v.1 no.1 / pp.33-40 / 2009
  • This study examines how young speakers of Seoul Korean produce the tri-consonantal clusters /lkt/ and /lpt/, as in palk-ta ('to be bright') and palp-ta ('to step on'). Production data were collected from 20 speakers of Seoul Korean. Narrow transcription of the data showed that simplification is not obligatory, as some speakers often preserve all three consonants. When the clusters were simplified, there was a clear asymmetry between /lkt/ and /lpt/. Speakers showed no clear preference for either C1 preservation (C1=/l/) or C2 preservation (C2=/k/ in /lkt/ and /p/ in /lpt/) in the production of /lkt/, but in the production of /lpt/ a strong preference was found for the C1-preserved over the C2-preserved variant. Compared with the production data in Cho (1999), simplification patterns appear to have changed over the past 10 years toward preserving the first member of the cluster (/l/) more often, especially with /lkt/. There was no substantial between-item variation, indicating that simplification patterns are not lexically specified. Finally, the results suggest that the process of tri-consonantal simplification has not been fully phonologized in the grammar of the language, as is evident in the substantial inter- and intra-speaker variation.

The oriental-western literatural study of Delirious speech and Fading murmuring (섬어(語語)와 정성(鄭聲)에 대한 동서의학적(東西醫學的) 고찰(考察))

  • Choi, Byong Man;Lee, Sang Ryong
    • Journal of Haehwa Medicine / v.9 no.1 / pp.745-761 / 2000
  • The results of a literature study of Delirious speech and Fading murmuring are as follows. 1. Delirious speech and Fading murmuring are both classed as speech impediments. Delirious speech is disordered language with slurred word endings, and Fading murmuring is repetitive speech during loss of consciousness. 2. In contrast with Delirious speech and Fading murmuring, Maniac speech is induced by what is generally termed manic-depressive psychosis. Luoyan is speaking in a feeble voice and mumbling while asleep, and Paraphasia and Soliloquy appear in a clear mental condition. The speech impediment is caused by damage to the nervous system and speech organs, and Yuyancuoluan appears in a feverless condition. 3. The symptoms of Delirious speech are raving and a loud, heavy voice; these resemble delirium in Western medicine, specifically delirium accompanied by speech impediment and confusion. The symptoms of Fading murmuring are ambiguous, repetitive, and illogical speech, and so are similar to Wernicke dysphasia, which is characterized by incomprehensible conversation. 4. The causes of Delirious speech are the spread of stomach heat and lung pathogenic qi into the heart, failure to sweat in cold damage, the Three Yang Combination syndrome, stomach repletion, yang collapse due to excessive sweating, diarrhea, the aftermath of diarrhea, heat entering the blood chamber, feces remaining in the stomach, static blood entering the viscera, extreme anger, and constipation. The cause of Fading murmuring is vacuity desertion of vital essence and energy after a serious illness. 5. The causes of delirium are general infection, postoperative states, and metabolic disorders, and those of Wernicke dysphasia are vascular disorders, brain tumors, and trauma. 6. Delirious speech is treated according to the discrimination of vacuity and repletion. Baitong Tang(白通湯), Chaihu Guizhi Tang(柴胡桂枝湯), and Chaihu Jia Longgu Muli Tang(柴胡加龍骨牡蠣湯) are prescribed in cases of vacuity, while Chengqi Tang(承氣湯), Baihu Tang(白虎湯), and Liangge San(凉膈散) are prescribed in cases of repletion. Fading murmuring is treated with Xiao Chaihu Tang(小柴胡湯), Fuzi Tang Jiawei(附子湯加味), Shengmai San(生脈散), and Renshen Sanbai Tang(人蔘三白湯). 7. Acupuncture at Qimen-Xue(期門穴) is required when it is too late to prescribe a medical decoction or when hyperactive liver qi attacks the spleen.

Speech Synthesis for the Korean large Vocabulary Through the Waveform Analysis in Time Domains and Evaluation of Synthesized Speech Quality (시간영역에서의 파형분석에 의한 무제한 어휘 합성 및 음절 유형별 규칙합성음 음질평가)

  • Kang, Chan-Hee;Chin, Yong-Ohk
    • The Journal of the Acoustical Society of Korea / v.13 no.1 / pp.71-83 / 1994
  • This paper deals with improving the quality and naturalness of synthesized speech in a Korean TTS (Text-to-Speech) system. We extracted parameters (table 2) such as amplitude, duration, and pitch period within a syllable through analysis of speech waveforms (table 1) in the time domain, and synthesized syllables using them. Based on frequencies in a large-vocabulary Korean pronunciation dictionary, we synthesized 229 selected syllables: 19 of type V, 80 of type CV, 30 of type VC, and 100 of type CVC. For each of the 4 Korean syllable types from the data format dictionary (table 3), we tested 15 syllables with the objective MOS (Mean Opinion Score) evaluation method on 4 items, i.e., intelligibility, clearness, loudness, and naturalness, using a randomly selected group of listeners with no prior knowledge of the syllables. The experimental results show that the synthesized speech is very clear and that prosodic elements such as duration, accent, and pitch period can be controlled (fig. 9, 10, 11, 12).
