• Title/Abstract/Keywords: clear speech

Search results: 115 items (processing time: 0.027 s)

Differences in GRBAS scales and shimmer according to vocal sample types in people with vocal disorders

  • 신유정;홍기환;심현섭
    • 말소리와 음성과학 / Vol. 3, No. 3 / pp. 149-155 / 2011
  • The purpose of the present study was to identify differences in GRBAS scale ratings between vocal sample types (sustained vowels and connected speech) for specific laryngeal conditions (vocal nodules, vocal polyps, and vocal paralysis), and the relation between GRBAS ratings and the Shimmer value in each sample type. A total of 60 voice samples from 30 patients (10 with vocal nodules, 10 with vocal polyps, 10 with vocal paralysis) were examined, and MDVP (Multi-Dimensional Voice Program) was used to analyze the Shimmer value. Three listeners rated the two types of samples, presented in random order, on the GRBAS scale. Three-way ANOVA, one-way ANOVA, and paired t-tests were used. The outcomes of this study were as follows. 1) GRBAS ratings varied with vocal sample type. Listeners tended to judge voice quality as better when they listened to connected speech than when they listened to sustained vowels. 2) The G score of GRBAS and Shimmer were positively correlated, with statistical significance. These results show that 1) voice specialists should consider the sample type when evaluating the severity of a voice problem and 2) the G score could serve as a simple and clear measure.
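
The Shimmer value mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used "local shimmer" definition (mean absolute difference between consecutive cycle peak amplitudes, divided by the mean amplitude, as a percentage). It does not reproduce MDVP's own implementation, and the function name and sample amplitudes are hypothetical.

```python
def local_shimmer_percent(amplitudes):
    """Local shimmer in percent for a list of per-cycle peak amplitudes.

    Assumption: the common "local shimmer" definition, not MDVP's exact code.
    """
    if len(amplitudes) < 2:
        raise ValueError("need at least two cycle amplitudes")
    # Absolute amplitude change from each glottal cycle to the next.
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(amplitudes) / len(amplitudes)
    return 100.0 * mean_diff / mean_amp


# Hypothetical per-cycle amplitudes (arbitrary units), for illustration only.
steady = [1.00, 1.01, 0.99, 1.00]
rough = [1.00, 1.10, 0.90, 1.00]

print(local_shimmer_percent(steady))
print(local_shimmer_percent(rough))
```

A voice with larger cycle-to-cycle amplitude variation yields a higher value, which is the quantity the study relates to the perceptual G score.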


Classification of Sasang Constitution Taeumin by Comparative Analysis of Speech Signals

  • 김봉현;이세환;조동욱
    • 정보처리학회논문지B / Vol. 15B, No. 1 / pp. 17-24 / 2008
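  • This paper proposes a method for classifying Sasang constitutions by comparing and analyzing information values obtained from speech analysis. As the first stage of an overall system intended to provide objective indicators of Sasang constitution, and in connection with the classification of Soeumin through skin diagnosis, we propose a method for classifying Taeumin based on the output values produced by speech signal analysis. We first form groups whose members show the distinct characteristics of each Sasang constitution, classify their voice characteristics, and extract phonetic features. Based on the resulting values, we then classify Taeumin using the differences and similarities among the constitution groups. Finally, we demonstrate the usefulness of the proposed method through experiments.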

Gender difference in the sound change of lexical pitch accents of South Kyungsang Korean

  • Lee, Hyunjung
    • 말소리와 음성과학 / Vol. 7, No. 4 / pp. 123-130 / 2015
  • Given a recent finding that female speakers of South Kyungsang Korean are undergoing a sound change in the lexical pitch accent, this study tested whether the change is also reflected in male speech. It compared the F0 scaling and timing properties of accent words produced by younger female and male speakers of South Kyungsang Korean. The results indicated clear gender-related differences: the accent words were more acoustically distinct in male production than in female production. Despite this better distinction, however, younger male speakers showed peak delay, with F0 peaks located further to the right than in conservative speakers' production. It may therefore be suggested that younger male speakers' accent productions lie between the conservative and innovative phonetic forms.

A Study of Morphological Errors in Aphasic Language

  • Kim, Heui-Beom
    • 음성과학 / Vol. 1 / pp. 227-236 / 1997
  • How do aphasics deal with the inflectional marking that occurs in agglutinative languages like Korean? Korean speech repetition, comprehension, and production were studied in 3 Broca's aphasic speakers of Korean. As experimental materials, 100 easy sentences were chosen from first-grade Korean elementary school textbooks on reading, writing, and listening, and two pictures were made for each sentence. This study examines the use of three kinds of inflectional marking: past tense, nominative case, and accusative case. The analysis focuses on whether each inflectional marking was performed well in the repetition, comprehension, and production tasks. In addition, the morphological errors associated with each inflectional marking were analyzed in terms of markedness. In general, the aphasic subjects showed clear preservation of the morphological aspects of their native language, so the view of Broca's aphasics as agrammatic could not be strongly supported. It can be suggested that nominative case and accusative case are marked elements in Korean.


Design of Speech Enhancement U-Net for Embedded Computing

  • 김현돈
    • 대한임베디드공학회논문지 / Vol. 15, No. 5 / pp. 227-234 / 2020
  • In this paper, we propose a wav-U-Net that improves speech enhancement in heavily noisy environments and implements three principal techniques. First, as input data, we use 128 modified Mel-scale filter banks instead of 512 frequency bins, which reduces the computational burden. The Mel scale aims to mimic the non-linear perception of sound by the human ear by being more discriminative at lower frequencies and less discriminative at higher frequencies. Because our proposed network focuses on speech signals, the Mel scale is a suitable feature in terms of both performance and computing power. Second, we add a simple ResNet as pre-processing, which helps the proposed network make the estimated speech signals clear and suppress high-frequency noise. Finally, the proposed U-Net model shows significant performance regardless of the kind of noise. In particular, despite using a single channel, we confirmed that it deals well with non-stationary noise whose frequency properties change dynamically, and that it can estimate speech from noisy speech signals even in extremely noisy environments where the noise is much louder than the speech (SNR below 0 dB). The performance of our proposed wav-U-Net improved by about 200% in SDR and 460% in NSDR compared with Jansson's conventional wav-U-Net. It was also confirmed that the processing time of our wav-U-Net with 128 modified Mel-scale filter banks as input was about 2.7 times faster than that of the common wav-U-Net with 512 frequency bins as input.
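
The Mel-scale property described in this abstract (finer resolution at low frequencies, coarser at high frequencies) can be sketched as follows. This is an illustration only: it assumes the widely used HTK-style formula mel = 2595 log10(1 + f/700) and a hypothetical 0 to 8000 Hz range, and it does not reproduce the paper's "modified" filter banks, whose details the abstract does not give.

```python
import math


def hz_to_mel(hz):
    """HTK-style Hz-to-mel conversion (an assumed, standard formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters=128, f_min=0.0, f_max=8000.0):
    """Center frequencies (Hz) of n_filters bands equally spaced in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


centers = mel_center_frequencies()
low_spacing = centers[1] - centers[0]      # gap between the two lowest bands
high_spacing = centers[-1] - centers[-2]   # gap between the two highest bands
print(len(centers), low_spacing, high_spacing)
```

Bands that are equally spaced in mel are packed closely at low frequencies and spread out at high frequencies, so 128 such bands can summarize a 512-bin spectrum while keeping detail where speech energy is concentrated.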

Speaker age estimation and acoustic characteristics: According to pitch and speech rate

  • 서윤정;신지영
    • 말소리와 음성과학 / Vol. 11, No. 4 / pp. 9-18 / 2019
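  • This study conducted perception experiments with Korean listeners to examine the correlation between speakers' chronological age and perceived age, and to determine how accurately Korean listeners can perceive the age of an anonymous speaker. It also examines how pitch and speech rate, which serve as phonetic cues for age perception, relate to perceived age. To this end, a perception experiment consisting of three tasks was conducted with 80 adults. The stimuli were taken from 40 speakers of Standard Korean and consisted of spontaneous speech, read speech, and sustained vowel phonation. In each task, listeners heard a speech sample of about 10 seconds and gave the speaker's age as a specific number. The analysis showed that the Korean listeners judged age with considerably high accuracy, and that they estimated speakers' ages more accurately when listening to spontaneous and read speech than when listening to sustained vowels. This result appears to stem from differences in the amount of information the speech samples contain. In addition, acoustic analysis showed that listeners relied on the speakers' pitch and speech rate when estimating age, and that speech rate contributed more strongly to age perception than pitch.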

Phonetic meaning of clarity and turbidity

  • 박한상
    • 말소리와 음성과학 / Vol. 9, No. 4 / pp. 77-89 / 2017
  • This study investigates the phonetic meaning of clarity and turbidity (淸濁), a notion that has been used in psychoacoustics, musicology, and linguistics in both the East and the West. To clarify this meaning, the study conducts three perception tests. First, 34 subjects were asked to choose between Clear and Turbid in a forced-choice task for 5 pure tones and 5 complex tones ranging from A2 to A6 in octave steps. Second, they were asked to choose between the two for 25 pure tones and 25 complex tones ranging from A2 to A4 in semitone steps. Third, they were asked to choose between the two for 8 vowels differing in formant and fundamental frequencies. Results showed that there is a certain range of tones that is perceived as clear, that the level of clarity increases as fundamental frequency increases, and that pure tones have a higher level of clarity than complex ones at the same fundamental frequency. Results also showed that vocal tract resonance enhances the level of clarity on the whole, and that lower vowels have a higher level of clarity than higher ones. This study is significant in that it demonstrates that the level of clarity is proportional to fundamental frequency and the first formant frequency, all else being equal.

Statistical Patterns in Consonant Cluster Simplification in Seoul Korean: Within-dialect Interspeaker and Intraspeaker Variation

  • Cho, Tae-Hong;Kim, Sa-Hyang
    • 말소리와 음성과학 / Vol. 1, No. 1 / pp. 33-40 / 2009
  • This study examines how young speakers of Seoul Korean produce the tri-consonantal clusters /lkt/ and /lpt/, as in palk-ta ('to be bright') and palp-ta ('to step on'). Production data were collected from 20 speakers of Seoul Korean. Narrow transcription of the data showed that simplification is not obligatory, as some speakers often preserve all three consonants. When the clusters were simplified, there was a clear asymmetry between /lkt/ and /lpt/. Speakers showed no clear preference for either C1 preservation (C1=/l/) or C2 preservation (C2=/k/ in /lkt/ and /p/ in /lpt/) in the production of /lkt/, but in the production of /lpt/ a strong preference was found for the C1-preserved variant over the C2-preserved variant. Compared with the production data in Cho (1999), simplification patterns appear to have changed over the past 10 years toward preserving the first member of the cluster (/l/) more often, especially with /lkt/. There was no substantial between-item variation, indicating that simplification patterns are not lexically specified. Finally, the results suggest that the process of tri-consonantal simplification has not been fully phonologized in the grammar of the language, as is evident from the substantial inter- and intra-speaker variation.


An Oriental-Western literature study of Delirious speech and Fading murmuring

  • 최병만;이상용
    • 혜화의학회지 / Vol. 9, No. 1 / pp. 745-761 / 2000
  • A literature study of Delirious speech and Fading murmuring yielded the following results. 1. Delirious speech and Fading murmuring are classed as speech impediments. Delirious speech is speech that loses the order of language and slurs the ends of words, and Fading murmuring is repetitive speech during loss of consciousness. 2. In contrast with Delirious speech and Fading murmuring, Maniac speech is a general term for the speech seen in manic-depressive psychosis. Luoyan is speaking in a feeble voice and mumbling in a sleeping condition, and Paraphasia and Soliloquy appear in a clear mental condition. Speech impediment is caused by damage to the nervous system and the speech organs, and Yuyancuoluan appears in a feverless condition. 3. The symptoms of Delirious speech are raving and a loud, heavy voice, and these resemble delirium as described in Western medicine, which specifically involves speech impediment and confusion. The symptoms of Fading murmuring are ambiguous, repetitive, and illogical speech, and so are similar to Wernicke dysphasia, which is characterized by incomprehensible conversation. 4. The causes of Delirious speech are the spread of stomach heat and lung pathogenic qi into the heart, failure to sweat in cold damage, the Three Yang Combination syndrome, stomach repletion, yang collapse due to excessive sweating, diarrhea, the aftermath of diarrhea, heat entering the blood chamber, feces remaining in the stomach, static blood entering the viscera, extreme anger, and constipation. The cause of Fading murmuring is vacuity desertion of vital essence and energy after a serious illness. 5. The causes of delirium are general infection, postoperative states, and metabolic disorders, and those of Wernicke dysphasia are disorders of the blood vessels, brain tumors, and trauma. 6. Delirious speech is treated by discriminating between vacuity and repletion. Baitong Tang(白通湯), Chaihu Guizhi Tang(柴胡桂枝湯), and Chaihu Jia Longgu Muli Tang(柴胡加龍骨牡蠣湯) are prescribed in cases of vacuity, while Chengqi Tang(承氣湯), Baihu Tang(白虎湯), and Liangge San(凉膈散) are prescribed in cases of repletion. Fading murmuring is treated with Xiao Chaihu Tang(小柴胡湯), Fuzi Tang Jiawei(附子湯加味), Shengmai San(生脈散), and Renshen Sanbai Tang(人蔘三白湯). 7. Acupuncture at Qimen-Xue(期門穴) is required when it is too late to prescribe a medical decoction or when hyperactive liver qi attacks the spleen.


Speech Synthesis for a Large Korean Vocabulary Through Waveform Analysis in the Time Domain and Evaluation of Synthesized Speech Quality

  • 강찬희;진용옥
    • 한국음향학회지 / Vol. 13, No. 1 / pp. 71-83 / 1994
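  • This paper presents the results of a study on improving sound quality and naturalness in speech synthesis within a Korean text-to-speech (TTS) system. As the synthesis method, waveforms of monosyllabic units were analyzed in the time domain (Table 1), and the parameters required for rule-based synthesis (Table 2) were extracted and used for synthesis by rule. Following the frequency ranking in the 한국어 발음 대사전 (a Korean pronunciation dictionary), a total of 229 syllables were selected for the experiment: 19 of type V, 80 of type CV, 30 of type VC, and 100 of type CVC. To evaluate the rule-synthesized speech, 15 synthesized sounds per syllable type were drawn at random from the 229, and an arbitrarily selected group with no prior knowledge carried out a subjective opinion test on four items: comprehensibility, intelligibility, noisiness, and naturalness. The results showed that the synthesized speech was very clear. Among the prosodic elements, duration (length) and accent (stress) could be controlled (Figs. 9 and 10), and the pitch period (intonation) could also be controlled by using Lagrange interpolation (Figs. 11 and 12).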
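
The Lagrange interpolation named in this abstract for pitch-period control can be sketched in general form. This is a generic illustration of the interpolation formula, not the authors' implementation: the function name, the time points, and the pitch-period targets below are all hypothetical.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        # Basis polynomial: equals 1 at xi and 0 at every other node.
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical targets: pitch period (ms) at three time points (s) in a syllable.
times = [0.0, 0.1, 0.2]
periods_ms = [8.0, 7.0, 8.0]

# Pitch period at an intermediate time, giving a smooth contour between targets.
print(lagrange_interpolate(times, periods_ms, 0.05))
```

The interpolating polynomial passes exactly through each target, so a small number of pitch-period targets per syllable is enough to define a continuous intonation contour.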
