• Title/Summary/Keyword: syllables

Speech Synthesis for the Korean Large Vocabulary Through Waveform Analysis in the Time Domain and Evaluation of Synthesized Speech Quality (시간영역에서의 파형분석에 의한 무제한 어휘 합성 및 음절 유형별 규칙합성음 음질평가)

  • Kang, Chan-Hee;Chin, Yong-Ohk
    • The Journal of the Acoustical Society of Korea / v.13 no.1 / pp.71-83 / 1994
  • This paper deals with improving the quality and naturalness of synthesized speech in a Korean TTS (Text-to-Speech) system. We extracted parameters (Table 2) such as amplitude, duration, and pitch period within a syllable by analyzing speech waveforms (Table 1) in the time domain, and synthesized syllables from them. Based on frequencies in a large-vocabulary Korean pronunciation dictionary, we synthesized 229 selected syllables: 19 of type V, 80 of type CV, 30 of type VC, and 100 of type CVC. For each of the four Korean syllable types in the data format dictionary (Table 3), we tested 15 syllables with the objective MOS (Mean Opinion Score) evaluation method on four items, namely intelligibility, clearness, loudness, and naturalness, using a randomly selected group that had no prior knowledge of the test material. The experiments show that the synthesized speech is very clear and that prosodic elements such as duration, accent, and pitch period can be controlled (Figs. 9, 10, 11, 12).

Characteristics of Speech Breathing in de novo Idiopathic Parkinson's Disease during Passage Reading Tasks (De novo 특발성 파킨슨병 환자의 문단 읽기 과제에서의 호흡 특성)

  • Kim, Byung-Me;Sohn, Young-Ho;Baek, Seung-Jae;Lee, Phil-Hyu;Nam, Chung-Mo;Lee, Ji-Eun;Choi, Yae-Lin
    • Phonetics and Speech Sciences / v.3 no.1 / pp.103-110 / 2011
  • The speech of patients with Idiopathic Parkinson's Disease (IPD) is characterized by hypokinetic dysarthria, which is possibly the consequence of impaired respiratory support. This study focused on the respiratory characteristics of speech breathing in de novo IPD patients who had received no prior vocal or anti-Parkinson treatment. A total of 40 subjects participated in the study: 20 de novo IPD patients between the ages of 50 and 80, and 20 normal subjects matched for age, height, and weight. Forced Expiratory Vital Capacity (FVC), Forced Expiratory Volume in 1 second (FEV1), and FEV1 as a percentage of FVC (FEV1/FVC) were measured with a PC-based spirometer (Cosmed). In addition, Maximum Phonation Time (MPT), Mean Airflow Rate (MFR), Subglottal Pressure (Psub), and the number of syllables produced per breath were measured with a Phonatory Aerodynamic System (Kay PENTAX). All subjects were asked to read a standardized Korean paragraph, and these measurements were obtained from that task. Results indicated no statistically significant differences in respiratory function (FEV1/FVC%) or aerodynamic function between the two groups, but the number of syllables per breath was significantly lower in the IPD patient group than in the normal group, and it could be predicted by FVC and MFR. Therefore, the study shows that de novo IPD patients use the airflow from the lungs inefficiently during speech.

The study of diadochokinetic (DDK) rate and accuracy in typically developing children (취학 전 정상구어발달 아동의 조음교대운동 특성)

  • Sehr, Kyoung-Hee
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.1 / pp.321-327 / 2013
  • This paper aimed to identify differences in DDK performance among 37 typically developing children aged 4 to 6 years. The DDK tasks used Consonant-Vowel (CV) syllables and Vowel-Vowel (VV) syllables. For DDK rate, all AMR and SMR productions spoken in one second were measured with Multi-Speech, and DDK regularity was analyzed with Motor Speech Profile. Error frequency and type in DDK performance were transcribed and auditorily judged by two professional speech pathologists. The findings were as follows. First, DDK rate became faster as the children's age increased, but there were no statistically significant differences between the age groups. Second, there was no significant difference between the CV and VV syllables of the DDK tasks. Third, the frequency of articulatory errors in DDK performance was significantly higher in the 4-year-old group than in the other two groups.

Segmenting and Classifying Korean Words based on Syllables Using Instance-Based Learning (사례기반 학습을 이용한 음절기반 한국어 단어 분리 및 범주 결정)

  • Kim, Jae-Hoon;Lee, Kong-Joo
    • The KIPS Transactions:PartB / v.10B no.1 / pp.47-56 / 2003
  • Korean, like English, delimits words by white space, but words in Korean differ somewhat in structure from those in English. A word in English generally consists of a single word, whereas one in Korean is composed of one or more words and/or morphemes. Because of this difference, a unit between white spaces is called an Eojeol in Korean. We propose a method for segmenting and classifying Korean words and/or morphemes based on syllables using instance-based learning. In this paper, the feature set for instance-based learning consists of the previous syllable, the current syllable, the two next syllables, the final consonant of the current syllable, and the two previous categories. Our method achieves an F-measure of more than 97% for word segmentation on the ETRI corpus and the KAIST corpus.
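
The feature set listed in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `<PAD>` token for positions outside the sentence, and the dictionary layout are assumptions made for illustration. The only fixed fact it relies on is the Unicode layout of precomposed Hangul syllables, in which the final consonant index is the code point offset from U+AC00 modulo 28.

```python
# Illustrative sketch only; names and the padding convention are assumptions,
# not taken from the paper.
HANGUL_BASE = 0xAC00  # code point of the first precomposed Hangul syllable
HANGUL_LAST = 0xD7A3  # code point of the last precomposed Hangul syllable

# The 28 possible finals in Unicode order; index 0 means "no final consonant".
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
          "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
          "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]

PAD = "<PAD>"  # hypothetical filler for positions outside the sentence


def final_consonant(syllable):
    """Return the final consonant of one Hangul syllable, or '' if none."""
    code = ord(syllable)
    if not (HANGUL_BASE <= code <= HANGUL_LAST):
        return ""  # not a precomposed Hangul syllable
    return FINALS[(code - HANGUL_BASE) % 28]


def syllable_features(syllables, i, prev_categories):
    """Build the feature dictionary for the syllable at position i.

    prev_categories holds the categories already assigned to earlier
    syllables; only the last two are used.
    """
    def at(j):
        return syllables[j] if 0 <= j < len(syllables) else PAD

    cats = [PAD, PAD] + list(prev_categories)
    return {
        "prev_syllable": at(i - 1),
        "cur_syllable": at(i),
        "next_syllable_1": at(i + 1),
        "next_syllable_2": at(i + 2),
        "cur_final": final_consonant(at(i)) if at(i) != PAD else "",
        "prev_category_2": cats[-2],
        "prev_category_1": cats[-1],
    }
```

An instance-based learner would compare such feature dictionaries against stored training instances; the distance measure and category inventory are left out here because the abstract does not specify them.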

A study on the correlation between sound characteristic and sasang constitution by pitch range and bandwidth (Pitch Range와 Bandwidth를 이용한 음성특성(音聲特性)과 사상체질간(四象體質間)의 상관성(相關性) 연구(硏究))

  • Yang, Sang-mook;Kim, Sun-hyung;Yoo, Jun-sang;Kim, Hyung-seok;Lee, Young-hoon;Kim, Dal-rae
    • Journal of Sasang Constitutional Medicine / v.13 no.3 / pp.31-39 / 2001
  • Bandwidth and pitch range are very important for distinguishing speech sounds, which is one of the many areas of phonetics, and for distinguishing an individual's manner of speaking. If each constitution has a characteristic trait in its speech sounds, these measures are therefore important for judging constitutions. In this report we try to understand the relationship between the constitutions and formant bandwidth, pitch range, and the number of syllables per minute, which are important for distinguishing speech sounds, and we try to make the judgment of constitutions objective. 1. We analyzed formant bandwidth and found some differences between constitutions, but they were not statistically significant. 2. We analyzed pitch range and found some differences between constitutions, but they were not statistically significant. 3. We analyzed the number of syllables per minute and found some differences between constitutions, but they were not statistically significant. As noted above, there are differences between constitutions in formant bandwidth, pitch range, and the number of syllables per minute, but they are not statistically significant. However, if the number of samples is increased and noise is removed, there is a good possibility of finding meaningful results.

A Study on the Intelligibility of Esophageal Speech (식도발성 발화의 명료도에 대한 연구)

  • Pyo, Hwa-Young
    • The Journal of the Acoustical Society of Korea / v.26 no.5 / pp.182-187 / 2007
  • The present study investigated the intelligibility of esophageal speech, the method by which people who have lost their voices through total laryngectomy phonate using an airstream driven into the esophagus rather than the trachea. Three normal listeners transcribed the CVV and VCV syllables produced by 10 esophageal speakers. The overall intelligibility of esophageal speech was 27%. Affricates showed the highest intelligibility, and fricatives the lowest. In terms of place of articulation, palatals were the most intelligible, and alveolars the least. Most of the aspirated consonants showed low intelligibility. The consonants in VCV syllables were more intelligible than those in CVV syllables. The low intelligibility of esophageal speakers is due to insufficient airflow intake into the esophagus. Therefore, training to increase airflow intake, together with correct articulation training, will improve their low intelligibility.

A Prosodic Study of Korean Using a Large Database (대용량 데이터베이스를 이용한 한국어 운율 특성에 관한 연구)

  • Kim Jong-Jin;Lee Sook-Hyang
    • The Journal of the Acoustical Society of Korea / v.24 no.2 / pp.117-126 / 2005
  • This study investigates the prosodic characteristics of Korean through the analysis of a large database. One female and one male speaker each read 650 sentences, which were segmentally and prosodically labeled. Statistical analyses were carried out on these utterances regarding the tonal pattern and the size of prosodic units, the correlation between the size of higher-level prosodic units and the number of lower-level prosodic units, and the slope and F0 of the falling and rising contours of an accentual phrase. The results showed that the duration and the number of words and syllables of a prosodic unit differed significantly not only between speakers but also between its positions within a higher-level prosodic unit. The number of a prosodic unit showed a high correlation with the duration and the number of syllables of its higher-level units. The slope of the falling contour within an accentual phrase was inversely proportional to the number of its syllables. The slope differed depending on the first tone type of an accentual phrase, which could be explained by the F0 rising and the different amount of rising between tones when an accentual phrase starts with an H tone. The slope of the falling contour across an accentual phrase boundary showed a constant and larger value than the one within an accentual phrase. The rising contours at the beginning and end of an accentual phrase were similar in slope but differed in the amount of F0 change: the former showed a larger amount of change. The slope of the rising contour that forms an accentual phrase on its own was inversely proportional to the number of its syllables.

The Effect of Syllable Frequency, Syllable Type and Final Consonant on Hangeul Word and Pseudo-word Lexical Decision: An Analysis of the Korean Lexicon Project Database (한글 두 글자 단어와 비단어의 어휘판단에 글자 빈도, 글자 유형, 받침이 미치는 영향: KLP 자료의 분석)

  • Myong Seok Shin;ChangHo Park
    • Korean Journal of Cognitive Science / v.34 no.4 / pp.277-297 / 2023
  • This study examined how lexical decision on two-syllable words and pseudo-words is affected by syllabic information, such as syllable frequency, syllable (i.e. vowel) type, and presence of a final consonant (i.e. batchim), through an analysis of the Korean Lexicon Project Database (KLP-DB). Hierarchical regression of RT data showed that lexical decision on words was influenced by the frequency of the first syllable, the syllable type of the first and second syllables, and batchim for the first and second syllables, and also by the interaction of the two syllable types and the interaction of syllable frequency and batchim of the second syllable. For pseudo-words, lexical decision was influenced by the frequency of the first and second syllables, the syllable type of the first syllable, and batchim for the first and second syllables, and also by the interaction of the two syllable frequencies, the interaction of the two syllable types, and the interaction of syllable frequency and batchim of the first syllable. Word frequency had a strong effect on lexical decision on words, while syllabic information had a stable effect on lexical decision on pseudo-words. These results indicate that syllabic information should be seriously considered when constructing word and pseudo-word lists and interpreting lexical decision times. Understanding the effect of syllabic information will also contribute to the understanding of the word recognition process.

Investigation of the Speech Intelligibility of Classrooms Depending on the Sound Source Location

  • Kim Jeong Tai;Haan Chan-Hoon
    • The Journal of the Acoustical Society of Korea / v.24 no.4E / pp.139-143 / 2005
  • The present study investigates the effects of speaker location on speech intelligibility in a classroom. To this end, acoustic measurements were made in a classroom with three different sound source locations: the center of the front wall (FC), both sides of the front wall (FS), and the center of the ceiling (CC). SPL, RT, $D_{50}$, and RASTI were measured at 9 measurement points with the same sound power level of the sound source, and MLS was used as the sound source signal. Subjective listening tests were also carried out using Korean-language listening materials recorded in an anechoic chamber. The recorded syllables were replayed and recorded again in the classroom with the same sound source at the three different locations, and listening tests were given to 20 respondents, who were asked to write down the syllables recorded in the classroom. The results show that higher intelligibility ($D_{50}$ of $47\%$, RASTI of 0.56) was obtained when the sound source was located at the FS. The results also show that high intelligibility was obtained in the areas near the walls.

The prosodic characters of particles in Korean -- focusing on the read speech -- (한국어 조사의 운율적 특성 - 낭독체 문장을 중심으로-)

  • Jun Eun;Lee Sook-hyang
    • MALSORI / no.37 / pp.73-85 / 1999
  • This paper examines the prosodic characteristics of Korean particles in read speech, based on the K-ToBI labeling system, in order to see whether they are prosodically weak forms like function words in English. Acoustic measurements and statistical analyses focused on the distribution of particles over a variety of prosodic positions, prosodic positional effects on the phonetic realization of particles, and the acoustic strength of particles compared to that of their surrounding syllables. The particles were distributed rather equally over all 4 prosodic positions, with the highest frequency at IP-medial/AP-final position and the lowest at IP-medial/AP-medial position, except that the topic marker 'Un/nUn' showed a preference for IP-final/AP-final position. There was a significant prosodic positional effect on the duration and F0 of the particles. Duration was the longest at IP-final/AP-final position and, interestingly, at IP-medial/AP-medial position, while F0 was the highest at IP-final/AP-medial position, as expected. The comparison of the acoustic properties of the particles with those of neighboring syllables showed that duration was generally significantly longer in particles and that energy was also larger, though not significantly so, suggesting that particles in Korean are not prosodically weak in the way function words in English are.
