• Title/Summary/Keyword: phonetic data


The Percentage of Consonants Correct and the Ages of Consonantal Acquisition for 'Korean-Test of Articulation for Children (K-TAC)' (`아동용 조음검사`를 이용한 연령별 자음정확도와 우리말 자음의 습득연령)

  • Kim, Min-Jung;Pae, So-Yeong
    • Speech Sciences / v.12 no.2 / pp.139-149 / 2005
  • The purpose of this study was to propose a preliminary norm for the 'Korean-Test of Articulation for Children (K-TAC)'. The K-TAC was designed to test 19 Korean consonants in various phonetic contexts through 37 words. We collected data from 220 normally developing children aged 2;6 (years;months) to 6;5. We analyzed the mean percentage of consonants correct and the age of acquisition for the K-TAC. The results were as follows. First, the mean percentage was over 60% at late 2 years of age, over 80% at the age of 3, and over 90% after the age of 4. There were significant differences among age groups. Second, based on the criterion of correct production by 75% of children, Korean children acquired stops and nasals (except for SF velars), the glottal fricative, the SF liquid, and affricates by late 2 or 3 years of age. After that, they acquired SF velars at the age of 4 and the SI liquid at the age of 5. However, they had not acquired alveolar fricatives by late 6 years of age. Third, if distorted sounds were scored as correct, they acquired the SI liquid at 4 years of age and alveolar fricatives at 5 years of age.

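The two measures named in the abstract above, percentage of consonants correct and the 75% acquisition criterion, can be illustrated with a minimal sketch. It is not the authors' scoring procedure: the function names, the age-group labels, and all sample scores are hypothetical, and the sketch assumes that age groups are supplied in ascending order.

```python
from typing import Dict, List, Optional


def percentage_consonants_correct(scores: List[bool]) -> float:
    """Return the percentage of target consonants produced correctly."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(scores) / len(scores)


def age_of_acquisition(
    correct_by_age: Dict[str, List[bool]], criterion: float = 0.75
) -> Optional[str]:
    """Return the first age group in which at least `criterion` of the
    children produced the consonant correctly, or None if never reached.

    `correct_by_age` maps an age-group label to one boolean per child.
    Age groups are assumed to be listed in ascending order.
    """
    for age_group, children in correct_by_age.items():
        if children and sum(children) / len(children) >= criterion:
            return age_group
    return None


# Hypothetical data: one child scored on 10 target consonants.
child_scores = [True] * 8 + [False] * 2
pcc = percentage_consonants_correct(child_scores)

# Hypothetical data: per-child correctness for one consonant by age group.
liquid_by_age = {
    "3;0-3;5": [True, False, False, False],
    "4;0-4;5": [True, True, False, False],
    "5;0-5;5": [True, True, True, False],
}
acquired_at = age_of_acquisition(liquid_by_age)
```

Scoring distorted productions as correct, as in the third result, would amount to changing which productions are marked `True` before calling the same functions.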

Chinese Prosody Generation Based on C-ToBI Representation for Text-to-Speech (음성합성을 위한 C-ToBI기반의 중국어 운율 경계와 F0 contour 생성)

  • Kim, Seung-Won;Zheng, Yu;Lee, Gary-Geunbae;Kim, Byeong-Chang
    • MALSORI / no.53 / pp.75-92 / 2005
  • Prosody modeling is critical in developing text-to-speech (TTS) systems, where speech synthesis is used to automatically generate natural speech. In this paper, we present a prosody generation architecture based on the Chinese Tone and Break Index (C-ToBI) representation. ToBI is a multi-tier representation system based on linguistic knowledge to transcribe events in an utterance. A TTS system that adopts ToBI as an intermediate representation is known to exhibit higher flexibility, modularity, and domain/task portability than direct prosody generation TTS systems. However, corpus preparation is very expensive for practical-level performance, because a ToBI-labeled corpus has to be constructed manually by many prosody experts and a large amount of data is normally required for accurate statistical prosody modeling. This paper proposes a new method that automatically transcribes C-ToBI labels in Chinese speech. We model Chinese prosody generation as a classification problem and apply conditional Maximum Entropy (ME) classification to this problem. We empirically verify the usefulness of various natural language and phonology features in order to build well-integrated features for the ME framework.

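Conditional maximum entropy classification, as mentioned in the abstract above, is equivalent to multinomial logistic regression. The following is a minimal sketch of that general technique, trained by stochastic gradient ascent on the conditional log-likelihood. It is not the authors' system: the feature names (`bias`, `comma`, `long_word`), the two labels, and the toy training data are all hypothetical and stand in for real C-ToBI labels and linguistic features.

```python
import math
from typing import Dict, List, Sequence, Tuple

Features = Dict[str, float]
Weights = Dict[str, Dict[str, float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(w: Weights, labels: Sequence[str], x: Features) -> List[float]:
    """Return P(label | x) for each label, in the order of `labels`."""
    scores = [sum(w[y].get(f, 0.0) * v for f, v in x.items()) for y in labels]
    return softmax(scores)


def train_maxent(
    data: List[Tuple[Features, str]],
    labels: Sequence[str],
    epochs: int = 200,
    lr: float = 0.5,
) -> Weights:
    """Fit one weight per (label, feature) pair by gradient ascent."""
    feats = sorted({f for x, _ in data for f in x})
    w: Weights = {y: {f: 0.0 for f in feats} for y in labels}
    for _ in range(epochs):
        for x, gold in data:
            probs = predict_proba(w, labels, x)
            for y, p in zip(labels, probs):
                # Gradient of the log-likelihood: observed minus expected.
                err = (1.0 if y == gold else 0.0) - p
                for f, v in x.items():
                    w[y][f] += lr * err * v
    return w


def classify(w: Weights, labels: Sequence[str], x: Features) -> str:
    """Return the most probable label for feature vector `x`."""
    probs = predict_proba(w, labels, x)
    return labels[probs.index(max(probs))]


# Hypothetical labels and toy data (not C-ToBI annotations).
LABELS = ["break", "no_break"]
TRAIN = [
    ({"bias": 1.0, "comma": 1.0}, "break"),
    ({"bias": 1.0}, "no_break"),
    ({"bias": 1.0, "comma": 1.0, "long_word": 1.0}, "break"),
    ({"bias": 1.0, "long_word": 1.0}, "no_break"),
]
weights = train_maxent(TRAIN, LABELS)
predicted = classify(weights, LABELS, {"bias": 1.0, "comma": 1.0})
probs = predict_proba(weights, LABELS, {"bias": 1.0, "comma": 1.0})
```

A practical system would add regularization and a much richer feature set; the sketch only shows the shape of the model.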

Fast ab/adduction Rate of Articulation Valves in Normal Adults (정상 성인의 조음밸브에 대한 내·외전 비율)

  • Park, Hee-Jun;Han, Ji-Yeon
    • Proceedings of the KSPS conference / 2007.05a / pp.149-151 / 2007
  • This study was designed to investigate the fast ab/adduction rate of articulation valves in normal adults. The measurement of fast ab/adduction rate has traditionally been used for assessment, diagnosis, and therapy in patients with dysarthria, functional articulation disorders, or apraxia of speech. Fast ab/adduction rate reflects documented structural and physiological changes in the central nervous system and in the peripheral components of the oral and speech production mechanism. Fast ab/adduction rates were obtained from 20 normal subjects through repetition tasks for vocal function (/ihi/), tongue function (/tʌ/), velopharyngeal function (/m/), and labial function (/pʌ/). The Aerophone II was used for data recording. The results were as follows: average fast ab/adduction rates were 6.21 cps for vocal function, 7.42 cps for tongue function, 5.23 cps for velopharyngeal function, and 6.93 cps for labial function. The results of this study can serve as guidelines for normal diadochokinetic rates. In addition, they can help indicate the severity of disease and evaluate treatment.

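The rates in the abstract above are given in cycles per second (cps). As a minimal sketch, such a rate can be computed either from a repetition count over a known duration or from repetition onset times. This is not the Aerophone II procedure: the function names and example measurements are hypothetical, and only the four mean values are taken from the abstract.

```python
from typing import Dict, List

# Mean rates reported in the abstract, in cycles per second (cps).
REPORTED_MEAN_CPS: Dict[str, float] = {
    "vocal": 6.21,
    "tongue": 7.42,
    "velopharyngeal": 5.23,
    "labial": 6.93,
}


def rate_from_count(num_cycles: int, duration_s: float) -> float:
    """Rate in cps from a cycle count over a recording of known length."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return num_cycles / duration_s


def rate_from_onsets(onsets_s: List[float]) -> float:
    """Rate in cps from onset times: n onsets span n - 1 full cycles."""
    if len(onsets_s) < 2:
        raise ValueError("need at least two onsets")
    span = onsets_s[-1] - onsets_s[0]
    if span <= 0:
        raise ValueError("onsets must be increasing")
    return (len(onsets_s) - 1) / span


# Hypothetical measurement: 31 repetitions of /ihi/ in 5 seconds.
example_rate = rate_from_count(31, 5.0)
# Hypothetical onset times in seconds: 3 cycles over 1.5 s.
onset_rate = rate_from_onsets([0.0, 0.5, 1.0, 1.5])
```

Comparing a measured rate against `REPORTED_MEAN_CPS` would still require the spread of the normative data, which the abstract does not report.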

Speech pathologic evaluation of children with ankyloglossia (설유착증 환자의 언어병리학적 평가)

  • Lee, Ju-Kyung
    • Proceedings of the KSPS conference / 2007.05a / pp.155-157 / 2007
  • Objective: There is a close relationship between abnormal intraoral structure and functional speech problems. Patients with cleft palate and ankyloglossia are typical examples. Abnormal structures can be surgically repaired toward normal structure. Ankyloglossia may nevertheless cause functional limitations, such as speech disorders, even when adequate surgical treatment has been performed, and each individual presents a different speech disorder. The objective of this study is to evaluate the speech of children with ankyloglossia and to determine whether ankyloglossia is associated with articulation problems. We also aimed to present criteria for the indication of frenectomy. Study design: The experimental group was composed of 10 children who visited the department of oral and maxillofacial surgery, dental hospital, Chonbuk university, because of ankyloglossia and articulation problems. At the time of the speech test, the average age was 5 years 7 months and the male-to-female ratio was 4:1. The VPI consonant discrimination degree, PPVT, PCAT, Nasometer II, and Visi-Pitch test results were obtained from each group. Result: There was a significant difference in 'language development' as measured by the PPVT. All but 3 members of the experimental group showed delayed 'language development'. The 'errored consonant rate' was higher for alveolar consonants. Consonant errors in the experimental group occurred mostly on alveolar consonants, and the major type of consonant error was distortion. Conclusion: The severity of ankyloglossia can be judged by examining the degree of language development and by speech testing of alveolar consonants, and these results can inform the decision for frenulotomy.


An Analysis of Acoustic Features Caused by Articulatory Changes for Korean Distant-Talking Speech

  • Kim Sunhee;Park Soyoung;Yoo Chang D.
    • The Journal of the Acoustical Society of Korea / v.24 no.2E / pp.71-76 / 2005
  • Compared to normal speech, distant-talking speech is characterized by the acoustic effect due to interfering sound and echoes as well as articulatory changes resulting from the speaker's effort to be more intelligible. In this paper, the acoustic features for distant-talking speech due to the articulatory changes will be analyzed and compared with those of the Lombard effect. In order to examine the effect of different distances and articulatory changes, speech recognition experiments were conducted for normal speech as well as distant-talking speech at different distances using HTK. The speech data used in this study consist of 4500 distant-talking utterances and 4500 normal utterances of 90 speakers (56 males and 34 females). Acoustic features selected for the analysis were duration, formants (F1 and F2), fundamental frequency, total energy and energy distribution. The results show that the acoustic-phonetic features for distant-talking speech correspond mostly to those of Lombard speech, in that the main resulting acoustic changes between normal and distant-talking speech are the increase in vowel duration, the shift in first and second formant, the increase in fundamental frequency, the increase in total energy and the shift in energy from low frequency band to middle or high bands.

A study of phonological regression in 2-6 years of Korean children (서울-경기 지역 2-6세 아동의 발달기적 음운변동에 관한 연구 - 자음을 중심으로 -)

  • Kim Young-Tae
    • MALSORI / no.21_24 / pp.3-24 / 1992
  • This study was designed to investigate changes in phonological processes in normal Korean children aged 2 to 6 years. Forty-eight children who lived in Seoul or Kyung-Ki do were tested with a picture articulation test, and their articulation errors, including omissions, additions, and substitutions, were coded into phonological processes. Those phonological processes were discussed in terms of syllable structure, place, manner, assimilation, tenseness, and aspiration of sounds. Data were analyzed in two ways: (1) the number of subjects who showed each process and (2) the percentage of occurrence of each process. Analyses of omission-addition processes demonstrated that postvocalic omission occurred most frequently, followed by velar, alveolar, and glottal omission. Analyses of substitution processes showed that, in terms of place, fronting (palatal and velar), backing (alveolar), and alveolarization occurred most frequently. In terms of assimilation, alveolar, stopping, and aspiration assimilation occurred frequently. Analyses by tenseness and aspiration showed similar occurrences among the 4 processes, with slightly higher occurrences of tensing and aspiration than of laxing and deaspiration. All of the processes decreased with age. The number of processes shown by more than half of the children or exceeding 10% occurrence was 20 at 2 years of age, 10 at 3 years, 1 at 4 years, and none at ages 5 and 6.


An Acoustical Study of Korean Diphthongs (한국어 이중모음의 음향학적 연구)

  • Yang Byeong-Gon
    • MALSORI / no.25_26 / pp.3-26 / 1993
  • The goals of the present study were (1) to collect and analyze sets of fundamental frequency (F0) and formant frequency (F1, F2, F3) data of Korean diphthongs from ten linguistically homogeneous male speakers of Korean, and (2) to make a comparative study of Korean monophthongs and diphthongs. Various definitions, kinds, and previous studies of diphthongs were examined in the introduction. Procedures for screening subjects to form a linguistically homogeneous group, time point selection, and formant determination were explained in the following section. The principal findings were as follows: 1. Much variation was observed in the ongliding part of diphthongs. 2. F2 values of the [j] group descended while those of the [w] group ascended. 3. The average duration of diphthongs was about 110 msec, and there was not much variation between speakers and diphthongs. 4. In a comparative study of monophthongs and diphthongs, F1 and F2 values of the same offgliding part at the third time point almost converged. 5. The gliding of diphthongs was very short beginning from the h-noise. Perceptual studies using speech synthesis are desirable to find major parameters for diphthongs. The results of the present study will be useful in the area of automated speech recognition and computer synthesis of speech.


Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

  • Bae Hyojoon;Jung Sungyun;Son Jongmok;Kwon Hongseok;Kim Siho;Bae Keunsung
    • Proceedings of the IEEK Conference / summer / pp.391-394 / 2004
  • This paper focuses on the DSP implementation of an HMM-based speech recognizer that can handle a vocabulary of several hundred words as well as speaker independence. First, we develop an HMM-based speech recognition system on the PC that operates on a frame-by-frame basis, with parallel processing of feature extraction and Viterbi decoding to keep the processing delay as small as possible. Techniques such as linear discriminant analysis, state-based Gaussian selection, and a phonetic tied mixture model are employed to reduce the computational burden and memory size. The system is then optimized and compiled on the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models, and 24.5 kbytes for program code. A maximum required time of 29.2 ms for processing a 32 ms frame of speech validates the real-time operation of the implemented system.

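The real-time claim in the abstract above can be checked with simple arithmetic on the reported figures. The sketch below is only a worked calculation, not part of the described system. It assumes that a new frame arrives every 32 ms (no frame overlap), which the abstract does not state, and the function name is hypothetical.

```python
def real_time_factor(processing_ms: float, frame_ms: float) -> float:
    """Ratio of worst-case processing time to the audio time it covers.

    A value below 1.0 means each frame is finished before the next one
    arrives, assuming frames do not overlap.
    """
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    return processing_ms / frame_ms


# Figures reported in the abstract.
MAX_PROCESSING_MS = 29.2
FRAME_MS = 32.0
DATA_AND_MODELS_KBYTES = 486.0
PROGRAM_CODE_KBYTES = 24.5

rtf = real_time_factor(MAX_PROCESSING_MS, FRAME_MS)  # about 0.9125
headroom_ms = FRAME_MS - MAX_PROCESSING_MS  # about 2.8 ms per frame
is_real_time = rtf < 1.0
total_memory_kbytes = DATA_AND_MODELS_KBYTES + PROGRAM_CODE_KBYTES  # 510.5
```

Under this assumption the worst case leaves roughly 2.8 ms of slack per frame, so the margin for real-time operation is fairly narrow.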

Mieko Han and her Works on Korean Phonetics (Mieko Han의 한국어 음성학 연구)

  • Ko, Do-Heung
    • Speech Sciences / v.1 / pp.213-223 / 1997
  • This paper presents a general review of the work of Mieko S. Han, who made a significant contribution to the study of Korean phonetics during the 1960s and early 1970s. As both a single and joint author, Dr. Han published papers that are important in both quantity and quality and that are still cited by Korean phoneticians today. Before the work of Dr. M. Han, a professor in the Department of East Asian Languages & Cultures at USC, there were only a few phonetics-related publications in Korea, most of which were papers or books based on a non-experimental traditional approach. It is known that traditionalism and structuralism coexisted in the field of Korean linguistics. It was, however, fortunate that there were two important phoneticians (M. Han and Chin-W Kim) working abroad at that time. Mieko Han's concern was to investigate the experimental characteristics of the system of Korean vowels and consonants using a spectrograph, which was the single most important tool for analysing phonetic data at that time. Dr. Han conducted her experimental studies on Korean phonetics, mostly funded by the Office of Naval Research, in terms of duration, fundamental frequency, Voice Onset Time (VOT), intensity, and so on. This paper aims to re-appreciate Dr. Han's specific contribution to the study of Korean phonetics, since she played an important role as a pioneer of early Korean phonetics. Further, Dr. Han's works are highly recommended as a first step for graduate students who seriously wish to specialize in Korean phonetics.


Developing a Korean standard speech DB (II) (한국인 표준 음성 DB 구축(II))

  • Shin, Jiyoung;Kim, KyungWha
    • Phonetics and Speech Sciences / v.9 no.2 / pp.9-22 / 2017
  • The purpose of this paper is to report the whole process of developing the Korean Standard Speech Database (KSS DB). This project was supported by an SPO (Supreme Prosecutors' Office) research grant for three years, from 2014 to 2016. KSS DB is designed to provide speech data for acoustic-phonetic and phonological studies and for speaker recognition systems. For the samples to represent spoken Korean, sociolinguistic factors such as region (9 regional dialects), age (5 age groups over 20), and gender (male and female) were considered. The goal of the project was to collect speech from over 3,000 male and female speakers of nine regional dialects and five age groups, employing direct and indirect methods. Speech samples of 3,191 speakers (2,829 speakers and 362 speakers using direct and indirect methods, respectively) were collected and stored in the database. KSS DB is designed to collect read and spontaneous speech samples from each speaker through 5 speech tasks: three (pseudo-)spontaneous speech tasks (producing prolonged simple vowels, 28 blanked sentences, and spontaneous talk) and two read speech tasks (reading 55 phonetically and phonologically rich sentences and reading three short passages). KSS DB includes a 16-bit, 44.1 kHz speech waveform file and an orthographic file for each speech task.
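The 16-bit, 44.1 kHz waveform format mentioned in the abstract above can be illustrated with Python's standard `wave` module. The sketch writes a short synthetic tone in that format and reads the header back. It does not touch KSS DB itself: the file name, the tone, and the mono channel count are assumptions, since the abstract does not specify the container format or the number of channels.

```python
import math
import os
import struct
import tempfile
import wave
from typing import Dict

SAMPLE_RATE_HZ = 44_100  # 44.1 kHz, as stated in the abstract
SAMPLE_WIDTH_BYTES = 2   # 16-bit samples, as stated in the abstract
NUM_CHANNELS = 1         # assumption: mono (not stated in the abstract)


def write_tone(path: str, num_samples: int = 4410, freq_hz: float = 220.0) -> None:
    """Write a quiet sine tone as 16-bit little-endian PCM."""
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.3 * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ)),
        )
        for i in range(num_samples)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(NUM_CHANNELS)
        wf.setsampwidth(SAMPLE_WIDTH_BYTES)
        wf.setframerate(SAMPLE_RATE_HZ)
        wf.writeframes(frames)


def describe_wav(path: str) -> Dict[str, float]:
    """Read basic format information from a WAV file header."""
    with wave.open(path, "rb") as wf:
        return {
            "channels": wf.getnchannels(),
            "bits_per_sample": wf.getsampwidth() * 8,
            "sample_rate_hz": wf.getframerate(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }


# Hypothetical file name inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    wav_path = os.path.join(tmp_dir, "example_task.wav")
    write_tone(wav_path)
    info = describe_wav(wav_path)

# Uncompressed data rate for this format: 44100 * 2 * 1 bytes per second.
bytes_per_second = SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * NUM_CHANNELS
```

At this data rate, one minute of mono speech occupies about 5.3 MB before any compression.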