Title/Summary/Keyword: Speech sound error

Alveolar Fricative Sound Errors by the Type of Morpheme in the Spontaneous Speech of 3- and 4-Year-Old Children (자발화에 나타난 형태소 유형에 따른 3-4세 아동의 치경마찰음 오류)

  • Kim, Soo-Jin;Kim, Jung-Mee;Yoon, Mi-Sun;Chang, Moon-Soo;Cha, Jae-Eun
    • Phonetics and Speech Sciences / v.4 no.3 / pp.129-136 / 2012
  • Korean alveolar fricatives are late-developing speech sounds. Most previous research on these phonemes elicited productions with individual real or pseudo words, but word-level phonological analysis does not always reflect a child's practical articulation ability. There has also been little research on articulation development that examines speech production by grammatical morpheme type, despite its importance in the Korean language. This study therefore examines the articulation development and phonological patterns of the /s/ phoneme across morpheme types in children's spontaneous conversational speech. The subjects were twenty-two typically developing 3- and 4-year-old Korean children. All children scored at normal levels on three screening tests: hearing, vocabulary, and articulation. Spontaneous conversational samples were recorded at the children's homes. The results are as follows. Error rates decreased with increasing age in all morphological contexts. Within each age group, error percentages were significantly lower in lexical morphemes than in grammatical morphemes. Stopping of fricatives was the main error pattern in all morphological contexts, and its frequency decreased with age. This research shows that articulation performance can differ significantly by morphological context. The study provides data that can be used to identify difficult contexts for articulatory evaluation and therapy of alveolar fricatives.
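
To make the reported measure concrete, here is a minimal sketch of how per-context /s/ error rates like those above could be tallied; the data layout and all counts are invented for illustration and are not the study's:

```python
# Hypothetical sketch: tallying /s/ error rates by morpheme type, as the
# study does for lexical vs. grammatical contexts. All counts invented.
from collections import defaultdict

# (morpheme_type, /s/ targets, /s/ errors) per speech sample
samples = [
    ("lexical", 12, 3),
    ("grammatical", 9, 5),
    ("lexical", 10, 2),
    ("grammatical", 7, 4),
]

totals = defaultdict(lambda: [0, 0])  # type -> [targets, errors]
for mtype, targets, errors in samples:
    totals[mtype][0] += targets
    totals[mtype][1] += errors

for mtype, (targets, errors) in totals.items():
    print(f"{mtype}: {100.0 * errors / targets:.1f}% error rate")
```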

Korean speech sound development in children from bilingual Japanese-Korean environments

  • Kim, Jeoung-Suk;Lee, Jun-Ho;Choi, Yoon-Mi;Kim, Hyun-Gi;Kim, Sung-Hwan;Lee, Min-Kyung;Kim, Sun-Jun
    • Clinical and Experimental Pediatrics / v.53 no.9 / pp.834-839 / 2010
  • Purpose: This study investigates Korean speech sound development, including articulatory error patterns, among Japanese-Korean children whose mothers are Japanese immigrants to Korea. Methods: The subjects were 28 typically developing Japanese-Korean children born to Japanese immigrant women living in Jeonbuk province, Korea. They were assessed with the Computerized Speech Lab 4500. The control group consisted of 15 Korean children living in the same area. Results: The voice onset times of the consonants /$p^h$/, /t/, /$t^h$/, and /$k^*$/ were prolonged among the children. The children replaced lenis sounds with aspirated or fortis sounds, rather than replacing fortis sounds with lenis or aspirated sounds as is typical among Japanese immigrants. The children showed numerous articulatory errors on the /c/ and /l/ sounds (as Koreans do) rather than on the /p/ sounds that are more frequently in error among Japanese immigrants. Among the vowel formants, the children's vowel /o/ was significantly prolonged compared with that of the Korean children ($P$ < 0.05). The Japanese immigrants and their children showed a similar substitution of /n/ for /ɧ/ [Japanese immigrants (62.5%) vs. Japanese-Korean children (14.3%)], which is rarely seen among Koreans. Conclusion: The findings suggest that Korean speech sound development among Japanese-Korean children is influenced not only by the Korean language environment but also by their maternal language. Therefore, appropriate language education programs may be warranted not only for immigrant women but also for their children.
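
A rough illustration of the kind of group comparison reported above (prolonged VOT in the bilingual group): a two-sample t-test over per-child mean VOT. All millisecond values are invented, and this is not the study's actual analysis pipeline.

```python
# Illustrative two-sample comparison of per-child mean VOT (ms).
# All values are invented; this is not the study's analysis.
from scipy import stats

vot_bilingual = [92, 88, 95, 101, 97, 90]  # ms, assumed
vot_control = [78, 82, 75, 80, 77, 79]     # ms, assumed

t, p = stats.ttest_ind(vot_bilingual, vot_control)
print(f"t = {t:.2f}, p = {p:.4f}")  # p < 0.05 would indicate prolongation
```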

Adaptive Multi-Rate(AMR) Speech Coding Algorithm (Adaptive Multi-Rate(AMR) 음성부호화 알고리즘)

  • 서정욱;배건성
    • Proceedings of the IEEK Conference / 2000.06d / pp.92-97 / 2000
  • The AMR (Adaptive Multi-Rate) speech coding algorithm has been adopted as a standard speech codec for IMT-2000. It is based on algebraic CELP and consists of eight speech coding modes with bit rates from 4.75 kbit/s to 12.2 kbit/s. It also contains a VAD (Voice Activity Detector), SCR (Source Controlled Rate) operation, and an error concealment scheme for robustness over a radio channel. The bit rate of AMR is changed on a frame basis depending on the channel condition. In this paper, we introduce the AMR speech coding algorithm and describe its real-time implementation on the TMS320C6201, a Texas Instruments fixed-point DSP. Starting from the ANSI C source code released by ETSI and 3GPP, we converted and optimized the program to run in real time using the C compiler and assembly language. We verified that, for the test sequences, the decoded output of the implemented codec on the DSP is identical to the PC simulation result obtained with the ANSI C code. An actual sound input/output test using a microphone and speaker also demonstrates proper real-time operation without distortion or delay.
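
As a concrete illustration of the frame-based rate switching the abstract describes, here is a sketch that maps an estimated channel quality to one of the eight standard AMR modes. The mode list matches the standard; the thresholds and the channel-quality input are assumptions, since real AMR mode control is signaled by the network.

```python
# Sketch of per-frame AMR mode selection. The eight bit rates are the
# standard AMR modes; the channel-quality thresholds are assumptions.
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

def select_mode(c_to_i_db):
    """Pick a source rate from an estimated carrier-to-interference ratio.

    A worse channel gets a lower source rate, leaving more gross bits
    for channel coding (thresholds in dB are illustrative).
    """
    thresholds = [3, 5, 7, 9, 11, 13, 15]
    for i, th in enumerate(thresholds):
        if c_to_i_db < th:
            return AMR_MODES_KBPS[i]
    return AMR_MODES_KBPS[-1]

print(select_mode(4.0))   # 5.15 kbit/s on a poor channel
print(select_mode(16.0))  # 12.2 kbit/s on a good channel
```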

A Range Dependent Structural HRTF Model for 3-D Sound Generation in Virtual Environments (가상현실 환경에서의 3차원 사운드 생성을 위한 거리 변화에 따른 구조적 머리전달함수 모델)

  • Lee, Young-Han;Kim, Hong-Kook
    • MALSORI / no.59 / pp.89-99 / 2006
  • This paper proposes a new structural head-related transfer function (HRTF) model for producing sounds in a virtual environment. The proposed HRTF model generates 3-D sounds by using a head model, a pinna model, and the proposed distance model, which handle azimuth, elevation, and distance, the three aspects of 3-D sound, respectively. In particular, the proposed distance model consists of a level-normalization block, a distal-region model, and a proximal-region model. To evaluate the performance of the proposed model, we set up an experimental procedure in which each listener judges the distance of 3-D sound sources generated by the proposed method at predefined distances. The tests show that the proposed model yields an average distance error of 0.13 to 0.31 m when the sound source is rendered as if it were 0.5 m to 2 m from the listener. This result is comparable to the average distance error of human listening with an actual sound source.
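
For intuition about the distance dimension, here is a minimal sketch of a distal-region level cue only: free-field level falls off roughly as 1/r, so a source rendered at distance r is scaled against a 1 m reference. The paper's proximal-region model and normalization details are not reproduced.

```python
# Minimal sketch of a distal-region level cue: inverse-distance scaling
# of a source signal relative to a 1 m reference distance.
import numpy as np

def apply_distance_gain(signal, distance_m, ref_m=1.0):
    gain = ref_m / max(distance_m, 1e-3)  # inverse-distance law
    return np.asarray(signal, dtype=float) * gain

x = np.ones(4)
print(apply_distance_gain(x, 2.0))  # 6 dB quieter than at 1 m
print(apply_distance_gain(x, 0.5))  # 6 dB louder than at 1 m
```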

Implementation of Korean TTS System based on Natural Language Processing (자연어 처리 기반 한국어 TTS 시스템 구현)

  • Kim Byeongchang;Lee Gary Geunbae
    • MALSORI / no.46 / pp.51-64 / 2003
  • To produce high-quality synthesized speech, it is very important to obtain accurate grapheme-to-phoneme conversion and an accurate prosody model from texts using natural language processing. Robust preprocessing of non-Korean characters is also required. In this paper, we analyze Korean texts using a morphological analyzer, a part-of-speech tagger, and a syntactic chunker. We present a new grapheme-to-phoneme conversion method for unlimited-vocabulary Korean TTS, a hybrid of a phonetic pattern dictionary and CCV-unit LTS (letter-to-sound) rules. We construct a prosody model using a probabilistic method and a decision-tree-based method. The probabilistic method alone usually suffers from performance degradation due to inherent data sparseness, so we adopt tree-based error correction to overcome this training-data limitation.
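
A minimal sketch of the hybrid G2P strategy described above: consult a phonetic pattern dictionary first, then fall back to letter-to-sound rules for out-of-dictionary items. The dictionary entry and rule below are toy examples (같이 → 가치 is a standard Korean palatalization case), not the paper's actual resources.

```python
# Toy sketch of hybrid G2P: dictionary lookup first, then LTS rules.
PHONETIC_DICT = {"같이": "가치"}   # exceptional pronunciation (palatalization)
LTS_RULES = [("ㄱ이", "기")]       # placeholder rule list

def grapheme_to_phoneme(word):
    if word in PHONETIC_DICT:          # the dictionary wins
        return PHONETIC_DICT[word]
    for pattern, replacement in LTS_RULES:
        word = word.replace(pattern, replacement)
    return word

print(grapheme_to_phoneme("같이"))  # 가치 (dictionary hit)
```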

A Study on Measuring the Speaking Rate of Speaking Signal by Using Line Spectrum Pair Coefficients

  • Jang, Kyung-A;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.20 no.3E / pp.18-24 / 2001
  • Speaking rate represents how many phonemes a speech signal contains in a limited time. It varies with the speaker and with the characteristics of each phoneme. In current speech recognition systems, preprocessing to remove the effects of speaking-rate variation is necessary before recognition, so if the speaking rate can be estimated in advance, recognition performance can be improved. Conventional speech vocoders, by contrast, decide the transmission rate by analyzing fixed-length frames regardless of how quickly the phonemes change; if the speaking rate can be estimated in advance, it is also valuable information for the speech coding stage, improving the vocoder's sound quality and enabling a variable transmission rate. In this paper, we propose a method for representing the speaking rate as a parameter in a speech vocoder. To estimate the speaking rate, the rate of phoneme change is estimated using Line Spectrum Pairs. Compared with a manual reference method based on visual inspection, the error between the two methods is 5.38% for fast utterances and 1.78% for slow utterances, and the agreement between the two methods is 98% for slow utterances and 94% for fast utterances at 30 dB and 10 dB SNR, respectively.
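
The core idea lends itself to a short sketch: faster speech moves the spectrum faster, so the frame-to-frame movement of LSP coefficients can serve as a proxy for the phoneme change rate. LSP extraction itself is omitted here; the frames are assumed to come from an LPC analysis front end, and the values are illustrative.

```python
# Sketch: frame-to-frame LSP movement as a proxy for phoneme change rate.
import numpy as np

def spectral_change_rate(lsp_frames):
    lsp = np.asarray(lsp_frames)       # shape: (n_frames, lpc_order)
    deltas = np.diff(lsp, axis=0)      # movement between adjacent frames
    return float(np.mean(np.linalg.norm(deltas, axis=1)))

slow = [[0.10, 0.30], [0.11, 0.31], [0.12, 0.32]]
fast = [[0.10, 0.30], [0.20, 0.45], [0.05, 0.30]]
print(spectral_change_rate(slow) < spectral_change_rate(fast))  # True
```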

Sound Source Localization Technique at a Long Distance for Intelligent Service Robot (지능형 서비스 로봇을 위한 원거리 음원 추적 기술)

  • Lee Ji-Yeoun;Hahn Min-Soo
    • MALSORI / no.57 / pp.85-97 / 2006
  • This paper proposes an algorithm that estimates the direction of a sound source in real time. The algorithm uses the time differences and sound intensity information among the signals recorded by four microphones. A Kalman filter is implemented to deal with the robot's own noise. The proposed method requires less execution time than an existing algorithm, meeting the real-time requirements of a service robot. Using the Kalman filter, the signal-to-noise ratio (SNR) relative to background noise is improved by approximately 8 dB, and the azimuth estimate shows a relatively small error, within ±7 degrees.
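
As an illustration of the time-difference component, here is a sketch that estimates the inter-microphone delay for one pair by cross-correlation and converts it to an azimuth with the far-field relation theta = arcsin(c * tau / d). The microphone spacing, sample rate, and signals are assumed; the paper's four-microphone geometry, intensity cue, and Kalman filtering are not reproduced.

```python
# Sketch of TDOA-based azimuth estimation for a single microphone pair.
import numpy as np

FS = 16000   # sample rate (Hz), assumed
D = 0.2      # microphone spacing (m), assumed
C = 343.0    # speed of sound (m/s)

def azimuth_deg(mic1, mic2):
    corr = np.correlate(mic1, mic2, mode="full")
    lag = int(np.argmax(corr)) - (len(mic2) - 1)  # mic1 delay in samples
    tau = lag / FS
    s = np.clip(C * tau / D, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))

sig = np.random.randn(256)
delayed = np.concatenate([np.zeros(3), sig[:-3]])  # mic2 hears it later
print(azimuth_deg(sig, delayed))  # about -18.8 deg in this sign convention
```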

A Study on the Acoustical Characteristics of Pistol Impulse and MLS Source Measurements in Room Types (음향측정시 실의 종류와 음원에 따르는 음향인자 측정분석에 관한 연구)

  • Kim, Jeong-Jung;Son, Jang-Ryeol
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2004.11a / pp.1028-1031 / 2004
  • The ultimate goal of architectural acoustics is to convey sound effectively through a space according to a building's intended use. To evaluate quantitatively how accurately a sound source is transmitted through a room, various physical parameters have been proposed that estimate speech articulation and reverberation. In this paper, acoustic parameters were measured using both an MLS signal and a pistol impulse signal. The reverberation times (RT) obtained from the two sources converged to within the measurement-error range of the MLS result. However, parameters such as D50 and C80 varied irregularly between the sources, which we regard as problematic. Finally, for a diffuse sound field, measurements under changes of sound pressure level showed no distinct effect of level on the acoustic parameters.
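
For reference, reverberation time is commonly derived from a measured impulse response (whether from a pistol shot or MLS deconvolution) by Schroeder backward integration and a linear fit on part of the decay, extrapolated to -60 dB. The sketch below runs on a synthetic exponentially decaying response, not measured data.

```python
# Sketch: RT60 from an impulse response via Schroeder backward integration.
import numpy as np

def rt60_from_ir(ir, fs, fit_db=(-5.0, -25.0)):
    edc = np.cumsum(ir[::-1] ** 2)[::-1]        # energy decay curve
    edc_db = 10 * np.log10(edc / edc[0])
    i0 = int(np.argmax(edc_db <= fit_db[0]))    # start of fit range
    i1 = int(np.argmax(edc_db <= fit_db[1]))    # end of fit range
    t = np.arange(len(ir)) / fs
    slope, _ = np.polyfit(t[i0:i1], edc_db[i0:i1], 1)  # dB per second
    return -60.0 / slope                               # extrapolate to -60 dB

fs = 8000
t = np.arange(fs) / fs
ir = np.exp(-6.9 * t / 0.6) * np.random.randn(fs)  # synthetic ~0.6 s decay
print(f"RT60 = {rt60_from_ir(ir, fs):.2f} s")      # close to 0.6
```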

What Effect can Simple Hand Tapping Have on the Accuracy and Fluency of Speech Production in Children With and Without Speech Sound Disorders? (단순 손동작 반복이 말소리장애 아동과 일반 아동의 말소리산출의 정확성과 유창성에 미치는 영향)

  • Shin, Yu-Na;Ha, Ji-Wan
    • Therapeutic Science for Rehabilitation / v.8 no.2 / pp.67-78 / 2019
  • Objective: The purpose of this study was to investigate the effect of hand tapping on the accuracy and fluency of speech production in children with speech sound disorders (SSD) and their typically developing peers (TD). Methods: The subjects were 15 SSD children and 15 TD children aged 4 to 6 years. Subjects were asked to name pictures without hand tapping in the first experimental condition and with hand tapping in the second. Results: Hand tapping significantly increased disfluency in the TD group, whereas in the SSD group it affected neither the accuracy nor the fluency of speech production. In addition, the TD group showed a significant positive correlation between the changes in accuracy and in disfluency caused by hand tapping, whereas the SSD group showed no such correlation. Conclusion: We discuss the possibility that hand tapping acts as a distractor for both SSD and TD children, while also serving as a motor gesture that facilitates phonological processing when SSD children face difficulty in lexical retrieval.

Design of Model to Recognize Emotional States in a Speech

  • Kim Yi-Gon;Bae Young-Chul
    • International Journal of Fuzzy Logic and Intelligent Systems / v.6 no.1 / pp.27-32 / 2006
  • Verbal communication is the most commonly used means of communication. A spoken word carries a great deal of information about speakers and their emotional states. In this paper we design a model to recognize emotional states in speech, the first of two phases in developing a toy machine that recognizes emotional states in speech. We conducted an experiment to extract and analyze the emotional state of a speaker from speech. To analyze the signal we used three characteristics of sound as input features: the frequency, intensity, and period of tones. We also used eight basic emotional parameters: surprise, anger, sadness, expectancy, acceptance, joy, hate, and fear, portrayed by five selected students. To facilitate the differentiation of the spectral features, we used wavelet transform analysis. We applied ANFIS (Adaptive Neuro-Fuzzy Inference System) in designing the emotion recognition model. In our findings, the inference error was about 10%, and the experiment indicates that the applied model is about 85% effective and reliable.
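
As a sketch of the wavelet feature-extraction step described above, the snippet below decomposes a signal with a discrete wavelet transform and reduces each subband to an energy value that could feed a classifier. The wavelet family and decomposition level are assumptions, and the ANFIS classifier itself is omitted.

```python
# Sketch: wavelet decomposition reduced to per-subband energies.
import numpy as np
import pywt

def wavelet_energy_features(signal, wavelet="db4", level=4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return [float(np.sum(c ** 2)) for c in coeffs]  # A4, D4, D3, D2, D1

x = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 1000))
print(wavelet_energy_features(x))  # five subband energies
```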