Search | Korea Science

Performance of Vocal Tract Area Estimation from Deaf and Normal Children's Speech (청각장애아동과 건청아동의 성도면적 추정 성능)

Kim Se-Hwan;Kim Nam;Kwon Oh-Wook
MALSORI / no.56 / pp.159-172 / 2005
This paper analyzes the vocal tract area estimation algorithm used in a speech analysis program that helps deaf children correct their pronunciation by comparing their vocal tract shapes with those of normal-hearing children. Assuming that the vocal tract is a concatenation of cylindrical tubes with different cross-sectional areas, we compute the relative area of each tube from the reflection coefficients obtained by linear predictive coding. We then obtain the absolute vocal tract area by computing the height of the lip opening with a formula modified for children's speech. Using speech data for five Korean vowels (/a/, /e/, /i/, /o/, and /u/), we investigate the effects of sampling frequency, frame size, and model order on the estimated vocal tract shape. Finally, we compare the vocal tract shapes obtained from deaf and normal-hearing children's speech.
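The tube-model step in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the reflection coefficients come from the standard Levinson-Durbin recursion on the frame autocorrelation, and the area relation A_{i+1} = A_i (1 - k_i) / (1 + k_i) is one common sign convention that varies between textbooks. The lip-opening formula for absolute areas is not given in the abstract, so only relative areas are computed.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one (already windowed) speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def reflection_coefficients(r):
    """Levinson-Durbin recursion on autocorrelation r; returns k_1..k_p."""
    order = len(r) - 1
    a = [1.0] + [0.0] * order      # prediction polynomial, a[0] = 1
    err = r[0]                     # prediction error energy
    ks = []
    for m in range(1, order + 1):
        if err <= 0.0:
            raise ValueError("non-positive prediction error (silent frame?)")
        acc = sum(a[j] * r[m - j] for j in range(m))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, m):
            new_a[j] = a[j] + k * a[m - j]
        new_a[m] = k
        a = new_a
        err *= 1.0 - k * k
    return ks


def relative_areas(ks, first_area=1.0):
    """Relative tube areas under the ASSUMED convention
    A[i+1] = A[i] * (1 - k[i]) / (1 + k[i]); requires |k[i]| < 1."""
    areas = [first_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas
```

For example, `reflection_coefficients([1.0, 0.5, 0.25])` gives k_1 = -0.5 and k_2 = 0, so `relative_areas` returns areas in the ratio 1 : 3 : 3. The model order (the length of `r` minus one) fixes the number of tube sections, which is one reason the abstract studies the effect of model order.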

Vocal Tract Modeling with Unfixed Sectionlength Acoustic Tubes(USLAT) (비고정 구간 길이 음향 튜브를 이용한 성도 모델링)

Kim, Dong-Jun
The Transactions of The Korean Institute of Electrical Engineers / v.59 no.6 / pp.1126-1130 / 2010
Speech production can be viewed as a filtering operation in which a sound source excites a vocal tract filter. In linear prediction acoustic tube modeling, the vocal tract is modeled as a chain of cylinders of varying cross-sectional area. The most common implementation assumes tube sections of equal length, so a large number of sections is needed to model complex vocal tract shapes. This paper proposes a new vocal tract model with unfixed section lengths, which uses a reduced lattice filter to model the vocal tract. The model converts the lattice filter to a reduced structure and uses a modified version of the Burg algorithm. When the conventional and proposed models are implemented with the same order of linear prediction analysis, the proposed model produces more accurate results than the conventional one. For a system of similar accuracy, it may therefore be possible to reduce the number of lattice filter stages. The proposed model also reproduces the vocal tract shape more closely than the conventional one.
https://doi.org/10.5370/KIEE.2010.59.6.1126

A Study on Vowel Formant Variation by Vocal Tract Modification (성도 변형에 따른 모음 포먼트의 변화 고찰)

Yang, Byung-Gon
Speech Sciences / v.3 / pp.83-92 / 1998
Vowels are classified by vocal tract shape. These shapes form constriction points along the tract, which influence vocal tract resonances such as $F_1$, $F_2$, and $F_3$. This study reviews the perturbation theory of the vocal tract and determines the formant frequencies of modified vocal tracts from their area functions. Formant variation is then examined in light of the theory. Finally, each set of $F_1$, $F_2$, and $F_3$ frequencies is input to speech synthesis software to produce a vowel sound. The auditory impression of each sound synthesized without any modification of its vocal tract shape is almost the same as that of the corresponding phonetic symbol. The formant frequencies $F_1$, $F_2$, and $F_3$ vary as the perturbation theory predicts. In general, constriction at a node causes formant values to decrease, while constriction at an antinode causes them to increase. Vocal tracts modified by more than $3\;\mathrm{cm}^2$ change the vowel qualities of /a/ and /i/ into those of /v/ and /$\varepsilon$/, respectively. This study should be helpful in simulating sounds from modified vocal tracts before any operation. Further studies comparing the vocal tract shapes and sounds of various languages are desirable.

Vocal Tract Area Estimation from Deaf and Normal Children's Speech (청각장애아 및 건청아 음성으로부터 성도 면적 추정)

Kim, Se-Hwan;Kwon, Oh-Wook
Proceedings of the KSPS conference / 2005.11a / pp.51-54 / 2005
This paper analyzes the vocal tract area estimation algorithm used in a speech analysis program that helps deaf children correct their pronunciation by comparing their vocal tract shapes with those of normal-hearing children. Assuming that the vocal tract is a concatenation of cylindrical tubes with different cross-sectional areas, we compute the relative area of each tube from the reflection coefficients obtained by linear predictive coding. We then obtain the absolute vocal tract area by computing the height of the lip opening with a formula modified for children's speech. Using speech data for five Korean vowels (/a/, /e/, /i/, /o/, and /u/), we investigate the effects of sampling frequency, frame size, and model order. We compare the vocal tract shapes obtained from deaf and normal-hearing children's speech.

Determining the Relative Differences of Emotional Speech Using Vocal Tract Ratio

Wang, Jianglin;Jo, Cheol-Woo
Speech Sciences / v.13 no.1 / pp.109-116 / 2006
This study focuses on the differences among emotional speech in three vocal tract sections. The vocal tract area was computed from the area function of the emotional speech. The whole vocal tract was divided into three sections (vocal fold section, middle section, and lip section) to obtain the differences in each section. The experimental data comprise speech in six emotions from 3 male and 3 female speakers. The six emotions are neutral, happiness, anger, sadness, fear, and boredom. The difference is measured as a ratio comparing each emotional speech with the normal speech. The experimental results show no remarkable difference in the lip section, whereas fear and sadness show a large change in the vocal fold section.

On the Implementation of Articulatory Speech Simulator Using MRI (MRI를 이용한 조음모델시뮬레이터 구현에 관하여)

Jo, Cheol-Woo
Speech Sciences / v.2 / pp.45-55 / 1997
This paper describes the procedure for implementing an articulatory speech simulator that models the human articulatory organs and then synthesizes speech from the model. The images required to construct the vocal tract model were obtained by MRI and then used to construct 2D and 3D vocal tract shapes. The 3D vocal tract shapes were constructed by spatially concatenating and interpolating sectional MRI images. The 2D vocal tract shapes were constructed and automatically converted into a digital filter model. Speech sounds corresponding to the model were then synthesized from the filter. All procedures in this study were implemented in MATLAB.

Vocal Tract Length Normalization for Speech Recognition (음성인식을 위한 성도 길이 정규화)

지상문
Journal of the Korea Institute of Information and Communication Engineering / v.7 no.7 / pp.1380-1386 / 2003
Speech recognition performance is degraded by variation in vocal tract length among speakers. In this paper, we use a vocal tract length normalization method in which the frequency axis of a speaker's short-time spectrum is scaled to minimize the effect of the speaker's vocal tract length on recognition performance. To normalize vocal tract length, we tried several frequency warping functions, including linear and piecewise linear functions. We propose a variable-interval piecewise linear warping function to model effectively the variation in frequency axis scale caused by large variation in vocal tract length. Experiments on TIDIGITS connected digits showed that the proposed vocal tract normalization reduced the word error rate dramatically, from 2.15% to 0.53%.
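As a rough illustration of the frequency warping described here, the sketch below implements the conventional two-segment piecewise linear warp, not the variable-interval function the paper proposes (its details are not given in the abstract). The function name, the default band edge `f_max`, and the breakpoint ratio are assumptions chosen for the example.

```python
def piecewise_linear_warp(f, alpha, f_max=4000.0, break_ratio=0.85):
    """Warp frequency f (Hz) by factor alpha with a two-segment line.

    Below the breakpoint f0 = break_ratio * f_max the axis is scaled by
    alpha; above it, a second segment joins (f0, alpha * f0) to
    (f_max, f_max) so the warped axis still ends at f_max.
    f_max and break_ratio are illustrative assumptions.
    """
    if not 0.0 <= f <= f_max:
        raise ValueError("f must lie in [0, f_max]")
    f0 = break_ratio * f_max
    if alpha * f0 >= f_max:
        raise ValueError("alpha too large for this breakpoint")
    if f <= f0:
        return alpha * f
    # Linear interpolation between (f0, alpha * f0) and (f_max, f_max).
    slope = (f_max - alpha * f0) / (f_max - f0)
    return alpha * f0 + slope * (f - f0)
```

In a typical setup the warp factor `alpha` is chosen per speaker, for instance by searching a small grid of values and keeping the one that best matches the acoustic model. That search is outside this sketch.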

The Change of the Length of Vocal Tract in Singers according to the Phonation at Different Levels of Pitch (성악인에서 발성 시 음의 높낮이에 따른 성도 길이의 변화)

Ban, Jae-Ho;Kim, Chang-Gyu;Lee, Sang-Hyuk;Lee, Kyung-Chul;Jin, Sung-Min
Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.17 no.1 / pp.14-16 / 2006
Background and Objectives: The purpose of this study is to investigate how the vocal tract length of singers changes with pitch level. Materials and Methods: Fifteen tenors were asked to produce the /a/ sound successively at G4 (382Hz) for the head register, at C3 (131Hz) for the chest register, and at their usual speaking voice. The control group consisted of 15 males of similar age who were not professional singers. The length of the vocal tract was calculated with the formula $F_n = (2n-1)c/4L$ ($F_n$: $n$th formant frequency, $c$: the speed of sound in the vocal tract (350 m/sec), $L$: length of the vocal tract, $n = 1, 2, 3, \ldots$). Results: In the singers' group, there was no statistically significant difference in length among the head register, the chest register, and the usual speaking voice. In the control group, however, the difference in length was statistically significant. The absolute difference in vocal tract length with changing pitch level was compared between the control group and the singers' group. For the change from G4 phonation to C3 phonation and from C3 phonation to the usual speaking voice, the difference in vocal tract length was statistically significantly smaller in the singers' group than in the control group. Conclusion: The change in vocal tract length, in either speaking or singing, was smaller in singers than in the control group. We may assume that singers keep their larynx position constant throughout the pitch range during phonation.
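The length formula in this abstract can be inverted to $L = (2n-1)c/(4F_n)$. The sketch below works through that arithmetic with the abstract's value of c (350 m/sec, written as 35000 cm/sec). The function names are hypothetical, and averaging the estimates over several formants is an assumption made for illustration. The abstract does not say how the formants were combined.

```python
def tract_length_cm(formant_hz, n, c_cm_per_s=35000.0):
    """Vocal tract length (cm) from the n-th formant of a uniform tube
    closed at one end: L = (2n - 1) * c / (4 * F_n)."""
    if formant_hz <= 0 or n < 1:
        raise ValueError("formant must be positive and n >= 1")
    return (2 * n - 1) * c_cm_per_s / (4.0 * formant_hz)


def mean_tract_length_cm(formants_hz):
    """Average the per-formant estimates for F1, F2, ... (ASSUMED way of
    combining formants, not taken from the abstract)."""
    lengths = [tract_length_cm(f, n)
               for n, f in enumerate(formants_hz, start=1)]
    return sum(lengths) / len(lengths)
```

With idealized formants of 500, 1500, and 2500 Hz, each formant gives L = 17.5 cm, so the mean is also 17.5 cm. Measured formants of a real /a/ would give differing per-formant estimates.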

A Vowel Discrimination of Korean Monophthongs [i, e, a, o, u, ɯ] Using Vocal Tract Magnetic Resonance Image and F1/F2 (성도 자기공명 영상과 음향정보(F1/F2)를 이용한 한국어 단모음 [이, 에, 아, 오, 우, 으] 판별)

Seong, Cheol-Jae;Park, Jong-Won;Kim, Gui-Ryong
MALSORI / no.56 / pp.103-125 / 2005
We present a new method of measuring the volume and cross-sectional area of the vocal tract from magnetic resonance images. The vocal tract was divided by the 2 constriction points on the horizontal and vertical planes. The ratios of the volumes of the segment vocal tracts to that of the entire vocal tract play a crucial role in discriminating Korean monophthongs in that vowels were successfully discriminated by the ratios. The discriminant analysis also demonstrated that the acoustic parameters F1 and F2, in addition to the segment volumes, serve as significant parameters in discriminating Korean monophthongs.

A Study on Speech Recognition using Vocal Tract Area function and Vector Quantization (성도 면적 함수와 벡터 양자화를 이용한 음성 인식에 관한 연구)

Song, Jei-Hyuck;Kim, Dong-Jun;Park, Sang-Hui
Proceedings of the KOSOMBE Conference / v.1993 no.11 / pp.171-174 / 1993
We propose the vocal tract area function as a feature vector for speech recognition. The vocal tract area function is directly related to speech production. In this study, it not only shows the mechanism of speech production but can also be used as an effective feature vector in speech recognition.
