Search | Korea Science

Efficient Tracking of Speech Formant Using Closed Phase WRLS-VFF-VT Algorithm

Lee, Kyo-Sik;Park, Kyu-Sik
- The Journal of the Acoustical Society of Korea / v.19 no.2E / pp.8-13 / 2000
In this paper, we present an adaptive formant tracking algorithm for speech using the closed-phase WRLS-VFF-VT method. Pitch-synchronous closed-phase methods are known to give more accurate estimates of the vocal tract parameters than pitch-asynchronous methods. However, the use of pitch-synchronous closed-phase analysis has been limited by the difficulty of accurately isolating the closed-phase region in successive periods of speech. We have therefore implemented the pitch-synchronous closed-phase WRLS-VFF-VT algorithm for speech analysis, especially for formant tracking. With the variable threshold (VT), the proposed algorithm can provide superior performance at phone boundaries and at voiced/unvoiced transitions. The proposed method is compared experimentally with other methods, such as the two-channel CPC method, using synthetic waveforms and real speech data. The experimental results show that block data processing techniques, such as the two-channel CPC, gave reasonable estimates of the formants and antiformants. However, the data windows used by these methods included the effects of the periodic excitation pulses, which affected the accuracy of the estimated formants. In contrast, the proposed WRLS-VFF-VT method, which eliminates the influence of the pulse excitation by using input estimation as part of the algorithm, gave very accurate formant and bandwidth estimates and good spectral matching.
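Formant and bandwidth estimates of the kind mentioned in this abstract are commonly read off the poles of an all-pole vocal tract model. The following is a minimal sketch of that standard pole-to-formant conversion only; it is not the WRLS-VFF-VT algorithm, and the function name `pole_to_formant`, the sampling rate, and the example pole are assumptions chosen for illustration.

```python
import cmath
import math


def pole_to_formant(pole, fs):
    """Map one complex pole of a discrete-time all-pole model
    to (frequency_hz, bandwidth_hz).

    The pole angle gives the resonance frequency; the pole radius
    gives the 3 dB bandwidth (radius closer to 1 means narrower).
    """
    frequency_hz = abs(cmath.phase(pole)) * fs / (2.0 * math.pi)
    bandwidth_hz = -math.log(abs(pole)) * fs / math.pi
    return frequency_hz, bandwidth_hz


# Hypothetical example: build a pole for a 500 Hz resonance with an
# 80 Hz bandwidth at an assumed 8 kHz sampling rate, then convert back.
fs = 8000.0
radius = math.exp(-math.pi * 80.0 / fs)
pole = cmath.rect(radius, 2.0 * math.pi * 500.0 / fs)
freq, bw = pole_to_formant(pole, fs)
print(round(freq, 1), round(bw, 1))
```

The round trip should recover roughly the 500 Hz frequency and 80 Hz bandwidth used to build the pole. In practice the poles would come from whatever parameter estimator is in use.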

Monophthong Analysis on a Large-scale Speech Corpus of Read-Style Korean (한국어 대용량발화말뭉치의 단모음분석)

Yoon, Tae-Jin;Kang, Yoonjung
- Phonetics and Speech Sciences / v.6 no.3 / pp.139-145 / 2014
The paper describes methods for conducting vowel analysis on a large-scale corpus with the aid of forced alignment and optimal formant ceiling methods. The 'Read Style Corpus of Standard Korean' is used to build the forced alignment system, and a subset of the corpus is used for processing and for extracting features for vowel analysis based on the optimal formant ceiling. The results of the vowel analysis are reliable and comparable to those obtained using traditional analytical methods. The findings indicate that the methods adopted can be extended to more fine-grained analysis without time-consuming manual labeling and without loss of accuracy or reliability.
https://doi.org/10.13064/KSSS.2014.6.3.139

Oral and Nasal Spectral Outputs in Korean Oral Vowels (정상 모음에 대한 구강 및 비강 spectral output 분석)

Hong, Ki-Hwan;Choi, Seung-Chul;Kim, Byum-Kyu;Yang, Yoon-Soo;Shim, Hyun-Ah
- Speech Sciences / v.10 no.2 / pp.145-157 / 2003
Vowels are classified by the shape of the vocal tract. These shapes form constriction points along the tract, which influence vocal tract resonances such as F1, F2, and F3. Formant frequency is influenced by the aperture and the placement of the tongue, and intensity is influenced by subglottal air pressure. The aim of this study is to characterize and compare the oral and nasal spectral outputs in terms of the formant frequencies and intensities of Korean oral vowels. The subjects were 20 normal persons (10 male and 10 female) without laryngeal pathology. The speech sample consisted of the Korean oral vowels /a/, /e/, /i/, /o/, and /u/. The spectrum of each vowel was analysed with Nasal View and the Real Analysis program using Dr. Speech. The results showed that nasal intensity decreased markedly from F1 to F2, whereas oral intensity and Intensity decreased only slightly from F1 to F2. Most nasal formant frequency values were similar to, or slightly smaller than, the oral formant frequency and Formant frequency values.

A Study on the Formant Analysis of Korean Monophthongs and their Resonance Effect in Vocal Tract (한글 단모음의 포만트 분석과 성도내의 공명효과에 관한 연구)

Sin, Hyeon-Jae;Yun, Seok-Wang
- The Journal of the Acoustical Society of Korea / v.6 no.2 / pp.30-37 / 1987
Twelve Korean monophthongs were studied by formant analysis; fundamental frequencies and their harmonics were used as the parameters of analysis. The data analyzed were twelve Korean monophthongs, each pronounced at five fundamental frequencies by five male vocal musicians. The study shows that the first and second formants are characterized by the resonances of the pharyngeal and oral cavities, respectively. Lip rounding decreases the second formant frequency. The phoneme pairs $[a]/[\alpha]$, $[e]/[\varepsilon]$, and $[\partial]/[\Lambda]$ were not well distinguished in this formant analysis.

Nasometric and Acoustic Analysis in Experimentally Induced Velopharyngeal Insufficiency in Human (사람에서 유발시킨 구개인두부전증의 비음도와 음향학적 분석)

윤자복;성명훈;정원호;김광현
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.8 no.2 / pp.210-216 / 1997
Many tools have been used to evaluate the voice abnormalities of velopharyngeal insufficiency (VPI). The aim of this study was to obtain an objective evaluation method for VPI by comparing the acoustic and nasalance data of an experimentally induced VPI group with those of a normal control group. Ten healthy young men were included in this study. Mild and severe VPI were experimentally induced by retracting velopharyngeal movement. Using the nasometer, we obtained the nasalance scores of the sustained oral vowels and of three types of nasometer passages, and the slope scores of the nasogram of nasal words. We also analysed the changes in formant frequencies for the sustained oral vowels and the changes in various parameters of hypernasality using a computerized speech analysis system. The nasalance score of sustained /a/ increased significantly in VPI conditions. There was no change in the slope score of the nasogram. In the acoustic speech analysis, the second formant frequencies of the vowels /e/ and /i/ decreased significantly in VPI conditions. These results suggest that the measurement of nasalance score and formant frequency might be useful in the evaluation of VPI.

Analysis of Singer's Formant & Close Quotient During Change of the Larynx Position (후두위치의 변화에 따른 Singer's Formant와 성대접촉률의 변화 연구)

Nam, Do-Hyun;Choi, Seong-Hee;Choi, Jae-Nam;Chun, Suck-Pil;Choi, Hong-Shik
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.2 / pp.98-111 / 2004
Background and Objectives: The purpose of this study is to analyze the differences in fundamental frequency (Hz), closed quotient (Qx; %), intensity (dB), vocal tract length and width (cm), formant frequency (Hz), and formant level (dB) depending on the larynx position. Materials and Methods: One professional male singer (career: 28 years) produced the sustained vowels /a/, /e/, /i/, /o/, and /u/ in two larynx positions (higher, lower) with Dr. Speech, and videofluoroscopy was used to quantify the vocal tract morphology. Results: In the lower larynx position, CQ increased by 9.8%, intensity increased by about 10%, and the formant level increased. The vocal tract was also 2.4 cm longer and wider (anterior width: 0.4 cm, lateral width: 0.2 cm) than in the higher larynx position. Conclusions: In the lower larynx position, the singer's formant has a prominent spectral envelope peak near 2400-2600 Hz, produced by the clustering of F3, F4, and F5 near 3400 Hz.

Recognition of Korean Isolated Digits Using a Pole-Zero Model (Polo-Zero 모델을 이용한 한국어 단독 숫자음 인식)

;;Alan Conrad Bovik
- Journal of the Korean Institute of Telematics and Electronics / v.25 no.4 / pp.356-365 / 1988
In this paper, we describe an isolated word recognition system for Korean isolated digits based on a voiced-unvoiced decision algorithm and a frequency-domain analysis. The algorithm first performs a voiced-unvoiced decision on the beginning part of each uttered word, using the normalized log energy and the zero-crossing rate as decision parameters. Based on this decision, each word is assigned to one of two classes. To identify the uttered word within each class, a dynamic time warping algorithm is applied, using formant frequencies as the basis for the distance measure. We use pole-zero analysis to measure the formant frequencies in each frame. We have observed that pole-zero analysis can provide more accurate estimates of formant frequencies than analysis based on poles only. An experimental recognition rate of 97.3% was achieved, illustrating the performance of the recognition system.
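The voiced-unvoiced decision described in this abstract can be illustrated with a minimal sketch that uses frame log energy and zero-crossing rate. This is an assumption-laden illustration, not the authors' algorithm: the energy here is unnormalized, and the function names, the threshold values, and the synthetic test frames are all hypothetical.

```python
import math
import random


def log_energy_db(frame):
    """Mean-square energy of the frame in dB, floored to avoid log(0)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def is_voiced(frame, energy_threshold_db=-30.0, zcr_threshold=0.25):
    """Hypothetical rule: voiced frames have high energy and a low
    zero-crossing rate. Both thresholds are illustrative only."""
    return (
        log_energy_db(frame) > energy_threshold_db
        and zero_crossing_rate(frame) < zcr_threshold
    )


# Synthetic frames at an assumed 8 kHz sampling rate (30 ms each).
fs = 8000
voiced_like = [0.5 * math.sin(2 * math.pi * 120 * n / fs) for n in range(240)]
rng = random.Random(0)
unvoiced_like = [0.02 * rng.uniform(-1.0, 1.0) for _ in range(240)]

print(is_voiced(voiced_like), is_voiced(unvoiced_like))
```

A low-frequency sinusoid stands in for a voiced frame and low-level noise for an unvoiced one. Real speech would need thresholds tuned to the recording conditions, which is presumably why the abstract uses a normalized energy measure.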

A Study on the Correlation Between Sasang Constitution and Sound Characteristics Used Harmonics and Formant Bandwidth (Harmonics(배음)와 Formant Bandwidth(포먼트 폭)를 이용한 음성특성(音聲特性)과 사상체질간(四象體質間)의 상관성(相關性) 연구(硏究))

Park, Sung-Jin;Kim, Dal-Rae
- Journal of Sasang Constitutional Medicine / v.16 no.1 / pp.61-73 / 2004
This study investigated the correlation between Sasang constitutional groups and voice characteristics using a voice analysis system (in this study, CSL). I focused on voice characteristics in terms of harmonics, formant frequency, and formant bandwidth. The subjects were 71 males. I classified them into three groups: a Soeumin group, a Soyangin group, and a Taeumin group. Constitution was classified in two ways: QSCCII (Questionnaire for the Sasang Constitution Classification II) and an interview with a specialist in Sasang constitution. The 71 people were thus categorized into 31 Soeumin, 18 Soyangin, and 22 Taeumin. Pitch is approximately the fundamental frequency (F0) of the voice. Shimmer in dB evaluates the period-to-period variability of the peak-to-peak amplitude within the analyzed voice sample. The FFT (Fast Fourier Transform) method in CSL can display a sampled voice as harmonics. h1 is the first peak and h2 is the second peak in the harmonics. The amplitude difference between h1 and h2 (h1-h2) can be interpreted as reflecting the speaker's phonation type, and formant frequency and bandwidth as reflecting the speaker's vocal tract. I therefore used the harmonics, formant frequency, and formant bandwidth as the voice parameters. First, I recorded the vowel /e/ from all subjects using a microphone, and then analyzed these recordings with CSL. Power Spectrum and Formant History are the CSL menus that display harmonics and formant frequency and bandwidth. The results on the correlation between Sasang constitutional groups and voice parameters are as follows: 1. There is no significant difference in the harmonic amplitude difference (h1-h2) among the three groups. 2. There is a significant difference between the Soeumin group and the Soyangin group in Formant Frequency 1 and Formant Bandwidth 1 (p<0.05). No other parameter shows a significant difference.
Based on the formant bandwidth difference, I assume that the Soyangin group has a clearer and brighter voice than the Soeumin group, and I think this result coincides with the context of "Dongyi Suse Bowon" and "Sasangimhejinam".

Sound Analysis of Cleft Palate Patients Using Formant Position (포르만트 위치비교를 이용한 구개열 환자의 발음분석)

김덕원;송철규
- Journal of Biomedical Engineering Research / v.11 no.2 / pp.283-288 / 1990
As one of the main purposes of the physical management of cleft palate is to provide the anatomic and physiologic requisites for speech, speech must be one of the criteria for determining when physical management has been achieved. However, there is no objective method for evaluating the speech of cleft palate patients. The authors analyzed the speech of adult cleft palate patients using sound spectrography and compared it with that of normal adults. The results were as follows: 1. In vowels, cleft palate patients of both sexes showed a reduction in the frequency of the first and second formants compared with normal adults. There was minimal difference in the front vowels (i, e, ae). 2. In consonants, cleft palate patients of both sexes showed a reduction in the frequency of the first formant, but a reduction in the frequency of the second formant was observed only in female patients. 3. There was no statistical difference in the sound spectrographs among plosive, fricative, affricate, nasal, and glide consonants.

A study about five-sounds(Gong, Sang, jiao, zhi, yu) of Sasang constitutional sound analysis (오음의 사상의학적 음성분석과 고찰)

Kim, Dal-Rea
- Journal of Sasang Constitutional Medicine / v.15 no.1 / pp.50-59 / 2003
Purpose: The sounds of five animals are classified under the five sounds (Gong, Sang, jiao, zhi, yu), which are compared with the musical scale. This study looks for similarity between the five animals' sounds and the musical scale. Methods: 1. The sounds of five animals (cattle, horse, pheasant, pig, sheep) were recorded on tape. 2. The recordings were transferred to CSL (Computerized Speech Lab). 3. They were analysed for pitch, formants 1, 2, and 3, and energy. 4. The analysed results (ratios of pitch, formants 1, 2, and 3, and energy) for the five animals were calculated and compared with the five-note musical scale (the five sounds). Result: The ratios of the five animals' sounds are not consistent with the musical scale in any of the five items (pitch, formants 1, 2, and 3, energy). Conclusion: 1. The five-note musical scale has no similarity with the five animals' sounds. 2. The five sounds are supposed to originate from the theoretical background of five-going and to have no relation to the five animals' sounds.

