• Title/Abstract/Keywords: Formant Analysis

Search results: 191 items (processing time 0.027 s)

Efficient Tracking of Speech Formant Using Closed Phase WRLS-VFF-VT Algorithm

  • Lee, Kyo-Sik;Park, Kyu-Sik
    • The Journal of the Acoustical Society of Korea / Vol. 19, No. 2E / pp.8-13 / 2000
  • In this paper, we present an adaptive formant tracking algorithm for speech using the closed-phase WRLS-VFF-VT method. Pitch-synchronous closed-phase methods are known to give more accurate estimates of the vocal tract parameters than pitch-asynchronous methods. However, the use of pitch-synchronous closed-phase analysis has been limited by the difficulty of accurately isolating the closed-phase region in successive periods of speech. We have therefore implemented the pitch-synchronous closed-phase WRLS-VFF-VT algorithm for speech analysis, especially for formant tracking. The proposed algorithm with the variable threshold (VT) can provide superior performance at the boundaries of phones and of voiced/unvoiced sounds. The proposed method is experimentally compared with another method, the two-channel CPC method, using synthetic waveforms and real speech data. From the experimental results, we found that block data processing techniques, such as the two-channel CPC, gave reasonable estimates of the formants/antiformants. However, the data windows used by these methods included the effects of the periodic excitation pulses, which affected the accuracy of the estimated formants. In contrast, the proposed WRLS-VFF-VT method, which eliminates the influence of the pulse excitation by using input estimation as part of the algorithm, gave very accurate formant/bandwidth estimates and good spectral matching.

Monophthong Analysis on a Large-scale Speech Corpus of Read-Style Korean

  • 윤태진;강윤정
    • 말소리와 음성과학 / Vol. 6, No. 3 / pp.139-145 / 2014
  • The paper describes methods for conducting vowel analysis on a large-scale corpus with the aid of forced alignment and optimal formant ceiling methods. The 'Read Style Corpus of Standard Korean' is used to build the forced alignment system, and a subset of the corpus is used for processing and extracting features for vowel analysis based on the optimal formant ceiling. The results of the vowel analysis are reliable and comparable to those obtained using traditional analytical methods. The findings indicate that the methods adopted for the analysis can be extended to more fine-grained analysis without time-consuming manual labeling and without loss of accuracy or reliability.
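
The abstract above does not spell out how the optimal formant ceiling is chosen. As a rough illustration only, the sketch below assumes one common formulation: for the tokens of a single vowel, pick the candidate ceiling whose F1/F2 measurements vary least (sum of the variances of log F1 and log F2). The function names, the criterion, and the numbers are assumptions for illustration, not the authors' implementation; in practice the measurements would come from a formant tracker run once per candidate ceiling.

```python
import math
from statistics import pvariance


def ceiling_score(tokens):
    """Sum of the variances of log F1 and log F2 over one vowel's tokens.

    `tokens` is a list of (F1, F2) pairs in Hz measured with one ceiling.
    Lower scores mean more consistent measurements (assumed criterion).
    """
    f1_logs = [math.log(f1) for f1, _ in tokens]
    f2_logs = [math.log(f2) for _, f2 in tokens]
    return pvariance(f1_logs) + pvariance(f2_logs)


def pick_formant_ceiling(measurements):
    """Return the candidate ceiling (Hz) with the lowest score.

    `measurements` maps each candidate ceiling to the (F1, F2) pairs
    obtained for the same vowel tokens under that ceiling.
    """
    return min(measurements, key=lambda c: ceiling_score(measurements[c]))


# Made-up measurements for three tokens of one vowel (illustrative only).
example = {
    5000: [(300.0, 2200.0), (310.0, 2250.0), (305.0, 2230.0)],
    5500: [(300.0, 2200.0), (450.0, 1800.0), (280.0, 2600.0)],
}
print(pick_formant_ceiling(example))  # 5000
```

Working on log frequencies keeps F2, which is numerically larger, from dominating the score; that choice is also an assumption of this sketch.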

Oral and Nasal Spectral Outputs in Korean Oral Vowels

  • 홍기환;최승철;김범규;양윤수;심현아
    • 음성과학 / Vol. 10, No. 2 / pp.145-157 / 2003
  • Vowels are classified by the shape of the vocal tract. These shapes form constriction points along the tract, which influence vocal tract resonances such as F1, F2, and F3. The formant frequency is influenced by the aperture and placement of the tongue, and the intensity is influenced by subglottal air pressure. The aim of this study was to compare and characterize the oral and nasal spectral outputs in terms of the formant frequencies and intensity of Korean oral vowels. The subjects were 20 normal persons (10 male and 10 female) without laryngeal pathology. The speech samples were the Korean oral vowels /a/, /e/, /i/, /o/, and /u/. The spectrum of each vowel was analysed with Nasal View and the Real Analysis program, using Dr. Speech. The results showed that the nasal intensity decreased markedly from F1 to F2, whereas the oral intensity and the overall intensity decreased only slightly from F1 to F2. Most nasal formant frequency values were similar to, or slightly smaller than, the oral formant frequency and overall formant frequency values.

A Study on the Formant Analysis of Korean Monophthongs and their Resonance Effect in Vocal Tract

  • 신현재;윤석왕
    • 한국음향학회지 / Vol. 6, No. 2 / pp.30-37 / 1987
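  • The acoustic characteristics of Korean monophthongs were examined by formant analysis, taking the fundamental frequency and its harmonics into account, and the correlation between resonance in the vocal tract and the formant frequencies was investigated. A male speaker majoring in vocal music pronounced 12 Korean monophthongs for 3 seconds each at five fundamental frequencies, and frequency spectra were obtained with an FFT spectrum analyzer. The formant analysis showed that the first formant arises from the resonance of the pharyngeal cavity and the second formant from that of the oral cavity, and that lip rounding lowers the second formant frequency. With the first and second formant frequencies alone, it was difficult to clearly distinguish the acoustic differences between [ə] and [ʌ] for "어", between [a] and [ɑ] for "아", and between "에" and "애".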

Nasometric and Acoustic Analysis in Experimentally Induced Velopharyngeal Insufficiency in Human

  • 윤자복;성명훈;정원호;김광현
    • 대한후두음성언어의학회지 / Vol. 8, No. 2 / pp.210-216 / 1997
  • Many tools have been used to evaluate the voice abnormalities of velopharyngeal insufficiency (VPI). The aim of this study was to obtain an objective evaluation method for VPI by comparing the acoustic and nasalance data of an experimentally induced VPI group with those of a normal control group. Ten healthy young men were included in this study. Mild and severe VPI were experimentally induced by retracting velopharyngeal movement. Using the nasometer, we obtained the nasalance scores of the sustained oral vowels and of three types of nasometer passages, and the slope scores of the nasograms of nasal words. We also analysed the changes in formant frequencies for the sustained oral vowels and the changes in various parameters of hypernasality using a computerized speech analysis system. The nasalance score of sustained /a/ increased significantly in VPI conditions. There was no change in the slope score of the nasogram. In the acoustic speech analysis, the second formant frequencies of the vowels /e/ and /i/ decreased significantly in VPI conditions. These results suggest that the measurement of nasalance score and formant frequency might be useful in the evaluation of VPI.

Analysis of Singer's Formant & Close Quotient During Change of the Larynx Position

  • 남도현;최성희;최재남;전석필;최홍식
    • 대한후두음성언어의학회지 / Vol. 15, No. 2 / pp.98-111 / 2004
  • Background and Objectives: The purpose of this study is to analyze the differences in fundamental frequency (Hz), closed quotient (Qx; %), intensity (dB), vocal tract length and width (cm), formant frequency (Hz), and formant level (dB) depending on the larynx position. Materials and Methods: One professional male singer (career: 28 years) produced the sustained vowels /a/, /e/, /i/, /o/, /u/ in two larynx positions (higher, lower), captured with Dr. Speech, and video fluoroscopy was used to quantify the vocal tract morphology. Results: In the lower larynx position, CQ increased by 9.8%, intensity increased by about 10%, and the level of the formant frequencies increased. The vocal tract was also 2.4 cm longer and wider (anterior width: 0.4 cm, lateral width: 0.2 cm) than in the higher larynx position. Conclusions: The singer's formant shows a prominent spectral envelope peak near 2400-2600 Hz, produced by the clustering of F3, F4 and F5 near 3400 Hz, in the lower larynx position.

Recognition of Korean Isolated Digits Using a Pole-Zero Model

  • 김순협;박규태
    • 대한전자공학회논문지 / Vol. 25, No. 4 / pp.356-365 / 1988
  • In this paper, we describe an isolated-word recognition system for Korean isolated digits based on a voiced-unvoiced decision algorithm and frequency-domain analysis. The algorithm first performs a voiced-unvoiced decision on the beginning part of each uttered word, using the normalized log energy and zero crossing rate as decision parameters. Based on this decision, each word is assigned to one of two classes. To identify the uttered word within each class, a dynamic time warping algorithm is applied using formant frequencies as the basis for the distance measure. We exploit pole-zero analysis to measure the formant frequencies in each frame. We have observed that pole-zero analysis can provide more accurate estimates of formant frequencies than analysis based on poles only. An experimental recognition rate of 97.3% was achieved, illustrating the performance of the recognition system.
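
The two stages named in this abstract, a voiced-unvoiced decision from log energy and zero crossing rate followed by dynamic time warping over formant frequencies, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' system: samples are assumed to be scaled to [-1, 1], the thresholds in `is_voiced` are hypothetical, the energy is not normalized beyond that scaling, and the formant vectors are taken as given (the pole-zero analysis itself is not shown).

```python
import math


def log_energy(frame):
    """Log (dB) of the mean squared amplitude of one frame."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))  # floor avoids log(0)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_voiced(frame, energy_floor_db=-30.0, zcr_ceiling=0.25):
    """Hypothetical rule: voiced frames have high energy and few crossings."""
    return (log_energy(frame) > energy_floor_db
            and zero_crossing_rate(frame) < zcr_ceiling)


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of formant vectors.

    Each element is a tuple such as (F1, F2) in Hz; the local cost is the
    Euclidean distance between the two vectors.
    """
    def local(u, v):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))

    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = local(seq_a[i - 1], seq_b[j - 1]) + best_prev
    return cost[n][m]
```

In a recognizer of this kind, the voiced-unvoiced decision would first select one of the two word classes, and the unknown word would then be assigned to the reference template in that class with the smallest `dtw_distance`.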

A Study on the Correlation Between Sasang Constitution and Sound Characteristics Used Harmonics and Formant Bandwidth

  • 박성진;김달래
    • 사상체질의학회지 / Vol. 16, No. 1 / pp.61-73 / 2004
  • This study investigated the correlation between Sasang constitutional groups and voice characteristics using a voice analysis system (in this study, CSL). I focused on voice characteristics in terms of harmonics, formant frequency and formant bandwidth. The subjects were 71 males. I classified them into three groups: a Soeumin group, a Soyangin group and a Taeumin group. Constitution was classified in two ways: the QSCCII (Questionnaire for the Sasang Constitution Classification II) and an interview with a specialist in Sasang Constitution. The 71 people were thus categorized into 31 Soeumin, 18 Soyangin and 22 Taeumin. Pitch is approximately equivalent to the fundamental frequency (F0) of the voice. Shimmer in dB evaluates the period-to-period variability of the peak-to-peak amplitude within the analyzed voice sample. The FFT (Fast Fourier Transform) method in CSL can display sampled voices as harmonics. H1 is the first peak and H2 is the second peak in the harmonics. The amplitude difference between H1 and H2 (H1-H2) can be interpreted as reflecting the speaker's phonation type, and formant frequency and bandwidth as reflecting the speaker's vocal tract. I therefore examined the harmonics and the formant frequency and bandwidth as voice parameters. First, I captured /e/ voices from all subjects using a microphone, and then analyzed them with CSL. Power Spectrum and Formant History are the CSL menus that display harmonics and formant frequency and bandwidth. The results on the correlation between Sasang constitutional groups and voice parameters are as follows: 1. There is no significant difference in harmonic amplitude (H1-H2) among the three groups. 2. There is a significant difference between the Soeumin group and the Soyangin group in Formant Frequency 1 and Formant Bandwidth 1 (p<0.05). No other parameters show a significant difference.
I assume that the Soyangin group has a clearer and brighter voice than the Soeumin group according to the formant bandwidth difference, and I think this result is consistent with the content of "Dongyi Suse Bowon" and "Sasangimhejinam".
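
The amplitude difference between the first two harmonics described in this abstract can be illustrated with a short sketch. It is not the CSL procedure: instead of reading peaks from an FFT display, it assumes the fundamental frequency is already known and projects the signal onto the first and second harmonic directly (a single-frequency DFT). The function names and the synthetic test signal are assumptions for illustration.

```python
import cmath
import math


def harmonic_level_db(signal, sample_rate, freq_hz):
    """Level (dB) of the sinusoidal component at freq_hz.

    Uses a direct single-frequency DFT projection; the estimate is exact
    when the signal spans a whole number of periods of freq_hz.
    """
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate)
              for k, x in enumerate(signal))
    amplitude = 2.0 * abs(acc) / n
    return 20.0 * math.log10(max(amplitude, 1e-12))  # floor avoids log(0)


def h1_minus_h2(signal, sample_rate, f0_hz):
    """Difference in dB between the first and second harmonic levels."""
    return (harmonic_level_db(signal, sample_rate, f0_hz)
            - harmonic_level_db(signal, sample_rate, 2.0 * f0_hz))


# Synthetic example: second harmonic at half the amplitude of the first.
fs, f0 = 8000, 200.0
tone = [math.sin(2 * math.pi * f0 * k / fs)
        + 0.5 * math.sin(2 * math.pi * 2 * f0 * k / fs)
        for k in range(800)]
print(round(h1_minus_h2(tone, fs, f0), 2))  # 6.02
```

For recorded vowels, F0 would have to be estimated first and the analysis window chosen with care, since the projection is only exact over whole periods.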

Sound Analysis of Cleft Palate Patients Using Formant Position

  • 김덕원;송철규
    • 대한의용생체공학회:의공학회지 / Vol. 11, No. 2 / pp.283-288 / 1990
  • As one of the main purposes of the physical management of cleft palate is to provide the anatomic and physiologic requisites for speech, speech must be one of the criteria for determining when physical management has been achieved. However, there is no objective method for evaluating the speech of cleft palate patients. The authors analyzed the speech of adult cleft palate patients using sound spectrography and compared it with that of normal adults. The results were as follows: 1. In vowels, cleft palate patients of both sexes showed a reduction in the frequency of the first and second formants compared with normal adults. There was minimal difference in the front vowels (i, e, ae). 2. In consonants, cleft palate patients of both sexes showed a reduction in the frequency of the first formant, but a reduction in the frequency of the second formant was observed only in female patients. 3. There was no statistical difference in the sound spectrographs among plosive, fricative, affricate, nasal, and glide consonants.

A study about five-sounds (Gong, Sang, jiao, zhi, yu) of Sasang constitutional sound analysis

  • 김달래
    • 사상체질의학회지 / Vol. 15, No. 1 / pp.50-59 / 2003
  • Purpose: The sounds of five animals, which are classified under the five sounds (Gong, Sang, jiao, zhi, yu), are compared with the musical scale, looking for similarity between the five animals' sounds and the musical scale. Methods (recording machine): 1. The sounds of five animals (cattle, horse, pheasant, pig, sheep) were recorded on tape. 2. The recordings were transferred to a CSL (Computerized Speech Lab). 3. They were analysed for pitch, formants 1, 2, 3, and energy. 4. The analysed results (ratios of pitch, formants 1, 2, 3, and energy) for the five animals were calculated and compared with the five-note musical scale (five sounds). Result: The ratios of the five animals' sounds are not consistent with the musical scale in any of the five items (pitch, formants 1, 2, 3, energy). Conclusion: 1. The five-note musical scale has no similarity with the five animals' sounds. 2. The five sounds are presumed to originate from the theoretical background of five-going and to have no relation to the five animals' sounds.
