• Title/Summary/Keyword: formant parameters


Efficient Tracking of Speech Formant Using Closed Phase WRLS-VFF-VT Algorithm

  • Lee, Kyo-Sik;Park, Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.19 no.2E / pp.8-13 / 2000
  • In this paper, we present an adaptive formant tracking algorithm for speech using the closed-phase WRLS-VFF-VT method. The pitch-synchronous closed-phase method is known to give more accurate estimates of the vocal tract parameters than the pitch-asynchronous method. However, the use of pitch-synchronous closed-phase analysis has been limited by the difficulty of accurately isolating the closed-phase region in successive periods of speech. We have therefore implemented the pitch-synchronous closed-phase WRLS-VFF-VT algorithm for speech analysis, especially for formant tracking. The proposed algorithm with the variable threshold (VT) provides superior performance at the boundaries between phones and between voiced and unvoiced sounds. The proposed method is experimentally compared with other methods, such as the two-channel CPC method, using synthetic waveforms and real speech data. From the experimental results, we found that block data processing techniques, such as the two-channel CPC, gave reasonable estimates of the formants/antiformants. However, the data windows used by these methods included the effects of the periodic excitation pulses, which affected the accuracy of the estimated formants. In contrast, the proposed WRLS-VFF-VT method, which eliminates the influence of the pulse excitation by using input estimation as part of the algorithm, gave very accurate formant/bandwidth estimates and good spectral matching.
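
The recursive least squares idea behind this entry can be illustrated with a much simpler relative of the paper's method. The sketch below is only an assumption-laden toy: it uses a fixed forgetting factor and a two-coefficient autoregressive model, and it does not reproduce the variable forgetting factor, the variable threshold, the input estimation, or the closed-phase selection described in the abstract. The names `rls_ar2`, `pole_to_formant`, and `synth_resonance`, and the parameters `forgetting` and `delta`, are hypothetical.

```python
import math

def rls_ar2(signal, forgetting=0.98, delta=1e6):
    """Recursively fit y[n] = a1*y[n-1] + a2*y[n-2]; returns [a1, a2]."""
    theta = [0.0, 0.0]                      # coefficient estimates
    P = [[delta, 0.0], [0.0, delta]]        # inverse correlation matrix
    for n in range(2, len(signal)):
        x = [signal[n - 1], signal[n - 2]]  # regressor of past samples
        Px = [P[0][0] * x[0] + P[0][1] * x[1],
              P[1][0] * x[0] + P[1][1] * x[1]]
        denom = forgetting + x[0] * Px[0] + x[1] * Px[1]
        gain = [Px[0] / denom, Px[1] / denom]
        err = signal[n] - (theta[0] * x[0] + theta[1] * x[1])
        theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
        # P <- (P - gain * x^T P) / forgetting; x^T P equals Px^T because P is symmetric.
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(2)]
             for i in range(2)]
    return theta

def pole_to_formant(a1, a2, fs):
    """Map the complex pole pair of the AR(2) model to (frequency, bandwidth) in Hz."""
    if a2 >= 0 or a1 * a1 + 4 * a2 >= 0:
        raise ValueError("coefficients do not describe a complex pole pair")
    radius = math.sqrt(-a2)
    angle = math.acos(a1 / (2 * radius))
    return angle * fs / (2 * math.pi), -math.log(radius) * fs / math.pi

def synth_resonance(freq, bandwidth, fs, n_samples):
    """Impulse response of a single resonance, used as noise-free test data."""
    radius = math.exp(-math.pi * bandwidth / fs)
    a1 = 2 * radius * math.cos(2 * math.pi * freq / fs)
    a2 = -radius * radius
    y = [1.0, a1]
    while len(y) < n_samples:
        y.append(a1 * y[-1] + a2 * y[-2])
    return y

# Illustrative use on a synthetic 500 Hz resonance with an 80 Hz bandwidth.
fs = 8000
a1, a2 = rls_ar2(synth_resonance(500.0, 80.0, fs, 200))
freq_hz, bandwidth_hz = pole_to_formant(a1, a2, fs)
```

Real speech would need a higher model order (one pole pair per formant) and some handling of the excitation pulses, which is the problem the abstract's input estimation addresses.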


An Analysis of Phonetic Parameters for Individual Speakers (개별화자 음성의 특징 파라미터 분석)

  • Ko, Do-Heung
    • Speech Sciences / v.7 no.2 / pp.177-189 / 2000
  • This paper investigates how individual speakers' speech can be distinguished using acoustic parameters such as amplitude, pitch, and formant frequencies. Word samples from fifteen male speakers in their 20s from three different regions were recorded in two different modes (i.e., casual and clear speech) in quiet settings, and were analyzed with a Praat macro script. In order to determine individual speakers' acoustical values, the total duration of voicing segments was measured at five different time points. Results showed a high correlation between the formant frequencies $F_1$ and $F_2$ among the speakers, although there was little correlation between amplitude and pitch. Statistical grouping shows that individual speakers' voices were not reflected in regional dialects for either casual or clear speech. In addition, the difference between maximum and minimum amplitude was about 10 dB, which is a perceptually audible difference. These acoustic data can give some meaningful guidelines for implementing algorithms for speaker identification and speaker verification.
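
A correlation between two acoustic parameters, such as the one reported here between $F_1$ and $F_2$, is often computed as a Pearson coefficient. The abstract does not say which coefficient was used, so the following is a minimal sketch under that assumption; the function name `pearson_r` and the sample values are hypothetical and are not data from the paper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)

# Made-up per-speaker mean formant values in Hz, for illustration only.
f1_means = [520.0, 560.0, 600.0, 640.0]
f2_means = [1450.0, 1500.0, 1580.0, 1610.0]
r_f1_f2 = pearson_r(f1_means, f2_means)
```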


Recognition of Korean Isolated Digits Using a Pole-Zero Model (Polo-Zero 모델을 이용한 한국어 단독 숫자음 인식)

  • ;;Alan Conrad Bovik
    • Journal of the Korean Institute of Telematics and Electronics / v.25 no.4 / pp.356-365 / 1988
  • In this paper, we describe an isolated-word recognition system for Korean isolated digits based on a voiced-unvoiced decision algorithm and a frequency domain analysis. The algorithm first performs a voiced-unvoiced decision procedure for the beginning part of each uttered word using the normalized log energy and zero crossing rate as decision parameters. Based on this decision, each word is assigned to one of two classes. In order to identify the uttered word within each class, a dynamic time warping algorithm is applied using formant frequencies as the basis for the distance measure. We exploit a pole-zero analysis to measure formant frequencies in each frame. We have observed that pole-zero analysis can provide more accurate estimation of formant frequencies than analysis based on poles only. An experimental recognition rate of 97.3% was achieved, illustrating the performance of the recognition system.
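
The template-matching step described in this abstract can be sketched as textbook dynamic time warping over per-frame formant vectors. This is an illustration under stated assumptions, not the authors' system: the Euclidean frame distance, the unconstrained step pattern, and the names `formant_distance`, `dtw_distance`, and `classify` are all hypothetical choices.

```python
import math

def formant_distance(frame_a, frame_b):
    """Euclidean distance between two formant vectors, e.g. (F1, F2) in Hz."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)))

def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment of two frame sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = formant_distance(seq_a[i - 1], seq_b[j - 1])
            # Allow a match, an insertion, or a deletion at each step.
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]

def classify(utterance, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))

# Made-up (F1, F2) tracks for two reference words, for illustration only.
templates = {
    "word_a": [(300.0, 2200.0), (500.0, 1500.0), (700.0, 1200.0)],
    "word_b": [(700.0, 1200.0), (500.0, 1000.0), (350.0, 800.0)],
}
slow_word_a = [(300.0, 2200.0), (300.0, 2200.0), (500.0, 1500.0),
               (700.0, 1200.0), (700.0, 1200.0)]
best_label = classify(slow_word_a, templates)
```

Because repeated frames align at zero cost, a slower rendition of the same track still matches its template exactly, which is the property that makes time warping useful for isolated-word matching.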


Voice Similarities between Brothers

  • Ko, Do-Heung;Kang, Sun-Mee
    • Speech Sciences / v.9 no.2 / pp.1-11 / 2002
  • This paper aims to provide a guideline for modelling speaker identification and speaker verification by comparing voice similarities between brothers. Five pairs of brothers who are believed to have similar voices participated in this experiment. Before the experiment was conducted, perceptual tests were carried out to check whether the brothers' voices were similar. The words were measured both in isolation and in context, and the subjects were asked to read them five times with an interval of about three seconds between readings. Recordings were made at natural speed in a quiet room. The data were analyzed for pitch and formant frequencies using CSL (Computerized Speech Lab), PCQuirer and MDVP (Multi-Dimensional Voice Program). It was found that the data for the initial vowels are much more similar and homogeneous than those for vowels in other positions. The acoustic data showed that voice similarities are strikingly high in both pitch and formant frequencies. It was also found that the correlation coefficient between the parameters above was not significant.


How to Express Emotion: Role of Prosody and Voice Quality Parameters (감정 표현 방법: 운율과 음질의 역할)

  • Lee, Sang-Min;Lee, Ho-Joon
    • Journal of the Korea Society of Computer and Information / v.19 no.11 / pp.159-166 / 2014
  • In this paper, we examine the role of emotional acoustic cues including both prosody and voice quality parameters for the modification of a word sense. For the extraction of prosody parameters and voice quality parameters, we used 60 pieces of speech data spoken by six speakers with five different emotional states. We analyzed eight different emotional acoustic cues, and used a discriminant analysis technique in order to find the dominant sequence of acoustic cues. As a result, we found that anger has a close relation with intensity level and 2nd formant bandwidth range; joy has a relative relation with the position of 2nd and 3rd formant values and intensity level; sadness has a strong relation only with prosody cues such as intensity level and pitch level; and fear has a relation with pitch level and 2nd formant value with its bandwidth range. These findings can be used as a guideline for fine-tuning an emotional spoken language generation system, because these distinct sequences of acoustic cues reveal the subtle characteristics of each emotional state.

The change of vowel characteristics for the dysarthric speech along with speaking style (경도 마비말장애 환자의 발화 유형에 따른 모음 특성 비교)

  • Kim, Jiyoun;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.8 no.3 / pp.51-59 / 2016
  • The purpose of the present study is to examine differences between habitual speech (HS) and clear speech (CS) in individuals with mild dysarthria. Twelve speakers with mild dysarthria and twelve healthy control speakers read sentences in two speaking styles. Formant and intensity related values, triangular area, and center of gravity of /a/, /i/, and /u/ were measured. In addition, formant-ratio variables such as vowel space area (VSA), vowel articulatory index (VAI), formant centralization ratio (FCR) and F2i/F1u ratio (F2 ratio) were calculated. The results of repeated-measures ANOVA showed a significant difference in F2 of vowel /i/ and F2 energy of vowel /a/ between groups. Regarding formant energy, F2 energy of vowel /a/ was observed to be a meaningful variable between speaking styles. There were significant speaking style-by-group interactions for F2 energy of vowel /a/. These findings indicate that the current parameters could meaningfully discriminate the healthy group from the mild dysarthria group and that speakers with dysarthria had a larger clear speech benefit than healthy talkers. We also claim that various acoustic changes in clear speech may contribute to improving vowel intelligibility.
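
The abstract names VSA, VAI, and FCR without giving formulas. The sketch below assumes the definitions commonly used in the dysarthria literature: a triangular area in the F1/F2 plane spanned by /a/, /i/, and /u/, and two formant ratios that are reciprocals of each other. The function name `vowel_metrics` and the formant values are hypothetical, and the paper may have computed these measures differently.

```python
def vowel_metrics(f1_a, f2_a, f1_i, f2_i, f1_u, f2_u):
    """Corner-vowel measures from F1/F2 (Hz) of /a/, /i/, /u/ (assumed definitions)."""
    # Triangle area in the F1/F2 plane via the shoelace formula, in Hz^2.
    vsa = 0.5 * abs(f1_i * (f2_a - f2_u)
                    + f1_a * (f2_u - f2_i)
                    + f1_u * (f2_i - f2_a))
    # Formant centralization ratio: grows as vowels move toward the center.
    fcr = (f2_u + f2_a + f1_i + f1_u) / (f2_i + f1_a)
    # Vowel articulatory index: the reciprocal of FCR.
    vai = (f2_i + f1_a) / (f1_i + f1_u + f2_u + f2_a)
    return {"vsa": vsa, "fcr": fcr, "vai": vai}

# Made-up formant values in Hz, for illustration only.
metrics = vowel_metrics(f1_a=800.0, f2_a=1300.0,
                        f1_i=300.0, f2_i=2300.0,
                        f1_u=350.0, f2_u=800.0)
```

With these definitions a smaller VSA and a larger FCR both point to more centralized vowels.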

A Study on the Formant Analysis of Korean Monophthongs and their Resonance Effect in Vocal Tract (한글 단모음의 포만트 분석과 성도내의 공명효과에 관한 연구)

  • Sin, Hyeon-Jae;Yun, Seok-Wang
    • The Journal of the Acoustical Society of Korea / v.6 no.2 / pp.30-37 / 1987
  • Twelve Korean monophthongs were studied by formant analysis; fundamental frequencies and their harmonics were considered as the parameters of analysis. The analyzed data were twelve Korean monophthongs pronounced at five fundamental frequencies by five male vocal musicians. The study shows that the first and second formants are characterized by the resonance of the cavities of the pharynx and the mouth, respectively. The lip rounding effect decreases the second formant frequency. The phoneme pairs $[a]/[\alpha]$, $[e]/[\varepsilon]$, and $[\partial]/[\Lambda]$ were not distinguished well in this formant analysis.


Nasometric and Acoustic Analysis in Experimentally Induced Velopharyngeal Insufficiency in Human (사람에서 유발시킨 구개인두부전증의 비음도와 음향학적 분석)

  • 윤자복;성명훈;정원호;김광현
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.8 no.2 / pp.210-216 / 1997
  • Many tools have been used to evaluate the voice abnormalities of velopharyngeal insufficiency (VPI). The aim of this study was to obtain an objective evaluation method for VPI by comparing the acoustic and nasalance data of an experimentally induced VPI group with those of a normal control group. Ten healthy young men were included in this study. Mild and severe VPI were experimentally induced by retracting velopharyngeal movement. Using the nasometer, we obtained the nasalance scores of the sustained oral vowels and of three types of nasometer passages, and the slope scores of the nasogram for nasal words. We also analysed the changes in formant frequencies for the sustained oral vowels and the changes in various parameters of hypernasality using a computerized speech analysis system. The nasalance score of sustained /a/ increased significantly in VPI conditions. There were no changes in the slope score of the nasogram. In the acoustic speech analysis, the second formant frequencies of the vowels /e/ and /i/ decreased significantly in VPI conditions. These results suggest that the measurement of nasalance score and formant frequency might be useful in the evaluation of VPI.


An Experiment of a Spoken Digits-Recognition System (숫자음성 자동 인식에 관한 일실험)

  • ;安居院猛
    • Journal of the Korean Institute of Telematics and Electronics / v.15 no.6 / pp.23-28 / 1978
  • This paper describes a speech recognition system for ten isolated spoken digits. In this system, acoustic parameters such as zero crossing rate, log energy and three formant frequencies estimated by the linear prediction method were extracted for classification and/or recognition purposes. The former two parameters were used for the classification of unvoiced consonants and the latter for the recognition of vowels and voiced consonants. Promising recognition results were obtained in this experiment for ten digit utterances spoken by a male speaker.
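
The two frame-level parameters used here for separating unvoiced consonants can be sketched as follows. This is a minimal illustration, not the system from the paper: the function names, the decision rule in `looks_voiced`, and its thresholds are hypothetical and would need tuning on real recordings.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def log_energy(frame, floor=1e-12):
    """Mean-square energy of the frame in dB; the floor avoids log(0)."""
    return 10.0 * math.log10(sum(s * s for s in frame) / len(frame) + floor)

def looks_voiced(frame, max_zcr=0.25, min_energy_db=-30.0):
    """Toy rule: voiced frames tend to have a low crossing rate and high energy."""
    return zero_crossing_rate(frame) < max_zcr and log_energy(frame) > min_energy_db

# Synthetic 30 ms frames at 8 kHz: a 100 Hz tone and a weak alternating signal.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(240)]
hiss = [0.01 * (-1) ** n for n in range(240)]
```

The tone stands in for a voiced frame and the rapidly alternating signal for a fricative-like frame; real speech sits between these extremes, which is why thresholds are usually set from labelled data.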


Analysis and Comparisons of Acoustical Characteristics of Pathologic Voice before and after Surgery (후두질환에 대한 술전 술후 음성의 음향적 특성비교 분석)

  • Kim, Dae-Hyun;Jo, Cheol-Woo;Baek, Moo-Jin;Wang, Soo-Geun
    • Speech Sciences / v.7 no.3 / pp.285-294 / 2000
  • In this paper, the acoustic characteristics of pathological voice, measured before and after surgical operation, are compared. This experiment is conducted for the purpose of predicting patients' speech after operation. The voices are recorded from the same patients. Jitter, shimmer and other parameters are computed and their statistical characteristics are compared. Spectral changes, such as formant frequency shift and spectral slope change, are also compared. The experimental results verify that not only source characteristics but also vocal tract components vary. This indicates that modifying the source parameters alone is not enough for the prediction. The result also indicates that the operation changes both the physical shape of the vocal folds and the manner of articulation.
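
Jitter and shimmer have several definitions, and the abstract does not say which one was used. The sketch below assumes the common "local" variant: the mean absolute difference between consecutive cycles divided by the mean value, applied to pitch periods for jitter and to peak amplitudes for shimmer. The function name `local_perturbation` and the sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

# Made-up cycle measurements, for illustration only.
periods_s = [0.0100, 0.0102, 0.0098, 0.0100]   # pitch periods in seconds
peak_amplitudes = [0.80, 0.82, 0.78, 0.80]     # per-cycle peak amplitudes

jitter_percent = local_perturbation(periods_s)
shimmer_percent = local_perturbation(peak_amplitudes)
```

Comparing these values before and after surgery for the same patient would capture only the source-related change; the abstract's point is that vocal tract measures such as formant shifts change as well.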
