• Title/Summary/Keyword: fundamental frequency of speech


Acoustic Analysis of Reinke Edema (라인케부종환자의 음성분석)

  • 김상균;최홍식;공석철;홍원표
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.7 no.1 / pp.11-19 / 1996
  • Reinke's edema refers to varying degrees of chronic swelling of the vocal folds. Acoustic analysis of Reinke's edema has not previously been reported in this country. The purpose of this study is to clarify the acoustic and aerodynamic characteristics of Reinke's edema. Several acoustic and aerodynamic evaluations were performed in 20 patients with Reinke's edema, and the data were compared with those of 20 normal controls. Videolaryngoscopy was also performed to grade severity. We used C-Speech, Dr. Speech Science, and a phonatory function analyzer. With C-Speech, we compared jitter, shimmer, and signal-to-noise ratio (SNR) between normal subjects and Reinke's edema patients. With Dr. Speech Science, we compared NNE (glottal noise energy), speech fundamental frequency, and voice quality between the two groups. With the phonatory function analyzer, used for aerodynamic function testing, we compared speech intensity, airflow rate, and expiratory pressure between the two groups. In conclusion, Reinke's edema patients showed lower voice pitch than normal controls; in addition, jitter, shimmer, SNR, NNE, airflow rate, and expiratory pressure may be meaningful parameters for diagnosis and for treatment prognosis.

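The abstract above compares jitter and shimmer between groups but does not say how C-Speech computes them. As a rough sketch only, the widely used "local" definitions (mean absolute cycle-to-cycle difference divided by the mean) can be written as follows. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles, as a percentage of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def jitter_percent(periods_s):
    """Local jitter: cycle-to-cycle variation of the glottal period (seconds)."""
    return local_perturbation_percent(periods_s)


def shimmer_percent(peak_amplitudes):
    """Local shimmer: cycle-to-cycle variation of the peak amplitude."""
    return local_perturbation_percent(peak_amplitudes)


def mean_f0_hz(periods_s):
    """Mean fundamental frequency as the reciprocal of the mean period."""
    return 1.0 / mean(periods_s)


# Hypothetical cycle measurements, invented for illustration only.
periods = [0.009, 0.011, 0.009, 0.011]
amplitudes = [0.80, 0.82, 0.79, 0.81]
print(jitter_percent(periods), shimmer_percent(amplitudes), mean_f0_hz(periods))
```

Commercial analyzers may apply smoothing or use different normalizations, so values from this sketch are not directly comparable with those reported in the study.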

Korean Forced Sounds Revisited (경음재론(硬音再論))

  • Umeda, Hiroyuki
    • Speech Sciences / v.6 / pp.25-32 / 1999
  • Korean forced sounds have interested not only Korean scholars but also many phoneticians, who have studied them from the viewpoint of general phonetics as a way to clarify the production process and the physical characteristics of speech sounds. The author also tried to elucidate the characteristics of Korean forced sounds using the sound spectrograph (Umeda & Umeda 1965). In the 30 years since that study, many scholars have analyzed these sounds from various standpoints, producing a large body of literature. In this paper, the author critically examines the results of previous studies on this subject dealing with VOT, spectral characteristics, fundamental frequency, glottal states observed in fiber-optic investigations, and patterns of tongue-palate contact observed with a dynamic palatograph. These points are discussed in relation to the author's field of investigation.


The Effects of Vertical Laryngeal Movements on the Vocal Folds (후두 수직운동이 성대에 미치는 영향)

  • Hong, Ki-Hwan;Kim, Hyun-Ki
    • Speech Sciences / v.1 / pp.261-274 / 1997
  • In spite of the presumed importance of the strap muscles in laryngeal valving and speech production, there is little research concerning the physiological role of the strap muscles and the functional differences among them. Generally, the strap muscles have been shown to cause a decrease in the fundamental frequency (Fo) of phonation during contraction. In this study, an in vivo canine laryngeal model was used to show the effects of the strap muscles on laryngeal function by measuring Fo, subglottal pressure, vocal intensity, vocal fold length, cricothyroid distance, and vertical laryngeal movement. Results demonstrated that contraction of the sternohyoid and sternothyroid muscles corresponded to a rise in subglottal pressure, shortened cricothyroid distance, lengthened vocal fold, and raised Fo and vocal intensity. Contraction of the thyrohyoid muscle corresponded to lowered subglottal pressure, widened cricothyroid distance, shortened vocal fold, and lowered Fo and vocal intensity. It was postulated that the changes in Fo and the other variables after stimulation of the strap muscles are due to upward or downward laryngotracheal pulling and forward laryngotracheal bending caused by the external forces during strap muscle contraction.


Developing Sample Sentences for Voice Assessment of Koreans: A Preliminary Study (한국인을 위한 음도진단용 표준문장개발연구: 예비측정 및 분석결과)

  • Kim Soo-Jin;Moon Seung-Jae;Shin Jiyoung
    • Proceedings of the KSPS conference / 2003.05a / pp.109-112 / 2003
  • This paper aims to develop sample paragraphs for voice pitch assessment designed specifically for Koreans. The demand for such a battery of sample sentences has recently been increasing steadily among Korean speech therapists. In this paper, different sample paragraphs (two conventionally used paragraphs and three newly developed ones, which consist mainly of sonorant sounds and different types of sentences), different software packages (Dr. Speech, Wavesurfer, Praat), and different techniques (automatic measurement, and detailed measurement in which the researcher controls many aspects that might influence the measurement of pitch) are compared for measuring fundamental frequency.

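The abstract above contrasts automatic and researcher-controlled pitch measurement. To illustrate the kind of settings a researcher might control, here is a minimal autocorrelation-based F0 estimator for a single voiced frame. It is a textbook sketch, not the algorithm used by Dr. Speech, Wavesurfer, or Praat; the function name and the default search range (75 to 500 Hz) are assumptions chosen for illustration.

```python
import math


def estimate_f0_autocorr(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 (Hz) of one voiced frame from the strongest autocorrelation peak."""
    min_lag = int(sample_rate / f0_max)  # shortest period considered
    max_lag = int(sample_rate / f0_min)  # longest period considered
    if len(samples) <= max_lag:
        raise ValueError("frame too short for the requested F0 range")
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# Synthetic test signal: a 200 Hz sine sampled at 8 kHz (not speech data).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
print(estimate_f0_autocorr(frame, sample_rate))
```

Narrowing `f0_min` and `f0_max` to a speaker's expected range is one of the manual controls that can reduce octave errors in real recordings.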

Multi-dimensional Representation and Correlation Analyses of Acoustic Cues for Stops (폐쇄음 음향 단서의 다차원 표현과 상관관계 분석)

  • Yun, Weon-Hee
    • MALSORI / v.55 / pp.45-60 / 2005
  • The purpose of this paper is to represent values of acoustic cues for Korean oral stops in a multi-dimensional space, and to look for possible relationships among the acoustic cues through correlation analyses. The acoustic cues used to differentiate the three types of Korean stops are closure duration, voice onset time, and fundamental frequency of the vowel following the stop. The values of these cues are plotted in two- and three-dimensional space to see which cues are critical for separating the different types of stops. Correlation coefficient analyses show that a multivariate approach to statistical analysis is legitimate, and that there are statistically significant relationships among the acoustic cues, but they are not strong enough to support the conjecture that there is a relationship among the articulatory or laryngeal mechanisms underlying the acoustic cues.

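The correlation analysis described above can be illustrated with a plain Pearson coefficient computed for each pair of cues. This is a sketch under stated assumptions: the abstract does not specify which correlation statistic was used, and the cue values below are made-up numbers, not data from the paper.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def pairwise_correlations(cues):
    """Return {(cue_a, cue_b): r} for every pair of named cues."""
    return {(a, b): pearson_r(cues[a], cues[b]) for a, b in combinations(cues, 2)}


# Hypothetical per-token measurements, invented for illustration only.
cues = {
    "closure_ms": [120.0, 95.0, 80.0, 60.0, 110.0],
    "vot_ms": [12.0, 35.0, 60.0, 85.0, 20.0],
    "f0_hz": [210.0, 225.0, 250.0, 265.0, 215.0],
}
for pair, r in pairwise_correlations(cues).items():
    print(pair, round(r, 3))
```

A significant but moderate coefficient, as the abstract reports, would indicate that the cues covary without one being predictable from another.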

Swapping Native and Non-Native Speakers' Prosody Using the PSOLA Algorithm

  • Yoon Kyu-Chul
    • Proceedings of the KSPS conference / 2006.05a / pp.77-81 / 2006
  • This paper presents a technique for imposing the prosodic features of a native speaker's utterance onto the same sentence uttered by a non-native speaker. Three acoustic aspects of the prosodic features were considered: the fundamental frequency (F0) contour, segmental durations, and the intensity contour. The fundamental frequency contour and the segmental durations of the native speaker's utterance were imposed on the non-native speaker's utterance by using the PSOLA (pitch-synchronous overlap and add) algorithm [1] implemented in Praat [2]. The intensity contour transfer was also done in Praat. The technique for transferring one or more of these prosodic features is described in detail, and its implications for language education are discussed.


F0 as a primary cue for signaling word-initial stops of Seoul Korean (서울 방언 어두 폐쇄음의 후속모음 F0)

  • Byun, Hi-Gyung
    • Phonetics and Speech Sciences / v.8 no.1 / pp.25-36 / 2016
  • Previous studies showed that the voice onset times (VOT) of aspirated and lenis stops have merged, and that post-stop fundamental frequency (F0) has emerged as a primary cue for distinguishing the two stops in the speech of the younger generation and of females. The purpose of this study is to demonstrate that the VOT merger of aspirated and lenis stops occurs after the F0 difference between the two stops has stabilized. In other words, unless post-stop F0, which is a redundant feature, is fully developed, the VOT merger is unlikely to happen. Females acquired a stable F0 difference in stops earlier than males. Therefore, the VOT merger could happen, and as a result, females could take the lead in the change from VOT to F0 in initial stops. This study also shows that speakers who have acquired F0 as a primary cue make full use of F0 to distinguish lenis stops from the two other stop types (aspirated and fortis).

The Acoustic Severity Index in the Pathologic Voice (음성장애에 대한 음향학적 중등도 지표)

  • Hong, Ki-Hwan;Kim, Hyun-Ki;Yang, Yoon-Soo
    • Speech Sciences / v.10 no.4 / pp.201-219 / 2003
  • Background: Perceptual assessment is generally performed by the voice specialist, while objective evaluation is performed in a voice laboratory. Research in voice laboratories has generated a variety of objective tests and parameters. Perceptual evaluation is one of the most controversial topics in voice research. A review of the literature reveals a wide variety of rating scales, with reliability data fluctuating from study to study. Unfortunately, there is no widely accepted valid method for classifying voice disorders and assessing outcome after voice treatment. Objectives: The goals of this research were to identify important objective acoustic parameters of vocal quality, and to establish an objective and quantitative correlate of perceived vocal quality. Materials and Methods: We evaluated analyzed voice data from 122 dysphonic patients and 20 normal volunteers. A Computerized Speech Lab (CSL) 4300B was used to analyze each voice sample. Results: Three dysphonia severity indices (DSI) were created using discriminant analysis. The DSI is based on a weighted combination of the following selected set of acoustic parameters: absolute jitter (Jita in μs), smoothed pitch period perturbation quotient (sPPQ in %), amplitude perturbation quotient (APQ in %), soft phonation index (SPI), average fundamental frequency (Fo in Hz), lowest fundamental frequency (Flo in Hz), and smoothed amplitude perturbation quotient (sAPQ in %). The DSI, being the discriminating rule calculated by logistic regression, consists of three equations based on statistically significant acoustic parameters. The three DSIs were created to best reflect the degree of hoarseness as expressed by G from the GRBAS scale. The more positive the DSI is for a patient, the worse the vocal quality; the more negative it is, the better the vocal quality. The effect of sex is included implicitly in DSI-1 and DSI-2, so separate versions of DSI-1 and DSI-2 for males and females are not needed. The DSI is objective because no perceptual input is required for its calculation. Conclusion: This research demonstrates that the voice function values calculated from three different multivariate objective dysphonia severity indices are significantly associated with subjective voice assessments. These multivariate objective dysphonia severity indices may be appropriate for use in clinical trials and outcomes research on treatment effectiveness for voice disorders.


A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System (가변 Break를 이용한 코퍼스 기반 일본어 음성 합성기의 성능 향상 방법)

  • Na, Deok-Su;Min, So-Yeon;Lee, Jong-Seok;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.2 / pp.155-163 / 2009
  • In text-to-speech systems, the conversion of text into prosodic parameters consists of three steps: the placement of prosodic boundaries, the determination of segmental durations, and the specification of fundamental frequency contours. Prosodic boundaries, as the most important and basic parameter, affect the estimation of durations and fundamental frequency. Break prediction is an important step in text-to-speech systems because break indices (BIs) have a great influence on how correctly prosodic phrase boundaries are represented. However, accurate prediction is difficult since BIs are often chosen according to the meaning of a sentence or the reading style of the speaker. In Japanese, the prediction of an accentual phrase boundary (APB) and a major phrase boundary (MPB) is particularly difficult. Thus, this paper presents a method to compensate for prediction errors of APBs and MPBs. First, we define a subtle BI, for which it is difficult to decide clearly between an APB and an MPB, as a variable break (VB), and an explicit BI as a fixed break (FB). The VB is chosen using a classification and regression tree, and multiple prosodic targets for pitch and duration are then generated. Finally, unit selection is conducted using the multiple prosodic targets. In the MOS test, the original speech scored 4.99, while the proposed method scored 4.25 and the conventional method scored 4.01. The experimental results show that the proposed method improves the naturalness of synthesized speech.

Acoustic Characteristics of Some Vowels Produced by the CI Children of Various Age Groups (인공와우 이식 시기에 따른 모음의 음향음성학적 특성)

  • Kim, Go-Eun;Ko, Do-Heung
    • Speech Sciences / v.14 no.4 / pp.203-212 / 2007
  • This study compared some acoustic characteristics of vowels produced by children with cochlear implants (CI) and children with normal hearing. Twenty CI subjects under ten years old were further classified into two groups (one group of children implanted under four years old and another group of children implanted over four years old). For the normal hearing group, 20 subjects participated in the experiment. Acoustic parameters including fundamental frequency (F0) and formant frequencies (F1, F2) were measured in the two groups according to the age at cochlear implant operation. For the CI group, three corner vowels (/a/, /i/, /u/) were recorded five times in isolation and analyzed with Multi-Speech (Kay Elemetrics, model 3700), and two independent t-tests on their formant data were conducted using SPSS 11.5. The results showed that the group implanted over four years of age differed significantly in F0 and F1 from both the group implanted under four years of age and the normal hearing group. The values for children implanted under four years old were closer to those of the children with normal hearing. As for F2, there was no significant difference among the implanted groups. However, the vowel space of the implanted groups, regardless of age at operation, was much smaller than that of the normal hearing children. These acoustic results suggest that CI surgery would be much more effective if it is done before the age of four.
