• Title/Summary/Keyword: fundamental frequency of speech

Synthesis and Evaluation of Prosodically Exaggerated Utterances

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.1 no.3 / pp.73-85 / 2009
  • This paper introduces a technique for synthesizing and evaluating human utterances with exaggerated or atypical prosody. Prosody exaggeration can be implemented by manipulating the fundamental frequency (F0) contour, the segmental durations, or the intensity contour of an utterance. Of these three prosodic elements, two or more can be exaggerated at the same time. Algorithms for synthesis and evaluation are proposed. Learner utterances exaggerated in each of the three prosodic features were evaluated against their original native versions in terms of the differences in their F0 contours, segmental durations, and intensity contours. The measure of difference was the Euclidean distance between the matching points in their F0 and intensity contours. The measure was calculated after the exaggerated learner utterances were aligned by segment and rendered identical to their native versions in terms of segmental durations. For the evaluation of the segmental durations, no prior modifications were made to the durations and the same measure was used. The results from the pilot experiment suggest the viability of this measure for evaluating learner utterances with atypical prosody against their native versions.
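
The distance measure described in this abstract can be illustrated with a minimal sketch. It assumes the two contours have already been aligned so that they hold the same number of matching points, as the abstract describes. The function name `contour_distance` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def contour_distance(learner, native):
    """Euclidean distance between two aligned contours.

    Both arguments are sequences of values (for example F0 in Hz or
    intensity in dB) sampled at matching points. Equal length is an
    assumption of this sketch; it stands in for the segment alignment
    step mentioned in the abstract.
    """
    if len(learner) != len(native):
        raise ValueError("contours must be aligned to the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(learner, native)))


# Hypothetical F0 values in Hz, for illustration only.
learner_f0 = [100.0, 110.0, 120.0]
native_f0 = [100.0, 114.0, 123.0]
distance = contour_distance(learner_f0, native_f0)
```

A smaller distance means the learner contour lies closer to the native one. The same function applies to durations if each element is one segment's duration.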

The fundamental frequency (f0) distribution of Korean speakers in a dialogue corpus using Praat and R (Praat과 R로 분석한 한국인 대화 음성 말뭉치의 fundamental frequency(f0)값 분포)

  • Byunggon Yang
    • Phonetics and Speech Sciences / v.15 no.3 / pp.17-25 / 2023
  • This study examines the fundamental frequency (f0) distribution of 2,740 Korean speakers in a dialogue speech corpus. Praat and R were used to collect and analyze the acoustic f0 data after removing extreme values based on the interquartile f0 range of the intonational phrases produced by each individual speaker. Results showed that the average f0 value of all speakers was 185 Hz and the median value was 187 Hz. The f0 data showed a slight positive skewness of 0.11 and a kurtosis of -0.09, which is close to a normal distribution. The pitch values of daily conversations varied over a range of 238 Hz. Further examination of the male and female groups showed distinct median f0 values: 114 Hz for males and 199 Hz for females. A t-test between the two groups yielded a significant difference. The skewness representing the distribution shape was 1.24 for the male group and 0.58 for the female group. The kurtosis was 5.21 and 3.88 for the male and female groups, respectively, and the male group values appeared leptokurtic. A regression analysis between the median f0 and age yielded a slope of 0.15 for the male group and -0.586 for the female group, which indicated a divergent relationship. In conclusion, a normative f0 distribution of different Korean age and sex groups can be examined in a conversational speech corpus recorded from a large number of participants. However, more rigorous data might be required to define a relation between age and f0 values.
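
The statistics named in this abstract can be sketched as follows. The study used Praat and R; this Python version is only an illustration. The abstract does not state the exact outlier rule, so the 1.5 × IQR fences used here are an assumption, as are the function names and the choice of moment-based skewness and excess kurtosis.

```python
import statistics


def trim_by_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR].

    The factor k = 1.5 is an assumed convention, not a value
    reported in the abstract.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness; 0 for a symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3; 0 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical per-phrase f0 values in Hz for one speaker.
f0_values = [100.0, 110.0, 120.0, 130.0, 140.0, 500.0]
cleaned = trim_by_iqr(f0_values)  # the 500 Hz value falls outside the fences
```

Different software packages use different quantile and kurtosis definitions, so figures computed this way need not match those produced by R.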

Intrinsic Fundamental Frequency(Fo) of Vowels in the Esophageal Speech (식도음성의 고유기저주파수 발현 현상)

  • 홍기환;김성완;김현기
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.9 no.2 / pp.142-146 / 1998
  • Background: It has been established that the fundamental frequency (Fo) of vowels varies systematically as a function of vowel height. Specifically, high vowels have a higher Fo than low vowels. Two major hypotheses dominate contemporary attempts to explain the mechanisms underlying intrinsic variation in vowel Fo: the source-tract coupling hypothesis and the tongue-pull hypothesis. Objectives: Total laryngectomy necessitates removal of all structures between the hyoid bone and the tracheal rings. Therefore, the assumption that no direct interconnection exists between the tongue and the pharyngoesophageal segment that would mediate systematic variation in vowel Fo appears quite reasonable. If the tongue-pull hypothesis is correct, systematic differences in Fo between high and low vowels produced by esophageal speakers would not be expected. We analyzed the Fo of vowels in esophageal voice. Materials and methods: The subjects were 11 laryngectomized patients with fluent esophageal voice. The five essential vowels were recorded and analyzed with a computer speech analysis system (Computerized Speech Lab). Fo was measured from the acoustic waveform, both automatically and manually, and by narrow-band spectral analysis. Results: The results of this study reveal that intrinsic variation in vowel Fo is clearly evident in esophageal speech. In the automatic analysis of the acoustic waveform, the signals were too irregular to measure Fo precisely, so the data from that analysis are not reliable. Manual measurement from the acoustic waveform and narrow-band spectral analysis, however, yielded acceptable results. These results were interpreted to support neither the source-tract coupling nor the tongue-pull hypothesis and led us to offer an alternative explanation to account for intrinsic variation of Fo.

The Comparison of Fundamental Frequencies of Children with Different Hearing Level (청력수준에 따른 초등학교 아동의 기본주파수 비교)

  • Yoon Misun
    • MALSORI / no.52 / pp.49-60 / 2004
  • The purpose of this paper was to evaluate the effect of hearing level on fundamental frequency in children. The participants were sixty children divided into three groups: congenitally deafened children with cochlear implants (CI), congenitally deafened children with hearing aids (HA), and children with normal hearing (NH). Fundamental frequencies were measured during sustained phonation of the vowel /a/. There was a statistically significant difference in fundamental frequency across the groups (p<.01). In post hoc analysis, the HA and NH groups differed significantly, but the CI group did not differ significantly from either of the other two groups. In the correlation analysis between F0 and chronological age, there were significant negative tendencies in the CI and NH groups, but not in the HA group. The fundamental frequency characteristics of the CI group were more similar to those of the NH group than to those of the HA group. Therefore, the results of this study suggest that hearing level is one of the factors influencing the fundamental frequency of children.

On a Detection for the Fundamental Frequency of Speech Signals (음성신호의기본주파수 검출)

  • 배명진
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06c / pp.42-47 / 1994
  • A pitch detector is an essential component in a variety of speech processing systems. Besides providing valuable insights into the nature of the excitation source for speech production, the pitch contour of an utterance is useful for speaker recognition and for aids for the handicapped, and is required in almost all speech analysis-synthesis systems. Because of the importance of pitch detection, a wide variety of algorithms for pitch detection have been proposed in the speech processing literature. In this paper, we discuss the various types of pitch detection algorithms that have been proposed to date. We then provide performance measurements for seven pitch detection algorithms.

Filtering of a Dissonant Frequency Combined with Noise Reduction for Speech Enhancement (잡음 감소와 불협화음 제거를 통한 음성신호 향상)

  • Sangki Kang;Lee, Youn-Jeong;Lee, Ki-Yong
    • The Journal of the Acoustical Society of Korea / v.23 no.1E / pp.16-18 / 2004
  • There have been numerous studies on the enhancement of noisy speech signals. In this paper, we propose a completely new speech enhancement method: filtering of a dissonant frequency combined with a noise reduction algorithm. The simulation results indicate that the proposed method provides a significant gain in audible improvement compared with the conventional method. Therefore, if the proposed enhancement scheme is used as a pre-filter, the perceptual quality of speech is greatly enhanced.

Correlation Between the External Laryngeal Length and the Habitual Speaking Fundamental Frequency (외 후두부 길이와 발화기본주파수 간의 상관관계)

  • Nam, Do-Hyun;Rheem, Sung-Sue;Choi, Hong-Sik
    • Phonetics and Speech Sciences / v.1 no.4 / pp.187-193 / 2009
  • For this study, the external laryngeal lengths of 9 females and 9 males with normal voices were measured together with their ages, heights, and weights, and after they read aloud sentences for 3 minutes, their habitual speaking fundamental frequencies, speaking low pitches, speaking high pitches, and vocal fold closed quotients were measured. The Spearman rank correlation analysis on these data showed a significant negative correlation between the external laryngeal length and the habitual speaking fundamental frequency for both females and males, a significant negative correlation between the external laryngeal length and the speaking high pitch for only males, a significant negative correlation between the external laryngeal length and the speaking low pitch for both females and males, and a significant positive correlation between the external laryngeal length and the vocal fold closed quotient for only males.
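
The Spearman rank correlation used in this abstract can be sketched with the standard library alone: rank both variables, averaging the ranks of tied values, then take the Pearson correlation of the ranks. The function names and the sample numbers are hypothetical and do not come from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical laryngeal lengths (mm) and speaking F0 values (Hz).
lengths = [30.0, 32.0, 35.0, 38.0]
f0 = [220.0, 210.0, 190.0, 180.0]
rho = spearman_rho(lengths, f0)  # negative: longer larynx, lower F0
```

This sketch only computes the coefficient. Judging significance, as the abstract does, additionally requires a p-value, which is not computed here.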

Korean prosodic properties between read and spontaneous speech (한국어 낭독과 자유 발화의 운율적 특성)

  • Yu, Seungmi;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.14 no.2 / pp.39-54 / 2022
  • This study aims to clarify the prosodic differences between speech types by examining Korean read speech and spontaneous speech in the Korean part of the L2 Korean Speech Corpus (a speech corpus for Korean as a foreign language). To this end, articulation length, articulation speed, pause length and frequency, and the average fundamental frequency of sentences were set as variables and analyzed with statistical methods (t-test, correlation analysis, and regression analysis). The results showed that read speech and spontaneous speech differ structurally in the form of the prosodic phrases constituting each sentence and that the prosodic elements differentiating the speech types were articulation length, pause length, and pause frequency. The statistical results show that the correlation between articulation speed and articulation length was highest in read speech, indicating that the longer a given sentence is, the faster the speaker speaks. In spontaneous speech, however, the correlation between articulation length and pause frequency within a sentence was high. Overall, spontaneous speech contains more pauses because short intonation phrases are continuously built up to make a sentence, and as a result, the sentence is lengthened.

A Study of the Pitch Estimation Algorithms of Speech Signal by Using Average Magnitude Difference Function (AMDF) (AMDF 함수를 이용한 음성 신호의 피치 추정 Algorithm들에 관한 연구)

  • So, Shinae;Lee, Kang Hee;You, Kwang-Bock;Lim, Ha-Young;Park, Jisu
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.7 no.4 / pp.235-242 / 2017
  • Peak (or null) finding algorithms for the Average Magnitude Difference Function (AMDF) of a speech signal are proposed in this paper. Both the AMDF and the Autocorrelation Function (ACF) are widely used to estimate the pitch of a speech signal. It is well known that estimating the fundamental frequency (F0) of a speech signal is not only important but also very difficult. In this paper, two algorithms that exploit the characteristics of the AMDF are proposed. The first algorithm applies a threshold value to the local minima to detect a pitch period. The other algorithm estimates the pitch period of a speech signal using the relationship between the AMDF and the ACF. The data in this paper, recorded with a general commercial device, consist of Korean emotion-expression words. The recorded speech data were applied to the two proposed algorithms to test their performance.
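
A generic AMDF pitch estimator can be sketched as below. It is not the authors' algorithm. The abstract does not give the threshold rule, so this sketch assumes one: take the first local minimum that falls below a fraction of the largest AMDF value in the search range. The names `amdf` and `estimate_f0`, the 60 to 400 Hz search range, and the 0.3 threshold are all illustrative assumptions.

```python
import math


def amdf(frame, max_lag):
    """Average magnitude difference for lags 0..max_lag (inclusive)."""
    n = len(frame)
    return [
        sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag)) / (n - lag)
        for lag in range(max_lag + 1)
    ]


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0, threshold=0.3):
    """Estimate F0 in Hz from one voiced frame via AMDF minima.

    The frame must be longer than sample_rate / f0_min samples.
    """
    min_lag = max(int(sample_rate / f0_max), 1)
    max_lag = int(sample_rate / f0_min)
    d = amdf(frame, max_lag)
    limit = threshold * max(d[min_lag:])
    # Assumed rule: first local minimum under the threshold wins.
    for lag in range(min_lag, max_lag):
        if d[lag] <= limit and d[lag] <= d[lag - 1] and d[lag] <= d[lag + 1]:
            return sample_rate / lag
    # Fallback: global minimum inside the search range.
    best = min(range(min_lag, max_lag + 1), key=lambda lag: d[lag])
    return sample_rate / best


# Synthetic 200 Hz sine, 50 ms at 8 kHz, as a stand-in for voiced speech.
rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / rate) for i in range(400)]
f0 = estimate_f0(frame, rate)
```

Taking the first qualifying minimum rather than the global one is a common way to avoid picking a multiple of the true period, which yields a halved F0 estimate. Real speech frames would also need a voicing decision, which this sketch omits.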

Acoustic Characteristics of the Voices of Korean Normal Adults by Gender on MDVP (성별에 따른 한국 정상 성인 음성의 음향학적 평가 기준치)

  • Kim, Jae-Ock
    • Phonetics and Speech Sciences / v.1 no.4 / pp.147-157 / 2009
  • The purpose of the study is to develop a normal voice database and to analyze the acoustic characteristics of Korean adults' voices by gender using MDVP. Eight categories in the 34 parameters of MDVP were analyzed in the voices of 170 normal Korean adults producing the vowel /a/. Among them, the Fundamental Frequency Parameters and Frequency Perturbation Parameters differed significantly by gender. In addition, the Fundamental Frequency Parameters of our data were remarkably different from the data suggested in the MDVP program, which is currently used in clinics. Therefore, the data obtained from the current study can be effectively used for the diagnosis of voice disorders in Korean adults as the standard parameter values of MDVP.
