• Title/Summary/Keyword: Vocal pitch

Acoustic Analysis of Normal and Vocal Pathologic Voice Using Dr. Speech Science (Dr. Speech Science를 이용한 정상 및 후두질환 환자의 음향분석)

  • Lee, Hyung-Seok; Tae, Kyung; Jang, Kyung-Jin; Kim, Kyung-Woo; Kim, Kyung-Rae; Park, Chul-Won
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.8 no.2 / pp.166-172 / 1997
  • Background: A number of objective methods are now available for assessing pathologic voice change, for example aerodynamic, vibratory, and acoustic studies, neuromuscular testing, and psychoacoustic evaluation. They help to differentiate pathologic from normal conditions and to monitor pathologic and aging changes. These laboratory analyses are commonly used to monitor speech therapy and to follow a patient's recovery after surgery. Objectives: We investigated the values of jitter, shimmer, and NNE in normal subjects and patients with hoarseness in Korea. Jitter and shimmer values might be meaningful parameters for distinguishing pathologic from normal vibration and for following recovery after surgery. Materials and Methods: Normal controls and 48 subjects who underwent microlaryngeal surgery were compared for statistical significance using the Dr. Speech Science program, a computerized system for acoustic analysis of voice production, which was used to determine the vocal characteristics of pitch perturbation (jitter) and amplitude perturbation (shimmer). Results: The mean normal values of jitter and shimmer were 0.226 $\pm$ 0.110% and 2.200 $\pm$ 0.421% in males, and 0.164 $\pm$ 0.060% and 2.063 $\pm$ 0.575% in females. In patients with vocal nodules, the preoperative and postoperative values of jitter and shimmer were not meaningful. In patients with vocal polyps, the preoperative and postoperative values of jitter and shimmer were meaningful. Conclusion: The Dr. Speech Science program was effective for monitoring laryngeal disease and aging changes.
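To make the two perturbation measures in this abstract concrete, here is a minimal sketch of the widely used "local" definitions: the mean absolute difference between consecutive cycles, divided by the mean value, in percent. The abstract does not give the exact formulas used by Dr. Speech Science, so the function name `local_perturbation_percent` and the sample values are assumptions for illustration only.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean value, in percent.

    Applied to glottal periods this gives local jitter (%);
    applied to per-cycle peak amplitudes it gives local shimmer (%).
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical measurements from four consecutive glottal cycles.
periods_s = [0.0100, 0.0101, 0.0099, 0.0100]   # cycle durations in seconds
amplitudes = [0.80, 0.82, 0.79, 0.81]          # peak amplitudes (arbitrary units)

jitter_percent = local_perturbation_percent(periods_s)
shimmer_percent = local_perturbation_percent(amplitudes)
```

A perfectly periodic voice would give 0% for both measures; larger values indicate more cycle-to-cycle irregularity.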

Listener's Age Estimation by Prosody Manipulation (운율 변조 양상에 따른 청자의 연령 지각)

  • Kim, Jiyoun; Seong, Cheoljae
    • Phonetics and Speech Sciences / v.6 no.2 / pp.81-88 / 2014
  • The normal aging process affects speech production, and these changes are perceived by listeners. This study examined whether age perception by normal listeners changed under various conditions of prosodic manipulation, comparing the prosodic changes according to age and sex in adulthood. The older and younger voices were resynthesized by manipulating the speaking rate and pitch to shift the perceived age of the groups toward each other. Two-way repeated-measures ANOVAs were conducted to determine whether the prosodic type of the resynthesized cue resulted in a significant shift in the perceived age of young and old voices. Manipulation of the speaking rate resulted in a significant shift in perceived age for both the older and younger groups. A significant shift in age estimates was not observed for the younger male group when pitch was manipulated. There were significant gender-by-age group interactions for prosodic manipulation type. Age-related changes in the prosodic properties of speech may ultimately influence speech perception.

A Study on Korean and English Speaker Recognitions using the Fuzzy Theory (퍼지 이론을 이용한 한국어 및 영어 화자 인식에 관한 연구)

  • 김연숙; 김희주; 김경재
    • Journal of the Korea Society of Computer and Information / v.7 no.3 / pp.49-55 / 2002
  • This paper proposes a speaker recognition algorithm that combines the pitch parameter with fuzzy theory. It proposes a pitch detection method, based on a peak and valley pitch detection function, that compares spectra using the transform characteristics between the time and frequency domains. The method measures the similarity to the original spectrum while arbitrarily varying the period in the time domain. It weights heavily the error caused by the changing characteristics of phonemes, while remaining robust against noise. The algorithm builds reference patterns using membership functions and performs vocal tract recognition of common characters using fuzzy pattern matching, in order to account for the variation in duration of non-linear utterance time.
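The abstract does not give enough detail to reproduce its spectrum-comparison pitch detector. As a stand-in that only illustrates what "pitch detection" produces, the sketch below uses plain time-domain autocorrelation, which is a different and much simpler technique. The function name, the search range of 60 to 500 Hz, and the synthetic test frame are all assumptions.

```python
import math


def estimate_f0_autocorr(frame, fs, fmin=60.0, fmax=500.0):
    """Estimate F0 (Hz) as fs divided by the lag with the largest autocorrelation.

    This is a basic autocorrelation estimator, NOT the spectral comparison
    method described in the abstract.
    """
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(frame) - 1)
    if lag_min < 1 or lag_min > lag_max:
        raise ValueError("frame too short for the requested F0 range")
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag


# Synthetic 50 ms frame: a 200 Hz sine sampled at 8 kHz (hypothetical input).
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
f0 = estimate_f0_autocorr(frame, fs)
```

Real speech needs voicing detection and protection against octave errors, which this sketch omits.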

A Study on Korean and Japanese Speaker Recognitions using the Fuzzy Theory (퍼지 이론을 이용한 한국어 및 일어 화자 인식에 관한 연구)

  • 김연숙; 김창완
    • Journal of the Korea Society of Computer and Information / v.5 no.3 / pp.51-57 / 2000
  • This paper proposes a speaker recognition algorithm that combines pitch with fuzzy theory. It proposes a pitch detection method, based on a peak and valley pitch detection function, that compares spectra using the transform characteristics between the time and frequency domains. The method measures the similarity to the original spectrum while arbitrarily varying the period in the time domain. It weights heavily the error caused by the changing characteristics of phonemes, while remaining robust against noise. The algorithm builds reference patterns using membership functions and performs vocal tract recognition of common characters using fuzzy pattern matching, in order to account for the variation in duration of non-linear utterance time.
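The idea of reference patterns built from membership functions and matched by fuzzy pattern matching can be sketched as follows. The abstract does not specify the membership shape, the features, or the scoring rule, so the triangular membership function, the averaged score, and every number below are assumptions chosen only for illustration.

```python
def triangular_membership(x, center, width):
    """Degree (0..1) to which x belongs to a fuzzy set centered at `center`."""
    if width <= 0:
        raise ValueError("width must be positive")
    return max(0.0, 1.0 - abs(x - center) / width)


def match_score(features, reference):
    """Average membership of each feature in the speaker's reference pattern.

    `reference` is a list of (center, width) pairs, one per feature.
    """
    degrees = [triangular_membership(x, c, w) for x, (c, w) in zip(features, reference)]
    return sum(degrees) / len(degrees)


def identify(features, references):
    """Return the speaker whose reference pattern matches the features best."""
    return max(references, key=lambda name: match_score(features, references[name]))


# Hypothetical reference patterns: (mean pitch in Hz, a second made-up feature).
references = {
    "speaker_A": [(120.0, 20.0), (5.0, 2.0)],
    "speaker_B": [(210.0, 30.0), (6.0, 2.0)],
}
best = identify([125.0, 5.5], references)
```

The width of each membership function plays the role of a tolerance: a wider set accepts more variation in that feature before the match degree drops to zero.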

Assessments of Professional Voice (전문 성악인 교육 평가 방법 연구: 음향분석 컴퓨터 시스템 및 후두 회신경을 사용하여)

  • Kim, S.S.; Kim, H.G.; Hong, K.H.
    • Speech Sciences / v.4 no.2 / pp.115-139 / 1998
  • The aim of this study is to develop an assessment program for the singing voice based on physiological and acoustic methods. 22 sopranos, 6 mezzo sopranos, 4 tenors, and 4 baritones participated in the experiments. The results measured by Visi-Pitch, spectrograph, and stroboscope can be summarized as follows: (1) The maximum phonation time of singers should exceed 14 seconds on one deep inspiration. (2) The voice parts classified by vocal range using Visi-Pitch were: soprano 167 Hz to 1,190 Hz, mezzo soprano 146 Hz to 956 Hz, tenor 75 Hz to 503 Hz, and baritone 73 Hz to 385 Hz. (3) The longitudinal glottal size of singers decreases with high-low pitch variation, while the latitudinal glottal size increases with it. (4) Well-trained singers show over 5 times the vibrato rate of untrained singers, with regular pitch variation during the measured periods. Vibrato intensity does not exceed 3 dB. (5) The singer's formant indicates a professional voice and depends on the voice part: 3,207 Hz for soprano, 3,057 Hz for mezzo soprano, 2,754 Hz for tenor, and 2,560 Hz for baritone. (6) $F_1$ of the singing voice is higher than that of speech, while $F_2$ and $F_3$ of the singing voice are lower than those of speech.
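The vocal ranges in result (2) are given in Hz, which makes them hard to compare musically. A short sketch, assuming the standard equal-tempered relation of 12 semitones per doubling of frequency, converts each reported range into a span in semitones and octaves. The function name `range_in_semitones` is an illustrative choice, not something taken from the paper.

```python
import math


def range_in_semitones(f_low_hz, f_high_hz):
    """Musical interval between two frequencies, in equal-tempered semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# Ranges as reported in the abstract (Hz).
reported_ranges = {
    "soprano": (167.0, 1190.0),
    "mezzo soprano": (146.0, 956.0),
    "tenor": (75.0, 503.0),
    "baritone": (73.0, 385.0),
}

spans = {part: range_in_semitones(lo, hi) for part, (lo, hi) in reported_ranges.items()}

for part, semitones in spans.items():
    print(f"{part}: {semitones:.1f} semitones ({semitones / 12.0:.2f} octaves)")
```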

A Study on Voice Analytical the Vocal Cord and Formant Change in the Smoking and Secondhand Smoking Environments (직.간접흡연 환경에서의 성대 및 음형대 변화에 대한 음성 분석학적 연구)

  • Kim, Bong-Hyun; Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences / v.36 no.6B / pp.720-727 / 2011
  • Interest in health care and maintenance has grown as well-being has emerged as a social issue. In particular, the recognition that smoking is harmful has become much more widespread. Smoking has many adverse effects on the body's respiratory and circulatory organs, and both smoking and secondhand smoking are recognized as serious dangers to health. In this paper, we applied voice analysis techniques to compare and analyze how smoking and secondhand smoking environments affect the vocal cords and formants. For this purpose, we organized groups of smokers and nonsmokers, all men in their 20s, collected their voices before and after smoking and secondhand smoking, and analyzed the experimental results for pitch, jitter, shimmer, and 5~8 formant frequency.

Variance characteristics of speaking fundamental frequency and vocal intensity depending on utterance conditions (발화조건에 따른 기본주파수 및 음성강도 변동의 특징)

  • Lee, Moo-Kyung
    • Phonetics and Speech Sciences / v.4 no.1 / pp.111-118 / 2012
  • The purpose of this study was to characterize the variances of speaking fundamental frequency and vocal intensity depending on gender and three utterance conditions (spontaneous speech, reading, and counting). A total of 65 undergraduate students (32 male, 33 female) attending universities in Daegu, South Korea participated in this study. The subjects were all in their 20s. KayPENTAX's Visi-Pitch IV (Model 3950) was used to measure the variances of speaking fundamental frequency (SFF0) and vocal intensity (VI). The conclusions were as follows. First, neither males nor females showed a significant difference in SFF0 or vocal intensity among the three utterance conditions. Second, regarding differences in the variances of SFF0 between males and females, females showed significantly higher levels than males on four measured variances in spontaneous speech (SFF0 SD$^{**}$, SFF0 range$^{***}$, Min SFF0$^{***}$, and Max SFF0$^{***}$). However, there was no significant difference between males and females in SFF0 range on reading, or in SFF0 SD and SFF0 range on counting. There was also no significant difference between males and females in the measured variances of vocal intensity depending on utterance conditions. Finally, the variances of SFF0 and vocal intensity were compared among utterance conditions. In males, the measured variances of SFF0 decreased significantly in the order of spontaneous speech, reading, and counting (SFF0 SD: p<.001, SFF0 range: p<.05, Max SFF0: p<.05). Females, however, showed no significant difference in the measured variances of SFF0 among the three utterance conditions. Conversely, the measured variances of vocal intensity in females decreased significantly in the order of spontaneous speech, reading, and counting (VI SD: p<.001, VI range: p<.001, Min VI: p<.01, Max VI: p<.05), while males showed no significant difference in the measured variances of vocal intensity among the three utterance conditions. In sum, these findings suggest that the variances of SFF0 in males, and the variances of vocal intensity in females, are affected by the utterance condition.
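The variance measures named in this abstract (SD, range, minimum, and maximum of SFF0) are simple summary statistics over a fundamental-frequency track. A minimal sketch follows; the F0 values are invented, and the use of the sample standard deviation is an assumption, since the abstract does not say which SD formula Visi-Pitch IV applies.

```python
import statistics


def summarize_f0(f0_track_hz):
    """Summary statistics of voiced-frame F0 values (Hz) for one utterance."""
    if len(f0_track_hz) < 2:
        raise ValueError("need at least two voiced frames")
    lowest, highest = min(f0_track_hz), max(f0_track_hz)
    return {
        "mean": statistics.mean(f0_track_hz),
        "sd": statistics.stdev(f0_track_hz),   # sample SD (assumed)
        "min": lowest,
        "max": highest,
        "range": highest - lowest,
    }


# Hypothetical F0 values from the voiced frames of one utterance.
summary = summarize_f0([100.0, 110.0, 120.0, 130.0, 140.0])
```

The same function applied to per-frame intensity values in dB would give the corresponding VI measures.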

Study on Correlation between Acoustic Profiles and Fatigue (노권상과 음성 지표간의 상관성에 관한 연구)

  • Cho, Shin-Woong; Park, Young-Bae; Park, Young-Jae
    • The Journal of the Society of Korean Medicine Diagnostics / v.14 no.1 / pp.15-35 / 2010
  • Objectives: The purpose of this study is to determine the correlation between vocal indicators and the 'Buzhongyiqi-Tang questionnaire' and the 'Chalder fatigue scale'. Methods: This study examined the mean value of each factor in the 'Buzhongyiqi-Tang questionnaire' and the 'Chalder fatigue scale', and the different voice indicators, in 81 healthy adult participants, in relation to the results of an /a/ /e/ /i/ /o/ /u/ pronunciation test. Results: There was a significant correlation between the F0 indexes of the vowels /a/ /e/ /i/ /o/ /u/ and 'the Deficiency symptoms of Buzhongyiqi-Tang'. The regression analysis showed the following significant findings for each vowel: /i/ as a factor for 'the Deficiency symptoms of Buzhongyiqi-Tang'; /a/ for 'the Consumptive fever of Buzhongyiqi-Tang'; /i/ for 'the Vocal inflammation of Buzhongyiqi-Tang'; and /e/ as a factor for 'the Chalder physical fatigue'. Conclusions: The study showed a negative correlation between the fundamental frequency and the mean value of the questionnaire, which could be understood to mean that the higher the fatigue level, the greater the vocal vibration and the higher the pitch compared with the less fatigued group. We expect future studies to investigate methods of diagnosing other illnesses using vocal indicators, based on the correlation between the vocal index and illnesses described in traditional oriental medicine.

A Study on the Channel Normalized Pitch Synchronous Cepstrum for Speaker Recognition (채널에 강인한 화자 인식을 위한 채널 정규화 피치 동기 켑스트럼에 관한 연구)

  • 김유진; 정재호
    • The Journal of the Acoustical Society of Korea / v.23 no.1 / pp.61-74 / 2004
  • In this paper, a context- and speaker-dependent cepstrum extraction method and a channel normalization method that minimizes the loss of speaker characteristics in the cepstrum are proposed for a speaker recognition system that is robust over the channel. The proposed extraction method creates a cepstrum by pitch synchronous analysis using the inherent pitch of the speaker. The resulting cepstrum, called the "pitch synchronous cepstrum" (PSC), therefore represents the impulse response of the vocal tract more accurately in voiced speech. The PSC can also compensate for channel distortion, because the pitch is more robust in a channel environment than the spectrum of speech. The proposed channel normalization method, the "formant-broadened pitch synchronous CMS" (FBPSCMS), applies the formant-broadened CMS to the PSC and improves the accuracy of the intraframe processing. We compared text-independent closed-set speaker identification on 56 females and 112 males using the TIMIT and NTIMIT databases, respectively. The results show that the pitch synchronous cepstrum improves the error reduction rate by up to 7.7% in comparison with the conventional short-time cepstrum, and that the error rates of the FBPSCMS are more stable and lower than those of pole-filtered CMS.
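As background for the channel normalization discussed here, the sketch below shows conventional cepstral mean subtraction (CMS), the baseline idea that the formant-broadened and pitch-synchronous variants build on. It is not the FBPSCMS method itself. A fixed linear channel multiplies the spectrum, which becomes an additive constant in the cepstral domain, so subtracting the per-coefficient mean over an utterance removes it. The frame values and the offset are made up.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of one utterance.

    `frames` is a list of equal-length cepstral vectors (lists of floats).
    """
    if not frames:
        raise ValueError("no frames given")
    n_frames, dim = len(frames), len(frames[0])
    means = [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


# Hypothetical 2-coefficient cepstra for three frames.
clean = [[1.0, 2.0], [3.0, 6.0], [5.0, 1.0]]

# The same frames after a fixed channel, modeled as a constant cepstral offset.
channel_offset = [0.5, -1.5]
distorted = [[c + o for c, o in zip(frame, channel_offset)] for frame in clean]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_distorted = cepstral_mean_subtraction(distorted)
```

After normalization the clean and distorted versions coincide, which is why CMS helps when training and test recordings pass through different channels. The cost is that the speaker's own long-term average is removed as well, which is the loss of speaker characteristics that the abstract sets out to minimize.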

Efficient Tracking of Speech Formant Using Closed Phase WRLS-VFF-VT Algorithm

  • Lee, Kyo-Sik; Park, Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.19 no.2E / pp.8-13 / 2000
  • In this paper, we present an adaptive formant tracking algorithm for speech using the closed phase WRLS-VFF-VT method. The pitch synchronous closed phase method is known to give more accurate estimates of the vocal tract parameters than the pitch asynchronous method. However, the use of pitch synchronous closed phase analysis has been limited by the difficulty of accurately isolating the closed phase region in successive periods of speech. We have therefore implemented the pitch synchronous closed phase WRLS-VFF-VT algorithm for speech analysis, especially for formant tracking. The proposed algorithm with the variable threshold (VT) provides superior performance at phone boundaries and at voiced/unvoiced transitions. The proposed method is compared experimentally with other methods, such as the two-channel CPC method, using synthetic waveforms and real speech data. The experimental results show that block data processing techniques, such as the two-channel CPC, give reasonable estimates of the formants and antiformants. However, the data windows used by these methods include the effects of the periodic excitation pulses, which affect the accuracy of the estimated formants. In contrast, the proposed WRLS-VFF-VT method, which eliminates the influence of the pulse excitation by using input estimation as part of the algorithm, gives very accurate formant and bandwidth estimates and good spectral matching.
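The core idea of tracking a resonance with recursive least squares can be shown in a heavily simplified form. The sketch below fits a second-order autoregressive model with exponentially weighted RLS and a fixed forgetting factor, then reads one formant frequency and bandwidth from the pole position. It omits the variable forgetting factor, the variable threshold, the input estimation, and the closed phase detection of the paper, so it should not be read as the authors' algorithm. All names and parameter values are assumptions.

```python
import math


def rls_ar2(signal, forgetting=0.98, delta=1e4):
    """Estimate [a1, a2] in x[n] = a1*x[n-1] + a2*x[n-2] by exponentially weighted RLS."""
    theta = [0.0, 0.0]
    P = [[delta, 0.0], [0.0, delta]]          # inverse correlation matrix (symmetric)
    for n in range(2, len(signal)):
        phi = [signal[n - 1], signal[n - 2]]
        Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                P[1][0] * phi[0] + P[1][1] * phi[1]]
        denom = forgetting + phi[0] * Pphi[0] + phi[1] * Pphi[1]
        gain = [Pphi[0] / denom, Pphi[1] / denom]
        err = signal[n] - (theta[0] * phi[0] + theta[1] * phi[1])
        theta = [theta[0] + gain[0] * err, theta[1] + gain[1] * err]
        P = [[(P[i][j] - gain[i] * Pphi[j]) / forgetting for j in range(2)]
             for i in range(2)]
    return theta


def resonance_from_ar2(a1, a2, fs):
    """Frequency and bandwidth (Hz) of the complex pole pair of an AR(2) model."""
    if a2 >= 0:
        raise ValueError("no complex pole pair")
    radius = math.sqrt(-a2)
    cos_angle = max(-1.0, min(1.0, a1 / (2.0 * radius)))
    freq = math.acos(cos_angle) * fs / (2.0 * math.pi)
    bandwidth = -math.log(radius) * fs / math.pi
    return freq, bandwidth


# Synthetic noise-free impulse response of one resonance (hypothetical values).
fs, true_freq, true_bw = 8000.0, 700.0, 80.0
r = math.exp(-math.pi * true_bw / fs)
a1_true = 2.0 * r * math.cos(2.0 * math.pi * true_freq / fs)
a2_true = -r * r
x = [1.0, a1_true]
for n in range(2, 120):
    x.append(a1_true * x[n - 1] + a2_true * x[n - 2])

a1_est, a2_est = rls_ar2(x)
est_freq, est_bw = resonance_from_ar2(a1_est, a2_est, fs)
```

Because the synthetic signal contains no excitation after the first sample, it loosely mimics analysis within a closed glottal phase. Real speech needs a higher model order and explicit handling of the excitation pulses, which is the problem the abstract addresses.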
