• Title/Summary/Keyword: Vocal pitch

The Effect of Voice Therapy for the Treatment of Functional Aphonia: A Preliminary Study (기능적 실성증에 대한 음성치료의 효과 분석: 기초 연구)

  • Kim, No Eul;Kim, Jun Seok;Oh, Jae Hwan;Kim, Dong Young;Woo, Joo Hyun
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.32 no.2 / pp.75-80 / 2021
  • Background and Objectives: Functional aphonia refers to a condition in which the patient presents with a whispering voice or produces a very high-pitched, tense voice. Voice therapy is the most effective treatment, but there is no consensus on how it should be applied. The purpose of this study was to examine the vocal characteristics of functional aphonia and the effect of voice therapy applied accordingly. Materials and Method: From October 2019 to December 2020, 11 patients with functional aphonia were treated with voice therapy that proceeded in three stages: vocal hygiene, trial therapy, and behavioral therapy. Of these, 7 patients who completed the voice evaluation before and after voice therapy were enrolled in this study. Clinical information, including sex, age, symptoms, duration, social and medical history, the course of voice therapy, and subjective and objective findings, was analyzed by retrospective chart review. Voice parameters before and after voice therapy were compared. Results: On the GRBAS scale, grade, roughness, and asthenia improved significantly after voice therapy, as did overall severity, roughness, pitch, and loudness on the Consensus Auditory-Perceptual Evaluation of Voice. On the Voice Handicap Index, the total score and all sub-category scores decreased significantly. In objective voice analysis, jitter, cepstral peak prominence, and maximum phonation time improved significantly. Conclusion: Voice therapy was effective for treating functional aphonia, restoring patients' vocalization and improving voice quality, pitch, and loudness.

A Study on SNR Estimation of Continuous Speech Signal (연속음성신호의 SNR 추정기법에 관한 연구)

  • Song, Young-Hwan;Park, Hyung-Woo;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.4 / pp.383-391 / 2009
  • In speech signal processing, a speech signal corrupted by noise should be enhanced to improve its quality. Noise estimation methods usually need to be flexible enough to follow a changing environment. The noise profile is updated in silent regions to avoid the influence of speech properties, so a preprocessing step has to locate the voice regions before noise estimation. However, if the received signal has no silent region, that method cannot be applied. In this paper, we propose an SNR estimation method for continuous speech signals. In the stationary region of voiced speech, the waveform is highly correlated from one pitch period to the next, so the SNR can be estimated from the correlation between neighboring waveforms after dividing a frame into pitch periods. For unvoiced speech, the vocal tract characteristic is reflected in the noise, so the SNR can be estimated from the spectral distance between the spectrum of the received signal and the estimated vocal tract. Lastly, the energy of a speech signal is mostly concentrated in the voiced regions, so the SNR can be estimated from the ratio of voiced-region energy to unvoiced-region energy.
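
The voiced-speech idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. It assumes the pitch period is already known, that two adjacent periods contain the same speech waveform, and that the noise is zero-mean, independent between the two periods, and of equal power in both. Under those assumptions the normalized correlation between the periods is roughly rho = Ps / (Ps + Pn), which gives SNR = rho / (1 - rho). The function names and the synthetic test signal are hypothetical.

```python
import math
import random


def snr_from_adjacent_periods(frame, period):
    """Estimate SNR in dB from the correlation of two adjacent pitch periods.

    Assumption (not from the paper): frame[:period] and frame[period:2*period]
    hold the same speech waveform plus independent zero-mean noise.
    """
    first = frame[:period]
    second = frame[period:2 * period]
    if period <= 0 or len(second) < period:
        raise ValueError("frame must contain at least two pitch periods")
    cross = sum(a * b for a, b in zip(first, second))
    norm = math.sqrt(sum(a * a for a in first) * sum(b * b for b in second))
    if norm == 0.0:
        raise ValueError("frame is silent")
    rho = cross / norm
    # Clamp so the logarithm stays finite for noiseless or uncorrelated input.
    rho = min(max(rho, 1e-6), 1.0 - 1e-6)
    return 10.0 * math.log10(rho / (1.0 - rho))


def synthetic_voiced_frame(period, snr_db, seed=0):
    """Two periods of a unit sine plus white Gaussian noise (illustration only)."""
    rng = random.Random(seed)
    clean = [math.sin(2.0 * math.pi * k / period) for k in range(period)]
    sigma = math.sqrt(0.5 / 10.0 ** (snr_db / 10.0))  # a unit sine has power 0.5
    return [s + rng.gauss(0.0, sigma) for s in clean + clean]
```

Calling `snr_from_adjacent_periods(synthetic_voiced_frame(8000, 10.0), 8000)` should return a value near 10 dB. Real pitch periods are far shorter than 8000 samples, so a practical estimator would average the correlation over many period pairs.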

Comparison of Korean Speech De-identification Performance of Speech De-identification Model and Broadcast Voice Modulation (음성 비식별화 모델과 방송 음성 변조의 한국어 음성 비식별화 성능 비교)

  • Seung Min Kim;Dae Eol Park;Dae Seon Choi
    • Smart Media Journal / v.12 no.2 / pp.56-65 / 2023
  • In broadcasts such as news and coverage programs, voices are modulated to protect the identity of informants. Pitch adjustment is a commonly used voice modulation method, but the original voice can easily be restored by adjusting the pitch back. Because broadcast voice modulation methods therefore cannot properly protect the identity of the speaker and are weak in security, a new voice modulation method is needed to replace them. In this paper, using the Lightweight speech de-identification model as the evaluation target, we compare its speech de-identification performance with that of the broadcast voice modulation method based on pitch modulation. Using three of the six modulation methods in the Lightweight speech de-identification model (McAdams, Resampling, and Vocal Tract Length Normalization (VTLN)), we evaluated de-identification performance on Korean speech with a human test and an Equal Error Rate (EER) test and compared it with broadcast voice modulation. The experimental results show that the VTLN modulation method achieved higher de-identification performance in both the human test and the EER test. The modulation methods of the Lightweight model therefore have sufficient de-identification performance for Korean speech and can replace the security-weak broadcast voice modulation.
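
As a rough illustration of the EER test mentioned in this abstract, the sketch below computes an equal error rate from speaker-verification scores. It is not the authors' evaluation code. It assumes that higher scores mean "more likely the same speaker", and the function name and score lists are hypothetical. In de-identification evaluations of this kind, a higher EER against modulated speech is usually read as stronger protection, because the verifier confuses speakers more often.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) where false-accept and false-reject rates are closest.

    genuine:  scores for trials where both utterances come from the same speaker
    impostor: scores for trials with different speakers
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best = None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False accept: an impostor trial scores at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # False reject: a genuine trial scores below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, threshold)
    return best[1], best[2]
```

The sweep only visits observed score values, which is enough for small trial lists. Larger evaluations typically interpolate between thresholds.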

Voice Analysis of Highest Falsetto and Lowest Modal Voice (가성구와 흉성구의 객관적인 음성분석)

  • 진성민;송윤경;권기환;이경철;반재호
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.13 no.2 / pp.151-154 / 2002
  • Background and Objectives: The pitch range of the human voice is variable, extending from the chest register to the falsetto register. Although numerous studies have described the laryngeal mechanism of the falsetto tone, systematic and objective studies have been lacking. The purpose of this study was to systematically analyze and compare modal and falsetto voice. Materials and Methods: Seven adult baritones were selected from a larger population of choir volunteers. Acoustic, electroglottographic, and aerodynamic measurements were made simultaneously during sustained /e/ in two vocal registers, lowest modal and highest falsetto. Statistical analysis was performed using the Wilcoxon signed-rank test. Results: In the acoustic analysis, shimmer was increased in falsetto voice (p<0.05). In the electroglottographic analysis, the closed quotient (CQ) and speed quotient (SQ) were higher in modal voice than in falsetto voice (p<0.05). In the aerodynamic analysis, the airflow rate (MFR) of falsetto voice was higher than that of modal voice (p<0.05). Conclusions: The results indicate that the falsetto register is ineffective, inefficient, and generally unpleasant because it is produced with incomplete closure of the true vocal cords. We anticipate that further study with larger samples can provide objective criteria for the status and classification of singers' modal and falsetto voices.

Voice quality of normal elderly people after a 3oz water-swallow test: An acoustic analysis (3온스 물 삼킴검사 이후 정상 노년층의 음질 변화: 음향학적 분석)

  • Lee, Sol Hee;Choi, Hong-Shik;Choi, Seong-Hee;Kim, HyangHee
    • Phonetics and Speech Sciences / v.10 no.2 / pp.69-76 / 2018
  • The elderly are at increased risk of developing dysphagia due to aging and illness. The aim of the current study was to analyze acoustically the change in the voice quality of normal elderly people after a 3oz water-swallow test. Subjects included a group of 60 normal elderly people (age: mean ± SD = 76.9 ± 6.66) and 60 healthy young adults (age: mean ± SD = 25.1 ± 2.36). Every participant produced a five-second /a/ phonation before and after swallowing, and two two-second sections of each phonation were analyzed using the Multi-Dimensional Voice Program (MDVP). The elderly group showed a post-swallowing increase in acoustic parameters related to fundamental frequency, fundamental frequency variation, amplitude variation, and noise in both two-second sections. The younger group, however, showed an increase only in frequency-related acoustic parameters (i.e., STD) in the first two-second section. The significant changes in the post-swallowing parameters might indicate temporary irregularities in pitch and amplitude along with higher amounts of noise in the voice. The results could be attributed to water residue in the vocal folds and vocal tract, as well as to a deterioration of motor and sensory functions caused by the anatomical and physiological changes that result from aging.

A Comparative Study on the Public Speech Spectrum between ROK and USA Politicians (한국과 미국 정치인 대중연설 음성의 스펙트럼 비교 연구)

  • Chung, Eun-Ee;Lee, Sang-Ho
    • Journal of Digital Contents Society / v.17 no.3 / pp.143-155 / 2016
  • In this study, we focused on the importance of politicians' voices in delivering a message. Different properties of a voice may play different roles in delivering a message and may affect the recipients' responsiveness, understanding, and so on. For this reason, an analytical study of voices across a variety of messages is a meaningful attempt. We took an interest in politicians' voices because the voice should be very important to politicians, who frequently deliver messages to the nation and to others through speech. This study aimed to investigate the voices of politicians who represent their nation. We selected politicians representing the ROK (Republic of Korea; South Korea) and the USA (United States of America), chose representative speeches to the nation, made a comparative analysis of their voices in those speeches, and drew implications. We analyzed a total of eight voices, four ROK politicians and four USA politicians, male and female, to characterize them and to suggest guidelines for a voice with clearer message delivery. We analyzed the politicians' voices on the basis of vocal properties such as vocal pitch, accuracy of pronunciation, resonance, and intonation variation, and found that the ROK politicians were somewhat poorer at using their voices than the US politicians. In particular, they were remarkably poorer at accurate pronunciation, which has a significant impact on message delivery.

Study of Developing SOP for Extracting Stable Vocal Features for Accurate Diagnosis (음성의 안정적 변수 추출을 위한 SOP 개발 연구)

  • Kim, Keun-Ho;Jang, Jun-Su;Kim, Young-Su;Kim, Jong-Yeol
    • Journal of Physiology & Pathology in Korean Medicine / v.25 no.6 / pp.1108-1112 / 2011
  • In traditional Korean medicine and Western medicine, voice can be widely used to classify the four constitution types and to recognize a person's health condition by extracting meaningful features as physical quantities. In this paper, we propose a method for updating the standard operating procedure (SOP) for acquiring and recording voices so that stable vocal features can be extracted, since these features are sensitive to variation in a subject's utterance. First, as features, we obtained pitch frequencies from vowels and from the sentence, and intensity from the sentence, using voices acquired under several utterance conditions. We then computed the deviation ratios of the features from their median values for each utterance condition and selected the condition that minimized the ratio as the new SOP. As a result, in consideration of the deviation and of qualitative requirements, we decided on an SOP in which a subject utters vowels with a length of 2s~1s and sentences with an interval of over 2s between them, after practice. Stable voice features obtained with the updated SOP lead to accurate diagnosis, and the SOP will be developed further and simplified for use in the u-Healthcare system of personalized medicine.
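
The selection step described in this abstract (compute each condition's deviation from the median, then keep the condition with the smallest ratio) might look like the sketch below. The abstract does not give the exact formula, so the definition used here, the mean absolute deviation from the median divided by the median, is an assumption, and the function names and condition labels are hypothetical.

```python
from statistics import mean, median


def deviation_ratio(values):
    """Mean absolute deviation from the median, relative to the median.

    This definition is an assumption made for illustration only.
    """
    if not values:
        raise ValueError("values must be non-empty")
    centre = median(values)
    if centre == 0:
        raise ValueError("median must be non-zero")
    return mean(abs(v - centre) / abs(centre) for v in values)


def most_stable_condition(measurements):
    """Pick the utterance condition whose feature values deviate least.

    measurements maps a condition label to repeated measurements of one
    feature (for example, pitch frequency in Hz) under that condition.
    """
    return min(measurements, key=lambda label: deviation_ratio(measurements[label]))
```

For example, `most_stable_condition({"no practice": [110, 130, 120, 145], "after practice": [118, 121, 120, 122]})` selects `"after practice"`, since its repeated measurements cluster more tightly around their median.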

Acoustic Characteristics on the Adolescent Period Aged from 16 to 18 Years (16~18세 청소년기 음성의 음향음성학적 특성)

  • Ko, Hye-Ju;Kang, Min-Jae;Kwon, Hyuk-Jae;Choi, Yaelin;Lee, Mi-Geum;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.5 no.1 / pp.81-90 / 2013
  • During adolescence, the mutational period is characterized by changes in the laryngeal structure, the length of the vocal cords, and the tone of voice. Adolescents usually reach an adult voice at 15 or 16, but the mutational period is sometimes delayed. Studies on the voices of adolescents aged 16 to 18, right after the mutational period, are therefore required. Accordingly, this paper attempted to provide basic normative data for patients with voice disorders in this period by evaluating the vocal characteristics of males and females aged 16 to 18 with an objective device and by comparing and analyzing them by sex and age. The study was conducted on a total of 60 subjects, 10 for each sex at each age. The vocal analysis consisted of MPT (Maximum Phonation Time) measurement, sustained vowels, and sentence reading. For sustained /a/ vowels, fundamental frequency ($F_0$), jitter, shimmer, and noise-to-harmonic ratio (NHR) were measured using the Multi-Dimensional Voice Program (MDVP) of the Multi-Speech program of the Computerized Speech Lab (Kay Elemetrics). For sentence reading, mean $F_0$, maximum $F_0$, and minimum $F_0$ were measured using the Real-Time Pitch (RTP) Model 5121 of the Multi-Speech program of the Computerized Speech Lab (Kay Elemetrics). As a result, there were statistically significant differences by sex in $F_0$, jitter, shimmer, mean $F_0$, maximum $F_0$, and minimum $F_0$, and statistically significant differences by age in MPT. In conclusion, the voices of adolescents aged 16 to 18 had reached adult maturity levels, but voice quality, which can be considered on the scale of voice disorders, still showed a transition to the adult voice during the mutational period.
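
Jitter and shimmer, which appear in several of these abstracts, are cycle-to-cycle perturbation measures. The sketch below shows one common textbook definition: the mean absolute difference between consecutive values, relative to the mean value. It is not the MDVP implementation, and it assumes the pitch periods and peak amplitudes have already been extracted. The function names are hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive values, as % of the mean.

    Applied to pitch periods this gives a local jitter in percent.
    Applied to per-cycle peak amplitudes it gives a local shimmer in percent.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def absolute_jitter_us(periods_s):
    """Mean absolute difference between consecutive periods, in microseconds."""
    if len(periods_s) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(periods_s, periods_s[1:])]
    return 1e6 * sum(diffs) / len(diffs)
```

With periods alternating between 10.0 ms and 10.2 ms, every consecutive difference is 0.2 ms, so the absolute jitter is about 200 µs and the local jitter is a little under 2%.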

The Acoustic Severity Index in the Pathologic Voice (음성장애에 대한 음향학적 중등도 지표)

  • Hong, Ki-Hwan;Kim, Hyun-Ki;Yang, Yoon-Soo
    • Speech Sciences / v.10 no.4 / pp.201-219 / 2003
  • Background: Perceptual assessment is generally performed by a voice specialist, whereas objective evaluation is performed in a voice laboratory. Research in voice laboratories has generated a variety of objective tests and parameters. Perceptual evaluation is one of the most controversial topics in voice research. A review of the literature reveals a wide variety of rating scales, with reliability data fluctuating from study to study. Unfortunately, there is no widely accepted valid method for classifying voice disorders and assessing outcome after voice treatment. Objectives: The goals of this research were to identify important objective acoustic parameters of vocal quality and to establish an objective and quantitative correlate of perceived vocal quality. Materials and Methods: We evaluated voice analysis data from 122 dysphonic patients and 20 normal volunteers. A Computerized Speech Lab (CSL) 4300B was used to analyze each voice sample. Results: Three dysphonia severity indices (DSI) were created using discriminant analysis. The DSI is based on a weighted combination of the following selected set of acoustic parameters: absolute jitter (Jita in µs), smoothed pitch period perturbation quotient (sPPQ in %), amplitude perturbation quotient (APQ in %), soft phonation index (SPI), average fundamental frequency (Fo in Hz), lowest fundamental frequency (Flo in Hz), and smoothed amplitude perturbation quotient (sAPQ in %). The DSI, being the discriminating rule calculated by logistic regression, consists of three equations based on statistically significant acoustic parameters. The three DSI were created to best reflect the degree of hoarseness as expressed by G on the GRBAS scale. The more positive the DSI is for a patient, the worse the vocal quality; the more negative it is, the better. The effect of sex is included implicitly in DSI-1 and DSI-2, so separate DSI-1 and DSI-2 for males and females need not be used. The DSI is objective because no perceptual input is required for its calculation. Conclusion: This research demonstrates that the voice function values calculated from three different multivariate objective dysphonia severity indices are significantly associated with subjective voice assessments. These multivariate objective dysphonia severity indices may be appropriate for use in clinical trials and in outcomes research on treatment effectiveness for voice disorders.

New Parameter on Speech and EGG; Glottal Closure Delay Ratio (음성신호와 전기성문파를 이용하는 새로운 매개변수 ; 성대 폐쇄 지연비율(Glottal Closure Delay Ratio))

  • Choi, Jong-Min;Kwon, Tack-Kyun;Jung, Eun-Jung;Lee, Myung-Chul;Kim, Kwang-Hyun;Sung, Myung-Whun;Park, Kwang-Suk
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.1 / pp.22-25 / 2007
  • Background and Objectives: Biomedical signals such as speech, electroglottograph (EGG), and airflow have commonly been used for the diagnosis of laryngeal function. In most cases, however, these signals have been analyzed separately. Here, we propose a new interchannel parameter, the Glottal Closure Delay Ratio (GCDR), which is estimated from speech and EGG measured simultaneously. Materials and Method: Speech and EGG signals were recorded simultaneously from 13 normal subjects and 39 patients. The patients' data included 16 cases of polyps and 23 cases of vocal fold palsy. The time difference between the glottal closing instant on the EGG and the first maximum peak of the speech signal within a pitch period was calculated. The glottal closing instant was defined as the maximum peak of the first derivative of the EGG signal (dEGG). Results: The standard deviation and jitter were calculated using 20-30 GCDRs extracted from each recording, and they were significantly different between the normal group and the vocal fold paralysis group. Conclusion: The GCDR may be the first index reflecting both speech and EGG characteristics, and the perturbation of this parameter was significantly different between the normal group and the vocal fold paralysis group.
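
The measurement described in this abstract can be sketched for a single pitch period as follows. This is an illustration under stated assumptions, not the authors' procedure. It assumes the EGG is oriented so that increasing vocal fold contact raises the signal, takes the largest speech sample at or after the closing instant as the "first maximum peak", and normalizes the delay by the period length to obtain a ratio. The abstract does not spell out that normalization, and the function name is hypothetical.

```python
def glottal_closure_delay(egg, speech, fs):
    """Return (delay_seconds, delay_ratio) for one pitch period.

    egg, speech: equal-length sample lists covering the same pitch period
    fs:          sampling rate in Hz
    """
    if len(egg) != len(speech) or len(egg) < 2:
        raise ValueError("egg and speech must be equal-length, at least 2 samples")
    # First difference of the EGG approximates dEGG; its maximum marks the
    # glottal closing instant under the polarity assumption above.
    degg = [b - a for a, b in zip(egg, egg[1:])]
    closing = max(range(len(degg)), key=degg.__getitem__)
    # Largest speech sample at or after the closing instant (assumed peak rule).
    peak = max(range(closing, len(speech)), key=speech.__getitem__)
    delay_samples = peak - closing
    return delay_samples / fs, delay_samples / len(egg)
```

Collecting the ratio over 20 to 30 consecutive periods and then taking the standard deviation would give a perturbation measure of the kind reported in the results.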
