• Title/Abstract/Keywords: voice frequency

Laryngeal Cancer Screening using Cepstral Parameters

  • 이원범;전경명;권순복;전계록;김수미;김형순;양병곤;조철우;왕수건
    • 대한후두음성언어의학회지 / Vol. 14, No. 2 / pp. 110-116 / 2003
  • Background and Objectives: Laryngeal cancer discrimination using voice signals is a non-invasive method that allows rapid, simple examination without discomfort to the patient. If appropriate analysis parameters and classifiers are developed, this method can be used effectively in various applications, including telemedicine. This study examines voice analysis parameters used for laryngeal disease discrimination, with the aim of helping to discriminate laryngeal diseases by voice signal analysis. The study also estimates the laryngeal cancer discrimination performance of a Gaussian mixture model (GMM) classifier based on statistical modelling of voice analysis parameters. Materials and Methods: The Multi-Dimensional Voice Program (MDVP) parameters, which have been widely used for the analysis of laryngeal cancer voice, sometimes fail to analyze the voice of a laryngeal cancer patient whose periodicity is seriously damaged. Accordingly, a new method is needed that enables reliable analysis of voice signals that cannot be analyzed by the MDVP. For the laryngeal cancer discrimination experiments, the authors used three types of voices collected at the Department of Otorhinolaryngology, Pusan National University Hospital: 50 voices of normal males, 50 voices of males with benign laryngeal diseases, and 105 voices of males with laryngeal cancer. In addition, the experiment included 11 voices of males with laryngeal cancer that could not be analyzed by the MDVP. Only the monosyllabic vowel /a/ was used as voice data. Since there were only 11 such voices, they were used only for discrimination. This study examined linear predictive cepstral coefficients (LPCC) and mel-frequency cepstral coefficients (MFCC), the two major cepstrum analysis methods in the area of acoustic recognition.
Results: The results showed that the mel frequency scaling process was effective in acoustic recognition but not useful for laryngeal cancer discrimination. Accordingly, linear frequency cepstral coefficients (LFCC), which omit the mel frequency scaling from the MFCC, were introduced. The LFCC showed better discrimination performance than the MFCC for laryngeal cancer. Conclusion: The parameters applied in this study could accurately discriminate even terminal laryngeal cancer in which periodicity is disturbed. Future studies on various classification algorithms and on parameters representing the pathophysiology of the vocal cords are expected to make it possible to discriminate benign laryngeal diseases as well as laryngeal cancer.
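The difference between the MFCC and LFCC front ends described above comes down to how the filter-bank center frequencies are spaced. The following is a minimal sketch of that one step, not the authors' implementation: the function names, the number of filters (8), and the 4000 Hz upper limit are illustrative assumptions, and the Hz-to-mel conversion uses one commonly cited formula.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mels using a commonly cited formula (an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def center_frequencies(n_filters, f_max_hz, scale):
    """Return filter center frequencies (Hz) between 0 and f_max_hz.

    scale="mel"    spaces the centers evenly on the mel axis (MFCC-style).
    scale="linear" spaces the centers evenly in Hz (LFCC-style).
    """
    if scale == "linear":
        step = f_max_hz / (n_filters + 1)
        return [step * (i + 1) for i in range(n_filters)]
    if scale == "mel":
        step = hz_to_mel(f_max_hz) / (n_filters + 1)
        return [mel_to_hz(step * (i + 1)) for i in range(n_filters)]
    raise ValueError("scale must be 'mel' or 'linear'")


if __name__ == "__main__":
    # Hypothetical settings chosen only for illustration.
    for scale in ("mel", "linear"):
        centers = center_frequencies(8, 4000.0, scale)
        print(scale, [round(f) for f in centers])
```

With mel spacing the filters crowd into the low frequencies and thin out toward the top of the band, while linear spacing gives every part of the band equal resolution. That is the only change the abstract attributes to the LFCC.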

The Acoustic Severity Index in the Pathologic Voice

  • 홍기환;김현기;양윤수
    • 음성과학 / Vol. 10, No. 4 / pp. 201-219 / 2003
  • Background: Perceptual assessment is generally performed by the voice specialist, whereas objective evaluation is performed in a voice laboratory. Research in voice laboratories has generated a variety of objective tests and parameters. Perceptual evaluation is one of the most controversial topics in voice research. A review of the literature reveals a wide variety of rating scales, with reliability data fluctuating from study to study. Unfortunately, there is no widely accepted valid method for classifying voice disorders and assessing outcome after voice treatment. Objectives: The goals of this research were to identify important objective acoustic parameters of vocal quality and to establish an objective, quantitative correlate of perceived vocal quality. Materials and Methods: We evaluated voice analysis data from 122 dysphonic patients and 20 normal volunteers. A Computerized Speech Lab (CSL) 4300B was used to analyze each voice sample. Results: Three dysphonia severity indices (DSI) were created using discriminant analysis. The DSI is based on a weighted combination of the following selected set of acoustic parameters: absolute jitter (Jita in μs), smoothed pitch period perturbation quotient (sPPQ in %), amplitude perturbation quotient (APQ in %), soft phonation index (SPI), average fundamental frequency (Fo in Hz), lowest fundamental frequency (Flo in Hz), and smoothed amplitude perturbation quotient (sAPQ in %). The DSI, being the discriminating rule calculated by logistic regression, consists of three equations based on statistically significant acoustic parameters. The three DSI were created to best reflect the degree of hoarseness as expressed by G from the GRBAS scale. The more positive the DSI is for a patient, the worse the vocal quality; the more negative it is, the better the vocal quality. The effect of sex is included implicitly in DSI-1 and DSI-2, so separate versions of DSI-1 and DSI-2 for males and females are not needed.
The DSI is objective because no perceptual input is required for its calculation. Conclusion: This research demonstrates that the voice function values calculated from three different multivariate objective dysphonia severity indices are significantly associated with subjective voice assessments. These multivariate objective dysphonia severity indices may be appropriate for use in clinical trials and outcomes research on treatment effectiveness for voice disorders.

Voice quality of normal elderly people after a 3oz water-swallow test: An acoustic analysis

  • 이솔희;최홍식;최성희;김향희
    • 말소리와 음성과학 / Vol. 10, No. 2 / pp. 69-76 / 2018
  • The elderly are at increased risk of developing dysphagia due to aging and illness. The aim of the current study was to analyze acoustically the change in the voice quality of normal elderly people after a 3oz water-swallow test. Subjects included a group of 60 normal elderly people (age: $\text{mean} \pm \text{SD} = 76.9 \pm 6.66$) and 60 healthy young adults (age: $\text{mean} \pm \text{SD} = 25.1 \pm 2.36$). Every participant produced a five-second /a/ phonation before and after swallowing, and two two-second sections extracted from each phonation were analyzed using the MDVP (Multi-Dimensional Voice Program). The elderly group demonstrated a post-swallowing increase in the following acoustic parameters in both two-second sections: fundamental frequency, fundamental frequency variation, amplitude variation, and noise. However, the younger group showed an increase only in frequency-related acoustic parameters (i.e., STD) in the first two-second section. The significant changes in the post-swallowing parameter values might indicate temporary irregularities in pitch and amplitude, along with higher amounts of noise in the voice. The results could be attributed to water residue on the vocal folds and in the vocal tract, as well as to a deterioration of motor and sensory functions caused by the anatomical and physiological changes that result from aging.

Shimmer Change According to Fundamental Frequency Variation of Korean Normal Adults

  • Pyo, Hwa-Young;Sim, Hyun-Sub
    • 음성과학 / Vol. 10, No. 1 / pp. 143-152 / 2003
  • The present study was performed to investigate precisely how shimmer changes with $F_{0}$ variation, and to offer suggestions for clinical application. The analysis was based on the fundamental frequency ($F_{0}$) and shimmer measurements from the previous study of 120 Korean normal adults' voices by Pyo et al. (2002), which used three vowels: /i/, /a/, and /u/. Through the analysis of 60 female samples from the previous study, we found that $F_{0}$ was highest in /u/ and lowest in /a/, whereas shimmer was highest in /a/ and lowest in /u/. Thirty of the 60 subjects showed such an inverse relationship between $F_{0}$ and shimmer overall. In the vowel /a/, 47 of 60 subjects showed increased $F_{0}$ and decreased shimmer; in /i/, 32 subjects and in /u/, 33 subjects showed the same result. A decrease in shimmer means an improvement in voice quality, so these results may help explain why patients with spasmodic dysphonia can improve their voice quality by producing a higher-pitched voice.

The Analysis of Electroglottographic Measures of Vowel and Sentence in Korean Healthy Adults

  • 김재옥
    • 말소리와 음성과학 / Vol. 2, No. 4 / pp. 223-228 / 2010
  • This study investigated the closed quotient and other voice quality parameters using electroglottography (EGG) while healthy Korean adults sustained the vowel /a/ and read a sentence at a comfortable pitch and loudness. Seventy-two healthy adults (36 men, 36 women) aged 20 to 40 years were included in the study. The tasks were recorded and analyzed using Lx Speech Studio. In the vowel sustaining task, closed quotient (Qx), fundamental frequency (Fx), sound pressure level (SPL), Jitter, and Shimmer were measured. In the sentence reading task, closed quotient (DQx), fundamental frequency (DFx), and sound pressure level (DAx) were measured. Sex effects were observed on Qx, Fx, Shimmer, DQx, and DFx. Men had significantly higher Qx and DQx than women, but significantly lower Shimmer. However, there was no sex effect on Jitter. The task effects on Qx and SPL as well as on DQx and DAx were also assessed. Qx and SPL were significantly higher than DQx and DAx in both sexes. This study showed that the closed quotients in both the vowel sustaining and sentence reading tasks were significantly related to other voice quality parameters. Therefore, clinicians and researchers should report voice quality parameters such as fundamental frequency, sound pressure level, Jitter, and Shimmer when reporting closed quotients measured with EGG.
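The closed quotient is the fraction of each glottal cycle during which the vocal folds are in contact. The abstract does not say how Lx Speech Studio computes it, so the sketch below uses a simple threshold criterion as an assumption: a sample counts as "closed" when the EGG amplitude exceeds a fixed fraction of the cycle's peak-to-peak range. The function name, the 0.35 threshold, and the convention that larger EGG values mean more contact are all hypothetical.

```python
def closed_quotient(cycle, threshold_ratio=0.35):
    """Estimate the closed quotient of ONE EGG cycle with a threshold criterion.

    cycle:           EGG samples covering exactly one glottal period; larger
                     values are assumed to mean more vocal fold contact.
    threshold_ratio: fraction of the peak-to-peak amplitude used as the
                     contact threshold (0.35 is an illustrative choice).
    Returns the fraction of samples in the cycle that lie above the threshold.
    """
    lowest, highest = min(cycle), max(cycle)
    if highest == lowest:
        raise ValueError("cycle has no amplitude variation")
    level = lowest + threshold_ratio * (highest - lowest)
    closed_samples = sum(1 for sample in cycle if sample > level)
    return closed_samples / len(cycle)


if __name__ == "__main__":
    # Synthetic cycle: 6 open-phase samples followed by 4 closed-phase samples.
    synthetic_cycle = [0.0] * 6 + [1.0] * 4
    print(closed_quotient(synthetic_cycle))
```

Because the result depends on the chosen threshold, closed quotients from different tools or settings are not directly comparable.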

Voice therapy for pitch problems following thyroidectomy without laryngeal nerve injury

  • 김지성;김미진
    • 말소리와 음성과학 / Vol. 15, No. 3 / pp. 53-58 / 2023
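  • Among patients with normal vocal fold movement after thyroidectomy, 29.7% report subjective voice problems, which can lead to a decline in communication-related quality of life. The purpose of this study was to examine the effect of a voice therapy method, designed by combining neck exercises with semi-occluded vocal tract exercises, on pitch problems of the voice after thyroidectomy without neurological injury. To this end, voice therapy by random assignment was administered once, two weeks after thyroid surgery, to 10 women who showed pitch problems after thyroidectomy. To compare the voice before surgery, after surgery, and immediately after voice therapy, an acoustic analysis was performed [fundamental frequency, jitter, shimmer, noise-to-harmonics ratio, min Voice Range Profile (VRP), max VRP, VRP]. The results showed that max VRP and VRP, which had decreased significantly after surgery compared with before surgery, increased significantly immediately after voice therapy. These results suggest that the voice therapy method of this study is effective in improving the reduction in high-pitch-range frequency that is a major symptom of voice problems after thyroid surgery. Further research is needed on whether the treatment effect is maintained over the long term.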

Characteristics of Connected Speech in ADSD

  • 황연신;김재옥;최홍식
    • 말소리와 음성과학 / Vol. 1, No. 1 / pp. 93-98 / 2009
  • The aim of this study was to investigate the voice characteristics of adductor spasmodic dysphonia (ADSD) through electroglottographic and acoustic measurements at the sentence level. The clinical records of 86 female ADSD patients (aged 20 to 50 years) and the control records of 86 normal females (aged 20 to 40 years) were recorded with Speech Studio (Laryngograph Ltd., UK). An independent t-test was used to compare the ADSD and normal groups. The results were as follows. (1) Fundamental frequency ($F_0$) was significantly decreased in the ADSD group compared with the normal group. (2) Irregularity of frequency and closed quotient (CQ) were significantly increased in the ADSD group compared with the normal group. (3) Voiceless duration was increased and voiced duration was significantly decreased in the ADSD group compared with the normal group. (4) Fricative duration was increased in the ADSD group compared with the normal group, but the difference was not significant. In conclusion, a strained, tight, and choked voice shows an increase in CQ, a tremulous voice shows an increase in irregularity of frequency, and a less feminine voice shows a decrease in $F_0$. The increase in voiceless and fricative duration and the decrease in voiced duration are related to reduced speech intelligibility.
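The group comparison above relies on an independent two-sample t-test. The sketch below shows how the test statistic is formed, under the assumption of equal group variances (the classic pooled-variance Student test). The abstract does not say which variant the authors used. The function name and the numbers in the usage example are made up for illustration and are not data from the study.

```python
import math


def independent_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    # Unbiased sample variances (divide by n - 1).
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (n_b - 1)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error, dof


if __name__ == "__main__":
    # Made-up values standing in for a measure such as F0 in two groups.
    group_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    group_2 = [2.0, 3.0, 4.0, 5.0, 6.0]
    t_value, dof = independent_t(group_1, group_2)
    print(t_value, dof)
```

The p-value is then read from the t distribution with the returned degrees of freedom. A statistics package would normally handle that step.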

Jitter and Shimmer Measurements of Dysphonia among the Different Voice Analysis Programs

  • 최성희;남도현;이승훈;정원혁;김덕원;최홍식
    • 대한후두음성언어의학회지 / Vol. 16, No. 2 / pp. 140-145 / 2005
  • Background and Objectives: Voice perturbation measures such as jitter and shimmer have been widely used for diagnosing laryngeal dysfunction and evaluating treatment efficacy. This study was conducted to investigate the validity of a newly developed multi-channel voice analyzer program by comparing it with MDVP, PRAAT, and TF32. In addition, we compared the voice perturbation measures from the different voice analyzer programs by signal type. Materials and Methods: Nineteen patients with mild to severe dysphonia participated in the study. Fundamental frequency, jitter, and shimmer values were obtained from the different voice analyzer programs using the same sustained /ah/ phonation. Results: Fundamental frequency and shimmer were highly correlated, whereas jitter was only weakly correlated, between the newly developed multi-channel voice analyzer program and the other programs, despite differences in the pitch computation algorithm (except for MDVP). In addition, Type 2 and Type 3 signals were more weakly correlated than Type 1 signals. Conclusion: In the clinical setting, clinicians should have sufficient information about the voice analyzer and should control conditions appropriately for the severity of the pathologic voice before measuring voice perturbation, in order to obtain reliable results.
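Jitter and shimmer quantify cycle-to-cycle variation in period and amplitude. The sketch below uses the common "local" definitions: the mean absolute difference between consecutive cycles, divided by the mean value. The programs compared above (MDVP, PRAAT, TF32) may differ in detail, notably in how they extract the cycles, so this is an illustration under that assumption and not any program's exact algorithm. All function names are hypothetical.

```python
def mean_abs_consecutive_diff(values):
    """Mean absolute difference between each value and the next one."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)


def absolute_jitter_us(periods_s):
    """Absolute jitter in microseconds from cycle periods given in seconds."""
    return mean_abs_consecutive_diff(periods_s) * 1e6


def jitter_percent(periods_s):
    """Local jitter: mean consecutive period difference relative to mean period."""
    mean_period = sum(periods_s) / len(periods_s)
    return 100.0 * mean_abs_consecutive_diff(periods_s) / mean_period


def shimmer_percent(peak_amplitudes):
    """Local shimmer: the same ratio, computed on per-cycle peak amplitudes."""
    mean_amplitude = sum(peak_amplitudes) / len(peak_amplitudes)
    return 100.0 * mean_abs_consecutive_diff(peak_amplitudes) / mean_amplitude


if __name__ == "__main__":
    # Synthetic cycles alternating between 10 ms and 11 ms (illustrative only).
    periods = [0.010, 0.011, 0.010, 0.011]
    amplitudes = [1.0, 0.9, 1.0, 0.9]
    print(absolute_jitter_us(periods), jitter_percent(periods))
    print(shimmer_percent(amplitudes))
```

Both measures depend on correctly locating every cycle. That is a plausible reason for the weaker jitter agreement across programs and for the poorer agreement on Type 2 and Type 3 signals, although the abstract itself does not give a cause.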

Voice Activity Detection Based on Entropy in Noisy Car Environment

  • 노용완;이규범;이우석;홍광석
    • 융합신호처리학회논문지 / Vol. 9, No. 2 / pp. 121-128 / 2008
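  • Accurate voice activity detection greatly affects the performance of speech applications such as speech recognition, speech coding, and speech communication systems. This paper proposes a voice activity detection method for various vehicle noise environments during actual driving. Existing voice activity detection has used various methods such as time-domain energy, frequency-domain energy, zero-crossing rate, and spectral entropy, all of which have the drawback that performance degrades sharply in noisy environments. Building on the existing spectral entropy, this paper proposes voice activity detection methods using MFB (Mel-frequency Filter Banks) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. The MFB is the product of the mel scale and the FFT; the mel scale is nonlinear in frequency, following the way humans perceive sound, and it reflects the characteristics of speech well. The proposed MFB spectral entropy method improves the ability to discriminate speech from non-speech in various vehicle noise environments, and in experiments it achieved a voice activity detection rate of 93.21%. Compared with the existing spectral entropy method, the MFB-based method improved the detection rate by 3.2%.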
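Spectral entropy treats the normalized power spectrum of a frame as a probability distribution and measures how flat it is. The sketch below shows only this basic measure, which the paper builds on. It does not reproduce the proposed MFB or gradient variants. The decision rule and its threshold are hypothetical, as is the assumption that speech frames have lower entropy than noise-only frames.

```python
import math


def spectral_entropy(power_spectrum):
    """Shannon entropy (bits) of a power spectrum normalized to sum to 1."""
    total = sum(power_spectrum)
    if total <= 0.0:
        return 0.0  # silent frame: no energy to distribute
    entropy = 0.0
    for power in power_spectrum:
        if power > 0.0:
            probability = power / total
            entropy -= probability * math.log2(probability)
    return entropy


def is_speech(power_spectrum, threshold_bits):
    """Hypothetical rule: peaky (low-entropy) spectra are labeled as speech."""
    return spectral_entropy(power_spectrum) < threshold_bits


if __name__ == "__main__":
    flat_noise = [1.0] * 8  # energy spread evenly over 8 bins
    peaky_frame = [0.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # one dominant bin
    print(spectral_entropy(flat_noise), spectral_entropy(peaky_frame))
    print(is_speech(peaky_frame, threshold_bits=2.0))
```

For the MFB variant described in the abstract, the same entropy would presumably be computed over mel filter-bank outputs in place of raw FFT bins. The exact construction is not given in the abstract.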

The Influence of Noise Environment upon Voice and Data Transmission in the RF-CBTC System

  • Kim, Min-Seok;Lee, Sang-Hyeok;Lee, Jong-Woo
    • International Journal of Railway / Vol. 3, No. 2 / pp. 39-45 / 2010
  • The RF-CBTC (Radio Frequency-Communication Based Train Control) system is a communication system used in railroad systems. It communicates wirelessly between the wayside device and the on-board device. The wayside device collects the location and speed of each train and transmits to it the distance from the train ahead to the speed-limit position. The on-board control device sets the optimum speed for the train. In the RF-CBTC system used in Korea, the transmission frequency is 2.4 GHz, which lies in the ISM (Industrial, Scientific and Medical) band, and voice and data are transmitted using the CDMA (Code Division Multiple Access) method. Noise therefore arises in the AWGN (Additive White Gaussian Noise) and fading environment. Currently, the SNR (Signal to Noise Ratio) is about 20 dB, so bit errors caused by noise make it difficult to transmit reliable information to the train. Also, when two tracks run in a single direction, two trains need to transmit reliable voice and data to a wayside device. However, because of noise, it is difficult for even a single train to transmit reliable information. In this paper, we estimated the BER (Bit Error Rate) as a function of the SNR for voice and data transmission in AWGN and fading environments in the RF-CBTC system using the CDMA method. We also proposed the SNR required to meet the BER standard for voice and data transmission. By increasing the processing gain, which is the ratio of the chip rate to the voice and data transmission rate, we made voice and data transmission possible from up to two trains to a wayside device, and demonstrated this using a Matlab program.
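The relationship between SNR, processing gain, and BER can be illustrated with textbook formulas. The sketch below assumes coherent BPSK over an AWGN channel, for which the bit error rate is 0.5 * erfc(sqrt(Eb/N0)). The abstract does not state the modulation, the chip rate, or the data rate, so the function names and every number in the usage example are hypothetical, and fading is not modeled.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def processing_gain_db(chip_rate_hz, data_rate_bps):
    """Processing gain in dB: ratio of chip rate to information bit rate."""
    return 10.0 * math.log10(chip_rate_hz / data_rate_bps)


def bpsk_ber_awgn(ebn0_db):
    """Theoretical BER of coherent BPSK in AWGN (an assumed modulation)."""
    return 0.5 * math.erfc(math.sqrt(db_to_linear(ebn0_db)))


if __name__ == "__main__":
    # Hypothetical rates: 128 chips per information bit.
    gain_db = processing_gain_db(chip_rate_hz=1_280_000, data_rate_bps=10_000)
    print("processing gain (dB):", gain_db)
    for ebn0_db in (0.0, 5.0, 10.0):
        print(ebn0_db, bpsk_ber_awgn(ebn0_db))
```

Raising the processing gain spreads each bit over more chips, which raises the effective per-bit SNR after despreading. This is the mechanism the abstract relies on to support a second train on the same link.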