• Title/Abstract/Keywords: Pathological Voice

38 search results; processing time 0.019 s

Classification of Pathological Voice Using Artificial Neural Network with Normalized Parameters

  • Li, Tao;Bak, Il-Suh;Jo, Cheol-Woo
    • Speech Sciences
    • /
    • Vol. 11, No. 1
    • /
    • pp.21-29
    • /
    • 2004
  • In this paper, we examined the effect of normalization on discriminating voices into normal and abnormal (pathological) classes using an artificial neural network. The average value of each parameter was used to normalize that parameter's set of values, and artificial neural networks served as the classifiers. The effect of normalization was evaluated by comparing the discrimination results obtained with the original and the normalized parameter sets.
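
The normalization step described in this abstract can be illustrated with a minimal sketch. It assumes that "normalizing by the average" means dividing each parameter by its mean over all samples; the abstract does not give the exact formula, and the function name and sample values below are hypothetical.

```python
from statistics import mean

def normalize_by_mean(samples):
    """Divide each parameter (column) by its average over all samples."""
    n_params = len(samples[0])
    averages = [mean(row[j] for row in samples) for j in range(n_params)]
    return [[row[j] / averages[j] for j in range(n_params)] for row in samples]

# Hypothetical values, not data from the paper: each row is one voice
# sample, and the columns stand for three acoustic parameters.
raw = [[0.5, 2.0, 20.0],
       [1.5, 6.0, 10.0]]
normalized = normalize_by_mean(raw)
```

After this scaling every parameter has an average of 1, so parameters measured in very different units contribute on a comparable scale to a classifier's inputs.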

Classification of Pathological and Normal Voice Based on Dimension Reduction of Feature Vectors

  • 이지연;정상배;최홍식;한민수
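    • The Korean Society of Phonetic Sciences and Speech Technology: Conference Proceedings
    • /
    • Proceedings of the 2007 Joint Conference of the Korean Society of Phonetic Sciences and Speech Technology and the Korean Association of Speech Sciences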
    • /
    • pp.123-126
    • /
    • 2007
  • This paper proposes a method to improve the performance of pathological/normal voice classification. The effectiveness of mel-frequency filter bank energies (FBEs) is analyzed using the Fisher discriminant ratio (FDR). Mel-frequency cepstral coefficients (MFCCs) and feature vectors obtained by applying a linear discriminant analysis (LDA) transformation to the FBEs are then implemented. The results show that a Gaussian mixture model (GMM) based on the LDA-transformed FBEs discriminates pathological from normal voices better than the MFCC-based GMM.
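
As an illustration of the FDR analysis mentioned in this abstract, the sketch below uses the common two-class, single-feature definition, FDR = (μ₁ − μ₂)² / (σ₁² + σ₂²). Whether the paper uses exactly this form is an assumption, and the band names and energy values are hypothetical.

```python
from statistics import mean, pvariance

def fisher_discriminant_ratio(class_a, class_b):
    """FDR of one feature: squared mean gap over the summed class variances."""
    gap = mean(class_a) - mean(class_b)
    return gap * gap / (pvariance(class_a) + pvariance(class_b))

# Hypothetical log filter bank energies: (normal voices, pathological voices)
bands = {
    "band_1": ([1.0, 2.0, 3.0], [5.0, 6.0, 7.0]),
    "band_2": ([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]),
}
fdr = {name: fisher_discriminant_ratio(a, b) for name, (a, b) in bands.items()}

# Bands sorted from most to least discriminative
ranked = sorted(fdr, key=fdr.get, reverse=True)
```

A band whose class means are far apart relative to the within-class spread gets a high FDR, which is the sense in which the ratio measures how useful a filter bank energy is for separating the two classes.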

Classification of Pathological Voice Signal with Severe Noise Component

  • Li, Ta-O;Jo, Cheol-Woo
    • Speech Sciences
    • /
    • Vol. 10, No. 4
    • /
    • pp.107-115
    • /
    • 2003
  • In this paper, we attempted to classify pathological voice signals with a severe noise component using two parameters: the spectral slope and the ratio of energies in the harmonic and noise components (HNR). The spectral slope is obtained by a curve-fitting method, and the HNR is computed in the cepstral quefrency domain. Speech data from normal speakers and patients were collected, diagnosed, and divided into three classes (normal, relatively less noisy, and severely noisy). The means and standard deviations of the spectral slope and the HNR were computed and compared across the three classes in order to characterize the severely noisy pathological voice signals and distinguish them from the others.
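
The two parameters named in this abstract can be sketched as follows. The sketch assumes that the curve fit is a least-squares straight line through the spectrum in dB and that the HNR is reported in decibels; neither detail is stated in the abstract, and the cepstrum-based separation of harmonic and noise energy is not reproduced here. All names and values are hypothetical.

```python
import math

def spectral_slope(freqs_hz, levels_db):
    """Slope (dB per Hz) of the least-squares straight line through the spectrum."""
    n = len(freqs_hz)
    f_mean = sum(freqs_hz) / n
    l_mean = sum(levels_db) / n
    num = sum((f - f_mean) * (l - l_mean) for f, l in zip(freqs_hz, levels_db))
    den = sum((f - f_mean) ** 2 for f in freqs_hz)
    return num / den

def hnr_db(harmonic_energy, noise_energy):
    """Harmonic-to-noise energy ratio expressed in decibels."""
    return 10.0 * math.log10(harmonic_energy / noise_energy)

# Hypothetical spectrum that falls by 6 dB for every 1000 Hz
freqs = [0.0, 1000.0, 2000.0, 3000.0]
levels = [0.0, -6.0, -12.0, -18.0]
slope_db_per_khz = spectral_slope(freqs, levels) * 1000.0

# Hypothetical energies: the harmonic part carries 100 times the noise energy
example_hnr = hnr_db(100.0, 1.0)
```

Computing these two values per recording and then comparing their means and standard deviations per class mirrors the comparison the abstract describes.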

Performance Assessment of Several Established Pitch Detection Algorithms in Voices of Benign Vocal Fold Lesions

  • 장승진;최성희;김효민;최홍식;윤영로
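    • The Institute of Electronics Engineers of Korea: Conference Proceedings
    • /
    • Proceedings of the 2007 Summer Conference of the Institute of Electronics Engineers of Korea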
    • /
    • pp.407-408
    • /
    • 2007
  • Robust pitch estimation is important in many areas of speech processing. In voice pathology, various statistics extracted from pitch are commonly used to assess voice quality. In this study, we compared several established pitch detection algorithms (PDAs) to verify their adequacy. Using a database of 99 pathological voices and 30 normal voices, pitch detection errors were analyzed between pathological and normal voices and among types of benign vocal fold lesions (polyps, nodules, and cysts). The results indicate that the severity of the tested voice must be taken into account in order to obtain accurate pitch estimates.
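
For readers unfamiliar with PDAs, the sketch below shows one classic approach, autocorrelation peak picking. The abstract does not list which algorithms were compared, so this is offered only as a generic example; the function name, search range, and test tone are hypothetical.

```python
import math

def estimate_f0_autocorr(signal, sample_rate, f0_min, f0_max):
    """Return the F0 (Hz) whose lag maximizes the averaged autocorrelation."""
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        n_terms = len(signal) - lag
        # Average product of the signal with a copy of itself shifted by `lag`
        score = sum(signal[i] * signal[i + lag] for i in range(n_terms)) / n_terms
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Hypothetical test signal: a clean 200 Hz sine sampled at 8 kHz for 0.1 s
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]

# The search range is kept narrower than one octave around the expected
# pitch; a wider range can lock onto a multiple of the true period.
f0 = estimate_f0_autocorr(tone, sample_rate, f0_min=150.0, f0_max=400.0)
```

On a clean periodic signal the peak is unambiguous. Irregular or noisy phonation flattens and shifts the autocorrelation peaks, which is one reason pitch estimates degrade as voice severity increases.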

Separation of Periodic and Aperiodic Components of Pathological Speech Signal

  • 조철우;리타오
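    • The Korean Society of Phonetic Sciences and Speech Technology: Conference Proceedings
    • /
    • Proceedings of the October 2003 Conference of the Korean Society of Phonetic Sciences and Speech Technology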
    • /
    • pp.25-28
    • /
    • 2003
  • The aim of this paper is to analyze pathological voice by separating the signal into periodic and aperiodic parts. The separation was performed recursively on the residual signal of the voice signal. Starting from an initial estimate of the aperiodic part of the spectrum, the aperiodic part is determined by an extrapolation method, and the periodic part is obtained by subtracting the aperiodic part from the original spectrum. An HNR parameter is derived from this separation. Statistics of the parameter values are compared with those of jitter and shimmer for normal, benign, and malignant cases.
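
Jitter and shimmer, the two reference measures in this abstract, can be sketched with the widely used "local" definitions: the mean absolute difference between consecutive cycles divided by the mean value, in percent. The abstract does not say which variant the authors used, so the definition, function name, and values below are assumptions.

```python
def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean."""
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))

# Hypothetical glottal cycle measurements, not data from the paper
periods_ms = [5.0, 5.1, 4.9, 5.0]       # cycle lengths  -> jitter
peak_amplitudes = [1.0, 0.9, 1.1, 1.0]  # cycle peaks    -> shimmer

jitter_percent = relative_perturbation(periods_ms)
shimmer_percent = relative_perturbation(peak_amplitudes)
```

Both measures depend on locating individual cycles, whereas an HNR derived from a periodic/aperiodic spectral separation does not, which makes the comparison between them informative for strongly irregular voices.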

Implementation of Voice Source Simulator Using Simulink

  • 조철우;김재희
    • Phonetics and Speech Sciences
    • /
    • Vol. 3, No. 2
    • /
    • pp.89-96
    • /
    • 2011
  • This paper discusses the design and implementation of a voice source simulator built with Simulink and Matlab. The simulator follows a model-based design concept. Voice sources can be analyzed and manipulated through various factors by choosing options in the GUI and by selecting predefined or user-created blocks. A simulation tool of this kind can simplify the analysis of speech signals for purposes such as voice quality analysis, pathological voice analysis, and speech coding. Basic analysis functions are also provided for comparing the original signal with the manipulated ones.

The Effect on Intervention Program and Auditory-Perceptual Discrimination Feature of Postlingual Cochlear Implant Adults about Pathological Voice

  • 배인호;김근효;이연우;박희준;김진동;이일우;권순복
    • Phonetics and Speech Sciences
    • /
    • Vol. 7, No. 2
    • /
    • pp.9-17
    • /
    • 2015
  • In the present study, we investigated the auditory-perceptual recognition of voice quality in postlingually deafened adult cochlear implant (CI) users and proposed a training program to improve within-subject reliability. A prospective case-control study was conducted with 7 postlingually deaf adults who had received CI surgery and 10 normal-hearing controls. The pre-test, post-test, and training program used the parameters of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) with pathological voice samples presented using Alvin. In the pre-post comparison used to monitor improvement in the listeners' internal reliability through the training program, there were statistically significant differences for both test and group. Internal reliability differed significantly between pre- and post-test in the normal-hearing group, but not in the CI group. The study found that CI adults were less able to perceive voice quality than the normal-hearing group. The training program also improved pitch and loudness in CI adults.

Classification of Pathological Voice from ARS using Neural Network

  • 조철우;김광인;김대현;권순복;김기련;김용주;전계록;왕수건
    • Speech Sciences
    • /
    • Vol. 8, No. 2
    • /
    • pp.61-71
    • /
    • 2001
  • Speech material collected through an ARS (Automatic Response System) was analyzed and classified into disease and non-disease states. The material includes 11 different kinds of diseases. Along with the ARS speech, DAT (Digital Audio Tape) speech was collected in parallel to provide a benchmark. The speech material was analyzed with tools developed in a local laboratory in order to obtain improved and robust parameter estimates. A multi-layered neural network was used to classify the speech into disease and non-disease classes. Three parameter combinations (3, 6, and 12 parameters) were tested to determine the proper network size and to find the best performance. In the experiment, a classification rate of 92.5% was obtained.

An Aerodynamic and Acoustic Analysis of the Breathy Voice of Thyroidectomy Patients

  • 강영애;윤규철;김재옥
    • Phonetics and Speech Sciences
    • /
    • Vol. 4, No. 2
    • /
    • pp.95-104
    • /
    • 2012
  • Thyroidectomy patients may have vocal paralysis or paresis, resulting in a breathy voice. The aim of this study was to investigate the aerodynamic and acoustic characteristics of breathy voice in thyroidectomy patients. Thirty-five subjects who had vocal paralysis after thyroidectomy participated in this study. Based on perceptual judgements by three speech pathologists and one phonetic scholar, the subjects were divided into two groups: a breathy voice group (n = 21) and a non-breathy voice group (n = 14). Aerodynamic analysis was conducted using three tasks (Voicing Efficiency, Maximum Sustained Phonation, Vital Capacity), and acoustic analysis was performed on the Maximum Sustained Phonation task. The breathy voice group had significantly higher subglottal pressure and more pathological voice characteristics than the non-breathy voice group. Logistic regression on the aerodynamic measures yielded 94.1% classification accuracy; the predictor parameters for breathiness were maximum sound pressure level, sound pressure level range, and phonation time from the Maximum Sustained Phonation task, and pitch range, peak air pressure, and mean peak air pressure from the Voicing Efficiency task. Logistic regression on the acoustic measures yielded 88.6% classification accuracy, with five frequency perturbation parameters as predictors. Vocal paralysis creates air turbulence at the glottis, which causes frequency-related parameters to fluctuate and increases aspiration in high-frequency regions. These changes determine perceptual breathiness.

Speech Intelligibility of Alaryngeal Voices and Pre/Post Operative Evaluation of Voice Quality using the Speech Recognition Program (HUVOIS)

  • 김한수;최성희;김재인;임재열;최홍식
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • /
    • Vol. 15, No. 2
    • /
    • pp.92-97
    • /
    • 2004
  • Background and Objectives: The purpose of this study was to objectively evaluate pre- and postoperative voice quality and the intelligibility of alaryngeal voice using the speech recognition program HUVOIS. Materials and Methods: Two laryngologists and one speech pathologist rated 'G', 'R', and 'B' of the GRBAS scale and rated speech intelligibility on the NTID rating scale, using a standard paragraph. Acoustic estimates such as jitter, shimmer, and HNR were also obtained with Lx Speech Studio. Results: For the pathological voice samples, the speech recognition rate did not differ significantly between pre- and post-operation, although voice quality (G, B) and acoustic values (jitter, HNR) improved significantly after the operation. Among the alaryngeal voices, the reed-type electrolarynx 'Moksori' scored highest in both speech intelligibility and speech recognition rate, whereas esophageal speech scored lowest. A correlation between speech intelligibility and speech recognition rate was found in alaryngeal voices but not in pathological voices. Conclusion: The present study did not demonstrate that the speech recognition program HUVOIS, used during a telephone program, is an objective and efficient method for assisting the subjective GRBAS scale.
