Search | Korea Science

Classification of Pathological Voice Using Artigicial Neural Network with Normalized Parameters

Li, Tao;Bak, Il-Suh;Jo, Cheol-Woo
- Speech Sciences
- /
- v.11 no.1
- /
- pp.21-29
- /
- 2004
In this paper we examined the effect of normalization on discriminating the pathological voice into normal and abnormal classes using artificial neural network. Average values per each parameter were used to normalize each set of parameter values. Artificial neural networks were used as classifiers. And the effect of normalization was evaluated by comparing the discrimination results between original and normalized parameter sets.
PDF

Classification of pathological and normal voice based on dimension reduction of feature vectors (피처벡터 축소방법에 기반한 장애음성 분류)

Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
- Proceedings of the KSPS conference
- /
- 2007.05a
- /
- pp.123-126
- /
- 2007
This paper suggests a method to improve the performance of the pathological/normal voice classification. The effectiveness of the mel frequency-based filter bank energies using the fisher discriminant ratio (FDR) is analyzed. And mel frequency cepstrum coefficients (MFCCs) and the feature vectors through the linear discriminant analysis (LDA) transformation of the filter bank energies (FBE) are implemented. This paper shows that the FBE LDA-based GMM is more distinct method for the pathological/normal voice classification than the MFCC-based GMM.
PDF

Classification of Pathological Voice Signal with Severe Noise Component

Li, Ta-O;Jo, Cheol-Woo
- Speech Sciences
- /
- v.10 no.4
- /
- pp.107-115
- /
- 2003
In this paper we tried to classify the pathological voice signal with severe noise component based on two different parameters, the spectral slope and the ratio of energies in the harmonic and noise components (HNR), The spectral slope is obtained by using a curve fitting method and the HNR is computed in cepstrum quefrency domain. Speech data from normal peoples and patients are collected, diagnosed and divided into three different classes (normal, relatively less noisy and severely noisy data), The mean values and the standard deviations of the spectral slope and the HNR are computed and compared with in the three kinds of data to characterize and classify the severely noisy pathological voice signals from others.
PDF

Performance Assessment of Several Established Pitch Detection Algorithms in Voices of Benign Vocal Fold Lesions (양성후두 질환 음성에 대한 여러 기존 피치검출 알고리즘의 성능 평가)

Jang, Seung-Jin;Choi, Seong-Hee;Kim, Hyo-Min;Choi, Hong-Shik;Yoon, Young-Ro
- Proceedings of the IEEK Conference
- /
- 2007.07a
- /
- pp.407-408
- /
- 2007
Robust pitch estimation is an important study in many areas of speech processing. In voice pathology, diverse statistics extracted form pitch were commonly used to test voice quality. In this study, we compared several established pitch detection algorithms (PDAs) for verification of adequacy of the PDAs. In the database of total pathological voices of 99 and normal voices of 30, an analysis of errors related with pitch detection was evaluated between pathological and normal voices, or among the types of pathological voices such as benign vocal fold lesions; polyp, nodule, and cysts. Consequently, it is required to survey the severity of tested voice in order to obtain accurate pitch estimates.
PDF

Separation of Periodic and Aperiodic Components of Pathological Speech Signal (장애음성의 주기성분과 잡음성분의 분리 방법에 관하여)

Jo Cheolwoo;Li Tao
- Proceedings of the KSPS conference
- /
- 2003.10a
- /
- pp.25-28
- /
- 2003
The aim of this paper is to analyze the pathological voice by separating signal into periodic and aperiodic part. Separation was peformed recursively from the residual signal of voice signal. Based on initial estimation of aperiodic part of spectrum, aperiodic part is decided from the extrapolation method. Periodic part is decided by subtracting aperiodic part from the original spectrum. A parameter HNR is derived based on the separation. Parameter value statistics are compared with those of Jitter and Shimmer for normal, benign and malignant cases.
PDF

Implementation of Voice Source Simulator Using Simulink (Simulink를 이용한 음원모델 시뮬레이터 구현)

Jo, Cheol-Woo;Kim, Jae-Hee
- Phonetics and Speech Sciences
- /
- v.3 no.2
- /
- pp.89-96
- /
- 2011
In this paper, details of the design and implementation of a voice source simulator using Simulink and Matlab are discussed. This simulator is an implementation by model-based design concept. Voice sources can be analyzed and manipulated through various factors by choosing options from GUI input and selecting pre-defined blocks or user created ones. This kind of simulation tool can simplify the procedure of analyzing speech signals for various purposes such as voice quality analysis, pathological voice analysis, and speech coding. Also, basic analysis functions are supported to compare the original signal and the manipulated ones.
PDF

The Effect on Intervention Program and Auditory-Perceptual Discrimination Feature of Postlingual Cochlear Implant Adults about Pathological Voice (병리적 음성에 대한 언어습득 이후 인공와우이식 성인의 청지각적 변별특성과 중재 프로그램의 효과)

Bae, Inho;Kim, Geunhyo;Lee, Yeonwoo;Park, Heejune;Kim, Jindong;Lee, Ilwoo;Kwon, Soonbok
- Phonetics and Speech Sciences
- /
- v.7 no.2
- /
- pp.9-17
- /
- 2015
In the present study, we investigated ability of recognition of auditory perception with regards to the quality of voice in postlingual CI adults and proposed a training program to improve within subject reliability. A prospective case-control study was conducted in adults with 7 postlingual deaf who received a CI surgery and 10 normal hearing controls. The pre and post test and training program included parameters of consensus auditory-perceptual evaluation of voice(CAPE-V) with pathological voice sample by using Alvin. In results of pre-post test for monitoring improvements of internal reliability for listeners via the training program, there was statistically significant difference in both test and group. There was statistically significant difference in internal reliability between pre-post test in the normal hearing group, the result was no significant in the CI group. The present study found that CI adults showed less ability in awareness of voice quality compared to normal hearing group. Also the training program improved pitch and loudness in CI adults.
https://doi.org/10.13064/KSSS.2015.7.2.009 인용 PDF KSCI

Classification of Pathological Voice from ARS using Neural Network (신경회로망을 이용한 ARS 장애음성의 식별에 관한 연구)

Jo, C.W.;Kim, K.I.;Kim, D.H.;Kwon, S.B.;Kim, K.R.;Kim, Y.J.;Jun, K.R.;Wang, S.G.
- Speech Sciences
- /
- v.8 no.2
- /
- pp.61-71
- /
- 2001
Speech material, which is collected from ARS(Automatic Response System), was analyzed and classified into disease and non-disease state. The material include 11 different kinds of diseases. Along with ARS speech, DAT(Digital Audio Tape) speech is collected in parallel to give the bench mark. To analyze speech material, analysis tools, which is developed local laboratory, are used to provide an improved and robust performance to the obtained parameters. To classify speech into disease and non-disease class, multi-layered neural network was used. Three different combinations of 3, 6, 12 parameters are tested to obtain the proper network size and to find the best performance. From the experiment, the classification rate of 92.5% was obtained.
PDF

An Aerodynamic and Acoustic Analysis of the Breathy Voice of Thyroidectomy Patients (갑상선 수술 후 성대마비 환자의 기식 음성에 대한 공기역학적 및 음향적 분석)

Kang, Young-Ae;Yoon, Kyu-Chul;Kim, Jae-Ock
- Phonetics and Speech Sciences
- /
- v.4 no.2
- /
- pp.95-104
- /
- 2012
Thyroidectomy patients may have vocal paralysis or paresis, resulting in a breathy voice. The aim of this study was to investigate the aerodynamic and acoustic characteristics of a breathy voice in thyroidectomy patients. Thirty-five subjects who have vocal paralysis after thyroidectomy participated in this study. According to perceptual judgements by three speech pathologists and one phonetic scholar, subjects were divided into two groups: breathy voice group (n = 21) and non-breathy voice group (n = 14). Aerodynamic analysis was conducted by three tasks (Voicing Efficiency, Maximum Sustained Phonation, Vital Capacity) and acoustic analysis was measured during Maximum Sustained Phonation task. The breathy voice group had significantly higher subglottal pressure and more pathological voice characteristics than the non breathy voice group. Showing 94.1% classification accuracy in result logistic regression of aerodynamic analysis, the predictor parameters for breathiness were maximum sound pressure level, sound pressure level range, phonation time of Maximum Sustained Phonation task and Pitch range, peak air pressure, and mean peak air pressure of Voicing Efficiency task. Classification accuracy of acoustic logistic regression was 88.6%, and five frequency perturbation parameters were shown as predictors. Vocal paralysis creates air turbulence at the glottis. It fluctuates frequency-related parameters and increases aspiration in high frequency areas. These changes determine perceptual breathiness.
https://doi.org/10.13064/KSSS.2012.4.2.095 인용 PDF

Speech Intelligibility of Alaryngeal Voices and Pre/Post Operative Evaluation of Voice Quality using the Speech Recognition Program(HUVOIS) (음성인식프로그램을 이용한 무후두 음성의 말 명료도와 병적 음성의 수술 전후 개선도 측정)

Kim, Han-Su;Choi, Seong-Hee;Kim, Jae-In;Lee, Jae-Yol;Choi, Hong-Shik
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
- /
- v.15 no.2
- /
- pp.92-97
- /
- 2004
Background and Objectives : The purpose of this study was to examine objectively pre and post operative voice quality evaluation and intelligibility of alaryngeal voice using speech recognition program, HUVOIS. Materials and Methods : 2 laryngologists and 1 speech pathologist were evaluated 'G', 'R', 'B' in the GRBAS sclae and speech intelligibility using NTID rating scale from standard paragraph. And also acoustic estimates such as jitter, shimmer, HNR were obtained from Lx Speech Studio. Results : Speech recognition rate was not significantly different between pre and post operation for pathological vocie samples though voice quality(G, B) and acoustic values(Jitter, HNR) were significantly improved after post operation. In Alaryngeal voices, reed type electrolarynx 'Moksori' was the highest both speech intelligibility and speech recognition rate, whereas esophageal speech was the lowest. Coefficient correlation of speech intelligibility and speech recognition rate was found in alaryngeal voices, but not in pathological voices. Conclusion : Current study was not proved speech recognition program, HUVOIS during telephone program was not objective and efficient method for assisting subjective GRBAS scale.
PDF

Search Result 38, Processing Time 0.021 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)