Integrated Search | Korea Science

Discrimination of Pathological Speech Using Hidden Markov Models

Wang, Jianglin;Jo, Cheol-Woo
- Speech Sciences (음성과학), Vol. 13, No. 3, pp. 7-18, 2006
Diagnosis of pathological voice is an important issue in biomedical applications of speech technology. This study focuses on discriminating voice disorders with hidden Markov models (HMMs), automatically distinguishing normal voice from voice affected by vocal fold disorder. The method is non-invasive, inexpensive and fully automated, and requires only a speech sample from the subject. Speech data were collected from normal speakers and from patients. Mel-frequency cepstral coefficients (MFCCs) were modeled with an HMM classifier. Left-to-right HMMs with 3, 5 and 7 states and 3 mixtures were built. The method achieves an accuracy of 93.8% on training data and 91.7% on test data in discriminating normal voice from vocal fold disorder voice for sustained /a/.
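As an illustration of the left-to-right topology mentioned in the abstract, the sketch below builds transition matrices for 3, 5 and 7 states. It is a minimal sketch, not the authors' implementation: the function name `left_to_right_transitions` and the self-loop probability of 0.5 are assumptions chosen for illustration, and in practice such values would be re-estimated from the MFCC training data.

```python
def left_to_right_transitions(n_states, stay_prob=0.5):
    """Return an n_states x n_states transition matrix for a left-to-right HMM.

    Each state may only stay where it is or advance to the next state.
    stay_prob is a hypothetical starting value, not taken from the paper.
    """
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            # The final state can only loop on itself.
            matrix[i][i] = 1.0
        else:
            matrix[i][i] = stay_prob
            matrix[i][i + 1] = 1.0 - stay_prob
    return matrix


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(f"{n}-state left-to-right HMM:")
        for row in left_to_right_transitions(n):
            print("  ", row)
```

Because all entries below the diagonal are zero, the model can never move back to an earlier state, which is the constraint that defines a left-to-right HMM.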

Acoustic Characteristics of the Voices of Korean Normal Adults by Gender on MDVP

김재옥
- Phonetics and Speech Sciences (말소리와 음성과학), Vol. 1, No. 4, pp. 147-157, 2009
The purpose of this study was to develop a normal voice database and to analyze the acoustic characteristics of Korean adults' voices by gender using MDVP. Eight categories covering the 34 MDVP parameters were analyzed in the voices of 170 normal Korean adults producing the vowel /a/. Among them, the Fundamental Frequency Parameters and Frequency Perturbation Parameters differed significantly by gender. In addition, the Fundamental Frequency Parameters in our data differed markedly from the reference data supplied with the MDVP program that is currently used in clinics. The data obtained in this study can therefore serve as standard MDVP parameter values for diagnosing voice disorders in Korean adults.

Voice Activity Detection employing the Generalized Normal-Laplace Distribution

김상균;권장우;이상민
- Journal of Korea Multimedia Society (한국멀티미디어학회논문지), Vol. 17, No. 3, pp. 294-299, 2014
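This paper proposes a voice activity detection algorithm based on the generalized normal-Laplace distribution. The proposed algorithm represents the probability density function of the noisy speech signal with the generalized normal-Laplace distribution, and then estimates the variances of speech and noise under that distribution using higher order moments. The proposed algorithm was compared with existing voice activity detectors under a variety of noise conditions and showed improved performance.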
https://doi.org/10.9717/kmms.2014.17.3.294

Validity and Reliability of Korean-Version of Voice Handicap Index and Voice-Related Quality of Life

김재옥;임성은;박선영;최성희;최재남;최홍식
- Speech Sciences (음성과학), Vol. 14, No. 3, pp. 111-125, 2007
Assessing voice disorders requires patients' subjective evaluations as well as objective measures and clinicians' ratings. This study evaluated the validity and reliability of the Korean versions of the Voice Handicap Index (KVHI) and Voice-Related Quality of Life (KVQOL) in 113 adults with voice disorders and 111 normal adults. Content validity was verified by three experienced speech-language pathologists. Concurrent validity was examined through the correlations among KVHI, KVQOL, and the Voice Rating Scale, as well as through item discrimination coefficients. Total KVHI and KVQOL scores of adults with voice disorders differed significantly from those of normal adults. Test-retest reliability and internal consistency were significantly high for both KVHI and KVQOL. Correlations between each subscale score and the total score were also significantly high in each tool. The study shows that KVHI and KVQOL are suitable for clinical and research use in Korea as tools for subjectively evaluating the effects of voice disorders on daily life and on quality of life.

A Study on the Characteristics of Phonation Threshold Pressure and Phonation Threshold Airflow of Patients with Functional Voice Disorder

이인애;윤주원;황영진
- Phonetics and Speech Sciences (말소리와 음성과학), Vol. 5, No. 1, pp. 63-69, 2013
This study investigated the characteristics of Phonation Threshold Pressure and Phonation Threshold Airflow in patients with functional voice disorder. 50 subjects participated in the study (32 subjects were patients who had functional voice disorders and 20 subjects were normal adults). The PAS (Phonatory Aerodynamic System, model 6600, KAY Electronics, Inc.) was used for measurement and analysis. Phonation Threshold Pressure was measured with the Voicing Efficiency protocol of the PAS, and Phonation Threshold Airflow was measured with the Maximum Sustained Phonation protocol. These protocols were chosen because of the ease of phonation. The results showed that the differences in Phonation Threshold Pressure and Phonation Threshold Airflow between patients with functional voice disorder and normal adults could serve as a significant index. Patients with functional voice disorder showed higher values than normal adults. These results suggest that Phonation Threshold Pressure and Phonation Threshold Airflow are very useful in diagnosing voice disorders. The measured data also provided useful information for diagnosing patients with vocal fold diseases.
https://doi.org/10.13064/KSSS.2013.5.1.063

The Comparison of the Acoustic and Aerodynamic Characteristics of PROVOX® Voice and Esophageal Voice Produced by the Same Laryngectomee

표화영;최홍식;임성은;최성희
- Speech Sciences (음성과학), Vol. 5, No. 1, pp. 121-139, 1999
Our experimental subject was a laryngectomee who had undergone total laryngectomy with PROVOX® insertion and learned esophageal speech after the surgery, so he could produce both PROVOX® voice and esophageal voice. Using this subject's PROVOX® and esophageal voice, we compared the acoustic and aerodynamic characteristics of the two voices under the same physical conditions of the same person. The fundamental frequency of esophageal voice was 137.2 Hz, and that of PROVOX® voice was 97.5 Hz. PROVOX® voice showed lower jitter, shimmer and NHR than esophageal voice, which means that PROVOX® voice had better voice quality than esophageal voice. In spectrographic analysis, the formation of formants and pseudoformants was more distinct in esophageal voice, and several temporal acoustic features such as VOT and closure duration were closer to normal voice in PROVOX® voice. During sentence utterance, esophageal voice showed longer pause or silence durations than PROVOX® voice. Maximum phonation time and mean flow rate of PROVOX® voice were much longer and larger, respectively, than those of esophageal voice, but the mean and range of sound pressure level, subglottic pressure and voice efficiency were similar in the two voices. Glottal resistance of esophageal voice was much larger than that of PROVOX® voice, which in turn showed larger glottal resistance than normal voice.
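For readers unfamiliar with the perturbation measures named in this abstract, the sketch below computes a commonly used relative form of jitter (cycle-to-cycle period variation) and shimmer (cycle-to-cycle amplitude variation). It is a minimal sketch under assumptions: the function name and the sample values are hypothetical, and the formula shown is one common textbook definition, which may differ from the exact measures the analysis software used in the study.

```python
def relative_perturbation_percent(values):
    """Mean absolute difference between consecutive values, as a percentage
    of the mean value.

    Applied to glottal cycle periods this gives a local jitter measure;
    applied to cycle peak amplitudes it gives a local shimmer measure.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


if __name__ == "__main__":
    # Hypothetical cycle periods (ms) and peak amplitudes, for illustration only.
    periods_ms = [10.0, 10.2, 9.8, 10.0]
    amplitudes = [1.00, 0.95, 1.05, 1.00]
    print("jitter  (%):", relative_perturbation_percent(periods_ms))
    print("shimmer (%):", relative_perturbation_percent(amplitudes))
```

Lower values indicate a more regular vibration pattern, which is why lower jitter and shimmer are read as better voice quality.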

The Effects of Vocal Loudness on Nasalance Measures of Normal Adults

이수정;고도흥
- Speech Sciences (음성과학), Vol. 10, No. 2, pp. 191-203, 2003
This study examined the effect of vocal loudness on nasalance measures under three sentence patterns (Oral sentences, Mixed sentences and Nasal sentences). Vocal loudness was classified into soft voice (55 dB), medium voice (65 dB) and loud voice (75 dB). The participants were 30 normal adults (male:female = 1:1). Kay's Nasometer 6200 was used to measure nasalance, and a sound level meter was used to adjust the loudness level. The results were as follows. First, nasalance changed with vocal loudness. For the Oral sentence stimuli, the loud voice showed the highest nasalance and the medium voice the lowest for both male and female participants. For the Mixed and Nasal sentence stimuli, however, male participants showed the highest nasalance in the soft voice and the lowest in the loud voice, while female participants showed the highest nasalance in the soft voice and the lowest in the medium voice. Second, when each subject's nasalance scores were ranked in order, a noticeable tendency emerged: during reading of the Nasal sentences, the lowest nasalance score occurred in the loud voice and the highest in the soft voice. No such pattern was apparent in the Oral sentences. It is assumed that velopharyngeal function could be related to these findings. Furthermore, the findings associated with vocal loudness may have diagnostic as well as clinical implications.

Characteristics of Connected Speech in ADSD

황연신;김재옥;최홍식
- Phonetics and Speech Sciences (말소리와 음성과학), Vol. 1, No. 1, pp. 93-98, 2009
The aim of this study was to investigate the voice characteristics of adductor spasmodic dysphonia (ADSD) through electroglottographic and acoustic examination at the sentence level. Recordings of 86 female ADSD patients (aged 20-50 years) and 86 normal female controls (aged 20-40 years) were made with Speech Studio (Laryngograph Ltd., UK). An independent t-test was used to compare the ADSD and normal groups. The results were as follows. (1) Fundamental frequency (F0) was significantly lower in the ADSD group than in the normal group. (2) Irregularity of frequency and closed quotient (CQ) were significantly higher in the ADSD group. (3) Voiceless duration was increased and voiced duration was significantly decreased in the ADSD group. (4) Fricative duration was increased in the ADSD group, but the difference was not significant. In conclusion, a strained, tight and choked voice shows an increase in CQ, a tremulous voice shows an increase in irregularity of frequency, and a less feminine voice shows a decrease in F0. The increase in voiceless and fricative duration and the decrease in voiced duration are related to diminished speech intelligibility.

A Study on Pitch Perception of Normal Korean

정옥란;김형순;김영태;서장수
- Speech Sciences (음성과학), Vol. 1, pp. 315-323, 1997
This study attempts to determine the fundamental frequency level of male and female voices that Koreans perceive as normal. Seventy-three college students majoring in Speech Pathology participated in the study on a voluntary basis. The subjects listened to a male voice with fundamental frequencies of 60 Hz, 80 Hz, 100 Hz, 120 Hz, 140 Hz, 160 Hz, 180 Hz, and 200 Hz, and a female voice with fundamental frequencies of 140 Hz, 160 Hz, 180 Hz, 200 Hz, 220 Hz, 240 Hz, 260 Hz, and 280 Hz. The PSOLA (Pitch Synchronous Overlap and Add) method and a harmonic modeling method of the speech signal were used to change pitch in 20 Hz steps. The voices were presented in random order to prevent listener bias. The results were as follows. First, 46.6% judged the male voice at 120 Hz as normal, 19.2% judged 140 Hz as normal, and another 19.2% judged 160 Hz as normal. Second, 50.7% perceived the female voice at 220 Hz as normal, and 32.9% and 30.1% responded to 200 Hz and 240 Hz, respectively. The problems and recommendations for a future investigation are discussed.

Comparative Analysis of Performance of Established Pitch Estimation Methods in Sustained Vowel of Benign Vocal Fold Lesions

장승진;김효민;최성희;박영철;최홍식;윤영로
- Speech Sciences (음성과학), Vol. 14, No. 4, pp. 179-200, 2007
In voice pathology, various measurements calculated from pitch values have been proposed as indicators of voice quality. However, those measurements are frequently inaccurate and unreliable because they are based on incorrect pitch values extracted from pathological voice data. To address this problem, we compared several pitch estimation methods in order to propose a better one for pathological voices. Using a database of 99 pathological and 30 normal voice samples, pitch estimation errors were analyzed and compared between pathological and normal voice data and among the vowels produced by patients with benign vocal fold lesions. The results showed that gross pitch errors occurred in the pathological voice data. When pathological voices were classified by the degree of aperiodicity in the speech signal, pitch errors were closely related to the number of aperiodic segments. The autocorrelation approach was found to be the most robust pitch estimation method for the pathological voice data. Further research on more severely pathological voice data is desirable in order to reduce pitch estimation errors.
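The autocorrelation approach named in the abstract can be sketched as follows. This is a minimal illustration on a synthetic, perfectly periodic tone, not the procedure evaluated in the paper: the function name `autocorr_pitch`, the 60-400 Hz search range, the 8 kHz sample rate and the test tone are all assumptions chosen for the example.

```python
import math


def autocorr_pitch(signal, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    The lag that maximizes the autocorrelation within the allowed pitch
    range is taken as the period. The search range is a hypothetical choice.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = int(sample_rate / f_min)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        value = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


if __name__ == "__main__":
    sample_rate = 8000
    # Synthetic 200 Hz sine wave, 0.1 s long (illustrative input only).
    tone = [math.sin(2 * math.pi * 200.0 * n / sample_rate) for n in range(800)]
    print("estimated F0 (Hz):", autocorr_pitch(tone, sample_rate))
```

On aperiodic pathological voices the autocorrelation peak becomes less distinct, which is consistent with the abstract's observation that pitch errors grow with the number of aperiodic segments.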
