• Title/Summary/Keyword: normal voice

Acoustic Features of Oral Vowels in the Esophagus Speakers (식도음성의 모음종류에 따른 음향학적 특성)

  • Yun, Eunmi;Mok, Eunhee;Minh, Phan huu Ngoc;Hong, Kihwan
    • Phonetics and Speech Sciences / v.7 no.4 / pp.85-92 / 2015
  • This study aimed to characterize the voice and speech of esophageal speakers through fundamental frequency analysis of esophageal phonation. Eight esophageal speakers and 10 control subjects were selected. MDVP (Multi-Dimensional Voice Program, Model 4800, USA, 2001) and Multi-Speech (Model 3700, KayPENTAX, USA, 2008) were used as experimental equipment. The speech samples evaluated were vowels and sentences (both declarative and interrogative). For acoustic analysis, fo, jitter, energy, shimmer, HNR, and the intonation patterns of the speech samples were measured. The results were as follows. First, the fundamental frequency of sustained vowels was lower in the esophageal speech group than in the normal group; the difference was much greater for the high vowel /i/ than for the low vowel /a/. Second, jitter values were higher in the esophageal speech group than in the control group; jitter differed widely between /a/ and /i/ and was highest for /i/. Third, vocal intensity did not differ significantly between the esophageal speech group and the control group. Fourth, shimmer values were higher in the esophageal speech group than in the control group, with a particularly large difference for the low vowel /a/. Fifth, HNR values were significantly lower in the esophageal speech group than in the control group, with the largest between-group difference for the high vowel /i/. Sixth, the pitch contours of interrogative and declarative sentences in the esophageal speech group either differed in shape from those of the normal group or differed from them only slightly, and thus showed no consistent pattern.
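
Jitter and shimmer, as reported in the abstract above, are cycle-to-cycle perturbation measures. The sketch below uses the common "local" definition (mean absolute difference between consecutive cycles, divided by the mean, in percent). This is an assumption for illustration: the exact parameter definitions used by MDVP may differ, and the function name and the sample values are hypothetical, not data from the study.

```python
from statistics import mean


def local_perturbation_percent(values):
    """Mean absolute difference between consecutive values,
    relative to the mean value, expressed in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical cycle-by-cycle measurements (not data from the study).
periods_ms = [10.0, 10.2, 9.9, 10.1, 10.0]   # glottal cycle lengths
amplitudes = [0.80, 0.78, 0.82, 0.79, 0.81]  # peak amplitude per cycle

jitter = local_perturbation_percent(periods_ms)    # period perturbation
shimmer = local_perturbation_percent(amplitudes)   # amplitude perturbation
print(f"jitter  = {jitter:.2f} %")
print(f"shimmer = {shimmer:.2f} %")
```

Applying the same function to periods and to amplitudes shows that the two measures share one structure and differ only in the quantity tracked per cycle.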

Application of Machine Learning on Voice Signals to Classify Body Mass Index - Based on Korean Adults in the Korean Medicine Data Center (머신러닝 기반 음성분석을 통한 체질량지수 분류 예측 - 한국 성인을 중심으로)

  • Kim, Junho;Park, Ki-Hyun;Kim, Ho-Seok;Lee, Siwoo;Kim, Sang-Hyuk
    • Journal of Sasang Constitutional Medicine / v.33 no.4 / pp.1-9 / 2021
  • Objectives: The purpose of this study was to examine whether an individual's Body Mass Index (BMI) class can be predicted by applying machine learning to the voice data collected at the Korean Medicine Data Center (KDC). Methods: We proposed a convolutional neural network (CNN)-based BMI classification model. The subjects were Korean adults in the KDC data who had completed voice recording and BMI measurement between 2006 and 2015. Of these, 2,825 samples were used to train the model and 566 samples were used to assess its performance. Mel-frequency cepstral coefficients (MFCC) extracted from vowel utterances were used as the input features of the CNN. The model was built to predict four groups defined by gender and BMI criteria: overweight male, normal male, overweight female, and normal female. Results & Conclusions: Performance was evaluated using F1-score and accuracy. For the four-group prediction, the average accuracy was 0.6016 and the average F1-score was 0.5922. Although the model performed well in gender discrimination, follow-up studies are needed to improve its ability to distinguish BMI within each gender. As research on deep learning is active, performance improvement is expected through future research.
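
The two metrics named in the abstract above can be computed for a four-class problem as in the sketch below. It assumes that "average F1-score" means the unweighted (macro) average over classes, which the abstract does not state. The label codes and the example predictions are hypothetical and are not taken from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: OM/NM = overweight/normal male, OF/NF = overweight/normal female.
y_true = ["OM", "NM", "OF", "NF", "OM", "NM", "OF", "NF"]
y_pred = ["OM", "OM", "OF", "NF", "NM", "NM", "NF", "NF"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

In this toy example every error stays within one gender, which mirrors the pattern the abstract describes: gender is separated well while BMI within gender is not.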

Comparison of Maximum Phonation Time Associated with the Changes in Vocal Intensity in Patients with Unilateral Vocal Fold Palsy and Sulcus Vocalis (성대마비와 성대구증의 강도 변화에 따른 최대발성지속시간 비교)

  • Choi, Se-Jin;Choi, Hong-Shik;Kim, Jae-Ock;Choi, Yae-Lin
    • Phonetics and Speech Sciences / v.4 no.1 / pp.125-131 / 2012
  • Patients with incomplete glottic closure characteristically show a reduced maximum phonation time (MPT) because their airflow rate, or air leakage, is greater than that of people without voice disorders. They may also have difficulty regulating intensity. This study analyzed differences in MPT between comfortable and louder intensity, and the correlation between MPT and respiratory volume, in groups with unilateral vocal fold palsy (UVFP) and sulcus vocalis (SV). Twenty subjects with UVFP, 21 with SV, and 21 normal subjects performed an /a/ vowel prolongation task at comfortable and louder intensity to measure MPT; FVC, $FEV_1$, and $FEV_1/FVC$ were measured to analyze the correlation between MPT and respiratory volume. First, in the between-group comparison of MPT by intensity, the MPT of the normal group was significantly longer than that of the patient groups at comfortable intensity, but the groups did not differ significantly at louder intensity. Second, there was a statistically significant correlation between MPT at comfortable intensity and MPT at louder intensity, but no statistically significant correlation between intensity and respiratory volume. These results support earlier findings that the shorter MPT of patient groups relative to the normal group stems from a problem in the laryngeal valving mechanism at the level of the vocal folds rather than from a problem of respiratory function. They also suggest that, in the patient groups, MPT improved at louder intensity because the glottal closure ratio increased. These results provide a theoretical basis for clinicians to vary intensity during voice evaluation and voice therapy for patients with glottal incompetence.

Voice Onset Time in Patients with Bilateral Vocal Nodules (양측성 성대결절 환자의 발성시작시간(VOT)에 관한 연구)

  • Park, Sun-Young;Kim, Seong-Tae;Kim, Sang-Yoon;Choi, Seung-Ho;Roh, Jong-Lyel;Nam, Soon-Yuhl
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.17 no.2 / pp.107-110 / 2006
  • Background and Objectives : Few studies have specifically examined voice onset time (VOT) in patients with bilateral vocal nodules. The purpose of this study was to examine the characteristics of voice onset in patients with bilateral vocal nodules. Materials and Methods : 52 female patients with bilateral vocal nodules, aged 20 to 54 years, were examined and compared with 25 normal female controls. All subjects produced five repetitions of the voiceless stops $/p^h,\;t^h,\;k^h/$ in the vowel context /ai_a/. VOT was measured as the time between the release of the stop and the onset of voicing. Results : VOTs of the voiceless stops $/p^h/$ and $/t^h/$ in patients with bilateral vocal nodules were significantly shorter than those of normal subjects. VOT of $/k^h/$ was also shorter in the patients than in normal subjects, but the difference was not significant. These results showed that VOTs of the voiceless stops in patients with bilateral vocal nodules were shorter than those of normal subjects. Conclusion : The rapid onset of voicing in patients with bilateral vocal nodules might be associated with increased laryngeal muscle tension caused by hard glottal attacks. We suggest that VOT can be a clinically useful acoustic parameter for evaluating voice in patients with bilateral vocal nodules.

Mechanism of Vowel Phonation in T-E Shunt Patient using MR Imaging after Total Laryngectomy (후두 전적출술후 MR영상을 이용한 음성재활환자의 발성기전에 관한 연구)

  • Park, Byung-Rae
    • Journal of radiological science and technology / v.20 no.1 / pp.21-27 / 1997
  • Total laryngectomy has become a usual treatment for advanced carcinoma of the larynx, but most patients who have undergone total laryngectomy show permanent disability in voice production. I compared the first three formant frequencies estimated from MRI with those measured directly from the speech data of T-E shunt patients and normal subjects, in order to estimate the accuracy of MRI and to compare the vocal tract shape of normal subjects with that of T-E shunt patients. The results were as follows : 1. The midsagittal section of the MRI represents the vocal tract well during phonation. The vocal tract of the T-E shunt patients lacks the pharyngeal space and the space above the glottis. 2. The length of the normal subject's vocal tract is 17 cm. For the T-E shunt patients, the length from the lips to the shunt opening is 17.5 cm in case 1 and 18.5 cm in case 2. The length of the true resonance chamber is 13 cm and 13.5 cm, respectively. 3. T-E shunt patients produced a strained voice. The intensity of the higher formant frequencies decreased, especially in /o/ and /u/. 4. The vocal tract is shortened during phonation by T-E shunt patients. For /e/ and /i/, the front cavities are constricted while the back cavities are shortened. 5. The pseudoglottis of the T-E shunt patients is located $14{\sim}15\;cm$ below the lips.
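
To give a rough sense of how vocal tract length relates to formant frequencies, the sketch below applies the textbook quarter-wavelength model of a uniform tube closed at one end, $F_n = (2n-1)\,c/(4L)$. This is not the MRI-based estimation used in the study. The speed of sound (350 m/s) and the function name are assumptions, and real vocal tracts are not uniform tubes, so the values are only indicative.

```python
def tube_formants_hz(length_cm, n_formants=3, speed_of_sound_cm_s=35000.0):
    """Resonances of a uniform tube closed at one end and open at the other:
    F_n = (2n - 1) * c / (4 * L)."""
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
        for n in range(1, n_formants + 1)
    ]


# Lengths quoted in the abstract: normal tract, and the resonance chamber in case 1.
for label, length_cm in [("normal, 17 cm", 17.0), ("T-E case 1, 13 cm", 13.0)]:
    formants = tube_formants_hz(length_cm)
    print(label, [round(f) for f in formants])
```

Under this simplified model a shorter resonating tube raises every formant, since each resonance scales with 1/L.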

Acoustic Analysis of Reinke Edema (라인케부종환자의 음성분석)

  • 김상균;최홍식;공석철;홍원표
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.7 no.1 / pp.11-19 / 1996
  • Reinke's edema is a term used to describe varying degrees of chronic swelling of the vocal folds. Acoustic analysis of Reinke's edema has not been reported so far in this country. The purpose of this study is to clarify the acoustic and aerodynamic characteristics of Reinke's edema. Several acoustic evaluations and aerodynamic studies were performed in 20 patients with Reinke's edema, and the data were compared with those of 20 normal controls. Videolaryngoscopy was also performed to grade severity. We used C-Speech, Doctor Speech Science, and a Phonatory Function Analyser. With C-Speech, we compared jitter, shimmer, and SNR (signal-to-noise ratio) between normal subjects and patients with Reinke's edema. With Doctor Speech Science, we compared NNE (glottal noise energy), speech fundamental frequency, and voice quality between the two groups. With the Phonatory Function Analyser, used for aerodynamic function testing, we compared speech intensity, airflow rate, and expiratory pressure between the two groups. In conclusion, patients with Reinke's edema showed lower voice pitch than normal subjects; in addition, jitter, shimmer, SNR, NNE, airflow rate, and expiratory pressure may be meaningful parameters for diagnosis and for treatment prognosis.

Age and Sex Differences in Acoustic Parameter of Middle Age and Elderly Adult Voice (장.노년기 성인 음성의 성별과 연령에 따른 음향음성학적 특성 비교)

  • Lee, Hyo-Jin;Kim, Soo-Jin
    • MALSORI / no.60 / pp.13-28 / 2006
  • This study compared changes in the following acoustic parameters according to age and sex in adulthood: Fo, Jitter, Shimmer, and NHR. One hundred twenty Korean adults were divided into three age groups (20s, 50s, and 70s) and two sex groups (male and female). The subjects performed three tasks: (1) sustaining three vowels; (2) reading a paragraph from 'Taking a Walk'; (3) describing a picture. The data were analyzed using the MDVP of Multi-Speech. For Fo, both sex and age were influential factors. For Jitter, Shimmer, and NHR, the effects of sex and age differed across the three parameters. When the groups organized by sex were analyzed by age, the 20s group differed statistically from the 50s and 70s groups in all four parameters (Fo, Jitter, Shimmer, and NHR). The standard parameters for the normal voice in the Korean elderly need to be reconsidered, because the normal 50s and 70s groups in our study fall outside the current normal range in MDVP.

Voice Changes in Women Treated for Endometriosis (자궁 내막증으로 치료 받은 여성들의 음성 변화)

  • 서민철;주준범;남순열
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.11 no.1 / pp.46-50 / 2000
  • Background and Objectives : Hormonal treatments that have an androgenic effect can cause vocal changes. The changes in vocal fold structure and voice quality are considered to be irreversible. To date, studies have documented subjective vocal changes or single cases without detailed baseline voice assessments. Materials and Methods : We performed objective voice analyses of 20 women who were treated with androgenic hormones for endometriosis and compared the results with those of normal control women. Results : The mean fundamental frequency was $194.7 \pm 28.2$ in the study group and $207.0 \pm 14.1$ in the control group. The mean closed quotient, measured with electroglottography, was $45.13 \pm 2.06$ in the study group and $45.1 \pm 3.03$ in the control group. The results of acoustic analysis were as follows. Mean jitter was $0.95 \pm 0.46$ in the study group and $1.10 \pm 0.65$ in the control group. Mean shimmer was $2.44 \pm 0.60$ in the study group and $2.32 \pm 1.09$ in the control group. The mean noise-to-harmonic ratio was $0.13 \pm 0.028$ in the study group and $0.15 \pm 0.18$ in the control group. Conclusion : Although there were no statistically meaningful differences between the two groups, we detected a masculinizing tendency (lowering of fundamental frequency) associated with the hormones used to treat endometriosis. Given the availability of objective voice assessments today and the continued use of these potent hormones, comprehensive voice assessment and vocal monitoring would appear vital for women commencing hormonal treatment.

Clinical Characteristics of Functional Dysphonia (기능성 발성장애의 임상적 특성)

  • Suh, Woo-Jung;Hong, Young-Hye;Choi, Jong-Min;Jung, Eun-Jung;Sung, Myung-Whun;Kim, Kwang-Hyun;Kwon, Tack-Kyun
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.17 no.2 / pp.127-132 / 2006
  • Background and Objectives : Functional dysphonia is a voice disturbance that occurs in the absence of structural or neurologic laryngeal pathology and is characterized by voluntary misuse of the laryngeal muscles. The present report reviews the clinical characteristics of 25 patients with functional dysphonia. Materials and Method : We analyzed medical records, perceptual and acoustic analyses of voice samples, aerodynamic studies, and laryngoscopy. Results : There was no sex or age predilection. Eighty-four percent of patients presented with sudden onset of symptoms, and 76% had a specific event at onset. Most patients showed a breathy or strained voice and varying degrees of vocal fold insufficiency with supraglottic compensatory contractions. Acoustic analysis was non-diagnostic, but mean flow rate was lower than normal in all cases. All patients responded to voice therapy except for 4 patients who were lost to follow-up. The mean number of voice therapy sessions required to obtain a response was 1.9. Conclusion : We concluded that patients with functional dysphonia respond very well to short-term voice therapy, and that functional dysphonia should be included in the differential diagnosis for patients whose dysphonia cannot be explained by a structural or neurologic etiology.

Optimization of the packet size to enhance the voice quality of the VOIP system (VOIP 음질 개선을 위한 패킷 크기의 최적화)

  • 임강빈;정기현;최경희
    • Journal of the Institute of Electronics Engineers of Korea TC / v.40 no.9 / pp.373-383 / 2003
  • In this paper we discuss how the delay limit and the packet size affect the quality of service of a VoIP system running over the Internet. We also provide a guideline for determining the optimal packet size of the voice data for a given delay limit. Empirical studies were carried out with two personal computers connected through the packet-switched public IP network. The sender encodes the voice signal from the microphone into PCM and ADPCM data and sends the data to the receiver in UDP packets. The receiver plays the voice reconstructed from the stream, which includes lost and delayed packets. The quality of the reconstructed voice is evaluated offline by the MNB (Measuring Normalizing Blocks) method using the data acquired from both sides. The results show that, under a delay limit of 100 ms, the minimum packet size for the best voice quality should be 300 bytes, 400 bytes, and 600 bytes for 40 kbps, 32 kbps, and 16 kbps ADPCM data respectively, and the maximum packet size should be 1200 bytes in all three cases.
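
As a small worked calculation on the figures above, the sketch below converts a payload size and codec bitrate into the amount of audio one packet carries. It assumes "kbps" means 1000 bit/s and that the quoted sizes are pure voice payload with no header overhead; neither is stated in the abstract. The function name is hypothetical, and this is not the paper's delay model, so the durations should not be read as the paper's "delay limit".

```python
def payload_duration_ms(packet_bytes, bitrate_bps):
    """Milliseconds of audio carried by one packet of the given payload size."""
    return packet_bytes * 8 * 1000.0 / bitrate_bps


# (ADPCM bitrate in bit/s, minimum packet size in bytes) as quoted in the abstract.
cases = [(40_000, 300), (32_000, 400), (16_000, 600)]

for bitrate_bps, packet_bytes in cases:
    duration = payload_duration_ms(packet_bytes, bitrate_bps)
    print(f"{bitrate_bps // 1000} kbps, {packet_bytes} bytes: {duration:.0f} ms of audio")
```

The calculation makes explicit that, at a fixed payload size, a lower bitrate packs more audio time into each packet, so a lost packet removes a longer stretch of speech.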