• Title/Summary/Keyword: voice index


Impact of face masks on spectral and cepstral measures of speech: A case study of two Korean voice actors (한국어 스펙트럼과 캡스트럼 측정시 안면마스크의 영향: 남녀 성우 2인 사례 연구)

  • Wonyoung Yang;Miji Kwon
    • The Journal of the Acoustical Society of Korea / v.43 no.4 / pp.422-435 / 2024
  • This study aimed to verify the effects of face masks on spoken Korean in terms of acoustic, aerodynamic, and formant parameters. We chose all types of face masks available in Korea based on filter performance and folding type. Two professional voice actors (one male and one female), each with more than 20 years of experience, who are native Koreans and speak standard Korean, provided the voice data. Face masks attenuated the high-frequency range, resulting in decreased Vowel Space Area (VSA) and Vowel Articulation Index (VAI) scores and an increased Low-to-High spectral ratio (L/H ratio) in all voice samples. This can result in lower speech intelligibility. However, the size of these increases and decreases depended on the voice characteristics. For the female speaker, the Speech Level (SL) and Cepstral Peak Prominence (CPP) increased with increasing face mask thickness. In this study, the presence or filter performance of a face mask was found to affect speech acoustic parameters according to the speech characteristics. Face masks provoked vocal effort when the vocal intensity was not sufficiently strong or the environment was less reverberant. Further research is needed on the vocal effort induced by face masks to overcome acoustic modifications when wearing masks.
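
The abstract does not say how VSA and VAI were computed. As a minimal sketch, assuming the commonly used definitions (triangular VSA from the corner vowels /i/, /a/, /u/ via the shoelace formula, and VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/)), the two scores could be calculated as below. The function names and all formant values are hypothetical and are not data from the study.

```python
def vowel_space_area(i, a, u):
    """Triangular VSA in Hz^2 from (F1, F2) pairs of /i/, /a/, /u/ (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = i, a, u
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


def vowel_articulation_index(i, a, u):
    """VAI = (F2/i/ + F1/a/) / (F1/i/ + F1/u/ + F2/u/ + F2/a/)."""
    return (i[1] + a[0]) / (i[0] + u[0] + u[1] + a[1])


# Hypothetical (F1, F2) values in Hz, chosen only to illustrate a decrease.
unmasked = {"i": (300.0, 2300.0), "a": (800.0, 1300.0), "u": (350.0, 800.0)}
masked = {"i": (340.0, 2100.0), "a": (740.0, 1300.0), "u": (380.0, 900.0)}

vsa_unmasked = vowel_space_area(unmasked["i"], unmasked["a"], unmasked["u"])
vsa_masked = vowel_space_area(masked["i"], masked["a"], masked["u"])
vai_unmasked = vowel_articulation_index(unmasked["i"], unmasked["a"], unmasked["u"])
vai_masked = vowel_articulation_index(masked["i"], masked["a"], masked["u"])

print(vsa_unmasked, vsa_masked)
print(round(vai_unmasked, 3), round(vai_masked, 3))
```

Both scores shrink when the corner vowels move toward the center of the formant plane, which is the direction of change the abstract reports for masked speech.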

Dysphonia : Vocal Fold Mucosal Lesions Easily Missed in Laryngoscopy (발성장애: 후두내시경 검사에서 놓치기 쉬운 성대점막질환)

  • Kim, Han-Su
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.21 no.1 / pp.17-21 / 2010
  • Dysphonia is a medical term for voice disorders characterized by hoarseness, harshness, weakness, or even loss of voice; it covers any impairment in the ability to produce voice sounds using the vocal organs (larynx). The causes of dysphonia can be classified into two groups, organic and functional. Functional dysphonia includes spasmodic dysphonia, muscle tension dysphonia, mutational dysphonia, and conversion dysphonia, among others. Laryngoscopic findings in these types of dysphonia are almost normal. Therefore, physicians should diagnose these diseases through careful history taking and a thorough understanding of the phonation pattern. Organic dysphonia is caused by anatomical problems in the larynx, especially on the vocal fold. Some lesions, however, are not easily found because they are too small or are located on the lower lip of the vibrating vocal fold. Laryngopharyngeal reflux-induced laryngitis, vascular lesions, sulcus vocalis, vocal atrophy including presbylaryngis, and mucosal tears are common lesions easily missed in laryngoscopy. Therefore, a high index of suspicion is necessary to avoid missing vocal fold mucosal lesions, and strobovideolaryngoscopy is indispensable in making the diagnosis.


The Effect of Frequency and Intensity of /a/ Phonation on the Result of Acoustic Analysis (발성시 음도 및 강도의 변화가 음성분석검사 결과에 미치는 영향)

  • 손영익;윤영선;권중근;추광철
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.8 no.1 / pp.12-17 / 1997
  • Measuring phonatory stability using MDVP (Multi-Dimensional Voice Program, Kay Elemetrics Corp., NJ, USA) is becoming popular in many Korean clinics and laboratories, yet questions about standardization and reference values remain. The purpose of the present study was to examine the effects of frequency and intensity variation on the results of acoustic analysis related to phonatory stability. Twenty young adults (ten females and ten males) were asked to sustain the vowel /a/ for more than 3 seconds under 9 different pitch and loudness conditions. Using MDVP, the nine voice samples were analyzed, and jitter percent, fundamental frequency variation, shimmer percent, peak amplitude variation, noise-to-harmonic ratio, amplitude tremor intensity index, and degree of subharmonics were compared. The results showed that intensity changes can significantly affect various phonatory stability measures, and that the lowest perturbation values are obtained from phonation slightly louder (10 dB) than the comfortable level.

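
To make two of the perturbation measures named in the abstract concrete, the sketch below follows the textbook definitions of jitter percent and shimmer percent: the mean absolute difference between consecutive cycles, divided by the mean value, times 100. MDVP's own implementation is not described in the abstract and may differ in detail. The function name and the cycle values are hypothetical.

```python
def perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean value, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


# Hypothetical per-cycle measurements from a sustained /a/ (not data from the study).
periods_ms = [5.00, 5.02, 4.98, 5.01, 4.99]   # pitch periods -> jitter
amplitudes = [1.00, 0.97, 1.02, 0.99, 1.01]   # peak amplitudes -> shimmer

jitter_percent = perturbation_percent(periods_ms)
shimmer_percent = perturbation_percent(amplitudes)
print(round(jitter_percent, 3), round(shimmer_percent, 3))
```

Because both measures are normalized by the mean period or amplitude, comparing them across pitch and loudness conditions, as the study does, is meaningful even though the absolute values differ between conditions.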

Optimization of State-Based Real-Time Speech Endpoint Detection Algorithm (상태변수 기반의 실시간 음성검출 알고리즘의 최적화)

  • Kim, Su-Hwan;Lee, Young-Jae;Kim, Young-Il;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.2 no.4 / pp.137-143 / 2010
  • In this paper, a speech endpoint detection algorithm is proposed. The proposed algorithm is a state-transition-based speech detector. To reject short-duration acoustic pulses that can be considered noise, it uses the duration information of all detected pulses. To optimize the parameters related to pulse lengths and the energy threshold used to detect speech intervals, an exhaustive search scheme is adopted, with the speech recognition rate as the performance index. Experimental results show that the proposed algorithm outperforms the baseline state-based endpoint detection algorithm. At 5 dB input SNR for the beamforming input, the word recognition accuracies of its outputs were 78.5% for human voice noise and 81.1% for music noise.

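
The abstract gives no algorithmic detail, so the following is only an offline sketch of the general idea, not the authors' real-time algorithm: frames above an energy threshold form pulses, pulses shorter than a minimum duration are rejected as noise, nearby pulses are merged, and the parameters are chosen by exhaustive search. Frame-level accuracy against reference labels stands in for the speech recognition rate used in the paper. All names, parameters, and data are hypothetical.

```python
from itertools import product


def detect_speech(energies, threshold, min_pulse_frames, max_gap_frames):
    """Return speech segments as (start, end) frame indices, end exclusive."""
    # Two-state scan (silence / pulse) over the frame energies.
    pulses, start = [], None
    for idx, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = idx                  # silence -> pulse
        elif start is not None:
            pulses.append((start, idx))      # pulse -> silence
            start = None
    if start is not None:
        pulses.append((start, len(energies)))

    # Reject short-duration pulses, which are treated as noise.
    pulses = [(s, e) for s, e in pulses if e - s >= min_pulse_frames]

    # Merge pulses separated by a short gap into one speech segment.
    segments = []
    for s, e in pulses:
        if segments and s - segments[-1][1] <= max_gap_frames:
            segments[-1] = (segments[-1][0], e)
        else:
            segments.append((s, e))
    return segments


def frame_accuracy(segments, reference):
    """Fraction of frames whose speech/non-speech label matches the reference."""
    predicted = [False] * len(reference)
    for s, e in segments:
        for k in range(s, e):
            predicted[k] = True
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def exhaustive_search(energies, reference, thresholds, min_pulses, max_gaps):
    """Try every parameter combination and keep the best-scoring one."""
    best = None
    for params in product(thresholds, min_pulses, max_gaps):
        score = frame_accuracy(detect_speech(energies, *params), reference)
        if best is None or score > best[0]:
            best = (score, params)
    return best


# Hypothetical frame energies: one noise click (frame 2), then speech (frames 7-12).
energies = [0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1, 0.8, 0.9, 0.7, 0.1, 0.8, 0.9, 0.1, 0.1]
reference = [7 <= k <= 12 for k in range(len(energies))]

segments = detect_speech(energies, threshold=0.5, min_pulse_frames=2, max_gap_frames=1)
best = exhaustive_search(energies, reference, [0.5], [1, 2], [0, 1])
print(segments, best)
```

In this toy input the one-frame click is dropped by the duration rule and the one-frame dip inside the utterance is bridged by the gap rule, so the search selects the combination that applies both.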

Indexing and Retrieval of Human Individuals on Video Data Using Face and Speaker Recognition

  • Y.Sugiyama;N.Ishikawa;M.Nishida;Y.Ariki
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1998.06b / pp.122-127 / 1998
  • In this paper, we focus on retrieving information about human individuals recorded in a video database. Our purpose is to index persons by their faces or voices and to retrieve the time sections in which they appear in the video data. The database system can track as well as extract the face or voice of a certain person and construct a model of that individual in a self-organizing mode. If the person appears again at a different time, the system can attach the same person label to the associated frames. In this way, the same person can be retrieved even if the system does not know the person's exact name. For face and speaker modeling, a subspace method is employed to improve the indexing accuracy.


Comparison of Vowel and Text-Based Cepstral Analysis in Dysphonia Evaluation (발성장애 평가 시 /a/ 모음연장발성 및 문장검사의 켑스트럼 분석 비교)

  • Kim, Tae Hwan;Choi, Jeong Im;Lee, Sang Hyuk;Jin, Sung Min
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.26 no.2 / pp.117-121 / 2015
  • Background: Cepstral analysis, which is obtained by Fourier transformation of the spectrum, is known to be an effective indicator for analyzing voice disorders. Voice disorders have been evaluated using either phonation of a sustained vowel /a/ or continuous speech, but the former is limited in capturing hoarseness properly. This study aimed to compare the effectiveness of cepstral analysis between the sustained vowel /a/ and continuous speech. Methods: From March 2012 to December 2014, a total of 72 patients were enrolled in this study: 24 patients each with unilateral vocal cord palsy, vocal nodules, and vocal polyps. All patients rated their voice quality with the VHI (Voice Handicap Index) before and after treatment. Samples of the sustained vowel /a/ and of continuous speech (the first sentence of the autumn paragraph) were subjected to cepstral analysis, and pre-treatment and post-treatment values were compared. Results: Pre- and post-treatment values of CPP-a (cepstral peak prominence in the /a/ vowel) were 13.80 and 13.91 in vocal cord palsy, 16.62 and 17.99 in vocal cord nodule, and 14.19 and 18.50 in vocal cord polyp, respectively. Pre- and post-treatment values of CPP-s (cepstral peak prominence in text-based speech) were 11.11 and 12.09 in vocal cord palsy, 12.11 and 14.09 in vocal cord nodule, and 12.63 and 14.17 in vocal cord polyp. All 72 patients showed subjective improvement in VHI after treatment. CPP-a showed statistically significant improvement only in the vocal polyp group, whereas CPP-s showed statistically significant improvement in all three groups (p<0.05). Conclusion: In cepstral analysis, text-based analysis is more representative of voice disorder than sustained-vowel analysis. Therefore, when the voice is analyzed acoustically by cepstrum, both sustained vowel /a/ phonation and text-based speech should be assessed to obtain a more accurate result.

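
As a rough single-frame illustration of cepstral peak prominence, the sketch below takes the Fourier transform of the log spectrum, finds the cepstral peak in the quefrency range of plausible pitch periods, and measures its height above a regression line fitted to the cepstrum. Clinical software typically adds smoothing and averages over many frames, and the abstract does not state which program or settings were used, so the search range (60 to 330 Hz), the 1 ms regression start, and the synthetic test signals are all assumptions.

```python
import cmath
import math
import random


def dft(values):
    """Naive O(n^2) discrete Fourier transform, adequate for one short frame."""
    n = len(values)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    return [
        sum(v * twiddle[(k * t) % n] for t, v in enumerate(values))
        for k in range(n)
    ]


def to_db(spectrum):
    """Magnitude in dB, with a small floor to avoid log(0)."""
    return [20.0 * math.log10(abs(c) + 1e-12) for c in spectrum]


def cepstral_peak_prominence(frame, sample_rate, f0_min=60.0, f0_max=330.0):
    """CPP in dB: cepstral peak height above a regression line at the peak quefrency."""
    n = len(frame)
    # Hann window to limit spectral leakage.
    windowed = [
        x * (0.5 - 0.5 * math.cos(2.0 * math.pi * t / (n - 1)))
        for t, x in enumerate(frame)
    ]
    # Cepstrum: Fourier transform of the log-magnitude spectrum.
    cepstrum_db = to_db(dft(to_db(dft(windowed))))

    # Search for the peak among quefrencies (in samples) of plausible pitch periods.
    lo = int(sample_rate / f0_max)
    hi = min(int(sample_rate / f0_min), n // 2 - 1)
    peak = max(range(lo, hi + 1), key=lambda q: cepstrum_db[q])

    # Least-squares line through the cepstrum from 1 ms up to the longest period searched.
    qs = list(range(max(1, int(0.001 * sample_rate)), hi + 1))
    mean_q = sum(qs) / len(qs)
    mean_c = sum(cepstrum_db[q] for q in qs) / len(qs)
    slope = sum((q - mean_q) * (cepstrum_db[q] - mean_c) for q in qs) / sum(
        (q - mean_q) ** 2 for q in qs
    )
    baseline = mean_c + slope * (peak - mean_q)
    return cepstrum_db[peak] - baseline


# Synthetic, hypothetical signals: a 125 Hz harmonic tone versus white noise.
random.seed(0)
SAMPLE_RATE = 8000
N = 512
voiced = [
    sum(math.sin(2.0 * math.pi * 125.0 * h * t / SAMPLE_RATE) / h for h in range(1, 32))
    + 0.01 * random.gauss(0.0, 1.0)
    for t in range(N)
]
noise = [random.gauss(0.0, 1.0) for _ in range(N)]

cpp_voiced = cepstral_peak_prominence(voiced, SAMPLE_RATE)
cpp_noise = cepstral_peak_prominence(noise, SAMPLE_RATE)
print(cpp_voiced > cpp_noise)
```

A strongly periodic signal produces a regular harmonic pattern in the log spectrum and hence a prominent cepstral peak, while an aperiodic signal does not; this is why CPP is used as an index of dysphonia severity in both sustained vowels and connected speech.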

Analysis of Pre and Post-Operative Speech In Combined Operation of Type I Thyroplasty and Arytenoid Adduction for Unilateral Vocal Cord Palsy (편측성대마비에 대한 제 1형 갑상성형술과 피열연골내전술의 동시수술시 술전 및 술후 음성언어분석비교)

  • 최홍식;정유삼;김성국;김영호;김광문
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.9 no.1 / pp.66-70 / 1998
  • Background and Objectives: The management of unilateral vocal cord palsy includes type Ⅰ thyroplasty and arytenoid adduction. Either operation performed alone has not shown a satisfactory effect. We evaluated the preoperative and postoperative speech of unilateral vocal cord palsy patients who underwent a combined operation of type Ⅰ thyroplasty and arytenoid adduction, to help plan the management of such patients. Materials and Methods: We retrospectively reviewed the postoperative results and complications of 17 patients with unilateral vocal cord palsy who were surgically treated at Severance Hospital from Nov. 1996 to Dec. 1997. They underwent a combined operation of type Ⅰ thyroplasty and arytenoid adduction. Their pre- and post-operative speech was analyzed with the MDVP (Multi-Dimensional Voice Program) of the CSL (Computerized Speech Lab). Results: After the operation, MPT (Maximal Phonation Time) increased and MFR (Mean Flow Rate) decreased in all patients. NHR (Noise to Harmonic Ratio) and VTI (Voice Turbulence Index) decreased; Jitter, RAP (Relative Average Perturbation Quotient), PPQ (Pitch Period Perturbation Quotient), sPPQ (Smoothed Pitch Period Perturbation Quotient), and vFo (Fundamental Frequency Variation) decreased; and Shimmer, APQ (Amplitude Perturbation Quotient), sAPQ (Smoothed Amplitude Perturbation Quotient), and vAm (Peak Amplitude Variation) decreased in all patients. Conclusions: In unilateral vocal cord palsy, a combined operation of type Ⅰ thyroplasty and arytenoid adduction can achieve a satisfactory postoperative voice. MDVP offers many parameters and is a good method for evaluating voice surgery.


Study for Extraction of Stable Vocal Features and Definition of the Features (음성의 안정적 변수 추출 및 변수의 의미 연구)

  • Kim, Keun-Ho;Kim, Sang-Gil;Kang, Nam-Sik;Kim, Jong-Yeol
    • Korean Journal of Oriental Medicine / v.17 no.3 / pp.97-104 / 2011
  • Objectives: In this paper, we propose a method for selecting reliable variables from various vocal features, such as frequency derivative features, frequency band ratios, intensities of 5 vowels, and the intensity of a sentence, since some features are sensitive to variation in a subject's utterance. Methods: To obtain reliable voice variables, the coefficient of variation (CV) was used as the index of reliability. Since the distributions of a few features are not Gaussian but are instead skewed to the right or left, we transformed those features by taking the log or square root. Moreover, the definitions of the variables suitable for representing vocal properties are explained and analyzed. Results: First, we recorded the vowels and the sentence five times in both the morning and the afternoon of the same day, for a total of ten recordings from each of six subjects (three males and three females). We then analyzed the CVs of each subject's voice to obtain stable features with sufficient repeatability. Features with CVs below 20% for all six subjects were selected. As a result, 92 stable variables were extracted from the 222 features, including all the transformed variables. Conclusions: Voice can be widely used to classify the four constitution types and to recognize one's health condition by extracting meaningful features as physical quantities in traditional Korean medicine or Western medicine. Therefore, stable voice variables can be useful in u-Healthcare systems for personalized medicine and for improving diagnostic accuracy.
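
The selection rule described in the abstract (keep a feature only if its CV over repeated recordings is below 20% for every subject) can be sketched as follows. The data layout, function names, feature names, and values are hypothetical, and the log or square-root transformation step is omitted for brevity.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the absolute mean."""
    return 100.0 * statistics.stdev(values) / abs(statistics.mean(values))


def stable_features(recordings, max_cv=20.0):
    """Return features whose CV is below max_cv for every subject.

    recordings maps subject -> feature name -> values over repeated recordings.
    """
    common = set.intersection(*(set(features) for features in recordings.values()))
    return sorted(
        name
        for name in common
        if all(
            coefficient_of_variation(features[name]) < max_cv
            for features in recordings.values()
        )
    )


# Hypothetical repeated measurements for two subjects (not data from the study).
recordings = {
    "subject_1": {
        "f0_mean": [120.0, 122.0, 119.0, 121.0, 120.0],
        "band_ratio": [0.5, 1.2, 0.3, 0.9, 1.6],      # varies widely for this subject
    },
    "subject_2": {
        "f0_mean": [210.0, 205.0, 212.0, 208.0, 211.0],
        "band_ratio": [0.80, 0.82, 0.79, 0.81, 0.80],
    },
}

selected = stable_features(recordings)
print(selected)
```

Requiring the threshold to hold for all subjects, rather than on average, means a feature that is unstable for even one speaker is discarded, which matches the "for all six subjects" criterion in the abstract.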

Case of Adductor Spasmodic Dysphonia Patient Complaining of Voice Tremor and Hoarseness Treated with Combined Korean Medical Therapies (음성 떨림과 애성을 호소하는 내전형 연축성 발성장애 환자에 대한 복합 한의치험 1례)

  • Seong-Wook Lee;So-Min Jung;Han-Gyul Lee;Ki-Ho Cho;Sang-Kwan Moon;Woo-Sang Jung;Seungwon Kwon
    • The Journal of Internal Korean Medicine / v.44 no.2 / pp.158-166 / 2023
  • Background: Adductor spasmodic dysphonia (ASD) is caused by the involuntary contraction of laryngeal muscles due to dystonia localized to the larynx. Conventional treatment of ASD relies mainly on botulinum toxin injection. However, the effect of botulinum toxin injection is short-lasting, so repeated injections are required. Alternatives are needed because of concerns over adverse effects of the injection, such as general weakness and airway aspiration. Case report: A 46-year-old female patient with ASD complained of voice tremor and hoarseness. The combined Korean medical treatments, namely Ukgan-san-gami, Jakyakgamcho-tang, acupuncture, and transcutaneous electrical nerve stimulation (TENS), were administered on the first day the patient was hospitalized. The Voice Handicap Index (VHI) was evaluated during the treatment. The VHI taken on the second day totaled 92 points. On the ninth day, 81 points were recorded. The total score gradually improved, and on the 16th day, 62 points were recorded. Combined Korean medical treatment lasted 19 days. Conclusion: The present case report suggests that a combined Korean medical treatment approach with Ukgan-san-gami, Jakyakgamcho-tang, acupuncture, and TENS might be effective for symptoms such as voice tremor and hoarseness. Combined Korean medical treatment can be a therapeutic option for patients with ASD.

The Phonetic Characteristics and Voice Handicap Index in Allergic Rhinitis Patients (알레르기성 비염 환자들의 음향음성학적 특성 및 음성장애지수)

  • Kim, Seong-Tae;Choi, Seung-Ho;Roh, Jong-Lyel;Lee, Bong-Jae;Shim, Mi-Ran;Kim, Sang-Yoon;Nam, Soon-Yuhl
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.1 / pp.39-43 / 2007
  • Background and Objectives: Few studies have specifically examined the phonetic characteristics and Voice Handicap Index (VHI) of patients with allergic rhinitis. This study was designed to examine phonetic characteristics and VHI in adult patients with allergic rhinitis. Materials and Methods: Forty-two male patients aged 20 to 56 years, diagnosed with allergic rhinitis by skin-prick and other tests, were compared with a control group of 16 males in the same age group with no pathology. The VHI was used to measure changes in the patients' perception. Acoustic and aerodynamic analyses were done, and a nasalance test was performed using the rabbit, baby, and mother passages. Acoustic rhinometry (AR) was performed to evaluate nasal volume and nasal cross-sectional area. Statistical analysis was done using the independent-samples t-test. Results: The VHI score was significantly higher in the patient group than in the control group. The AR graph showed no significant differences in nasal volume or nasal cross-sectional area. The Shimmer and SFF values in the group of allergic patients were higher than in the control group. The MPT value in the group of allergic patients was lower than in the control group. Nasalance in allergic patients showed hypernasality in all passages. Conclusion: We suggest that patients with allergic rhinitis have considerable voice problems. Most of them have hypernasality, which may be a compensatory mechanism for nasal obstruction.
