• Title/Summary/Keyword: voice quality parameters

A Study on the Usability Evaluation of Earcon Applied to Voice Menu (이어콘을 적용한 음성 메뉴의 사용성 평가에 관한 연구)

  • Lim, Chee-Hwan;Lee, Jae-In;Lee, Sung-Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.28 no.4 / pp.55-62 / 2005
  • This paper describes an experiment that investigated the design of earcons applied to a voice menu and evaluated their usability. Earcons provide navigational cues in a hierarchical menu and can also be applied to a voice menu to improve recall rate and response time. In this experiment, participants identified their location with the help of earcons applied to the voice menu under various earcon parameters. Specifically, some participants listened to a good-quality voice menu with structured earcons and then recalled the location they had heard. Other participants listened to a good-quality voice menu without earcons and recalled the location in the same manner. Response times were measured from their answers. The results showed that applying earcons to the voice menu increased the recall rate for what participants heard during the experiment; that is, task performance was better with earcons than without. With earcons, recall accuracy was 92.50%, whereas without earcons participants recalled only 66.25% of the given questions. Response time fell from 4.98 sec to 3.85 sec. In addition, 87.5% of participants preferred the voice menu with earcons.

Voice Analysis and Treatment Result According to Configuration of Sulcus Vocalis (성대구증의 형태에 따른 음향학적 분석 및 치료 결과)

  • Yang, Ho Cherl;Jeong, Byoung Seo;Kim, Dong Young;Woo, Joo Hyun
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.23 no.2 / pp.119-123 / 2012
  • Background and Objectives: Sulcus vocalis can be classified into type I, type IIa, and type IIb. There have been few reports on voice quality and treatment results in relation to the type of sulcus vocalis. The authors analyzed voice and treatment outcomes according to the type of sulcus vocalis. Materials and Methods: This study was based on a retrospective chart review. The sulcus types were classified into type I and type II. Objective and subjective voice assessments were analyzed. Patients were treated individually with voice therapy, percutaneous steroid injection, and injection laryngoplasty. Voice was compared between the type I and type II groups, and between pre-treatment and post-treatment within each type. Results: One hundred and one patients were enrolled in this study; 49 patients were type I and 52 were type II. The type I group showed a longer mean maximal phonation time (MPT) than the type II group, although the other voice parameters did not differ between the two groups. Even after management, almost none of the voice parameters improved, except MPT in the type II group. Conclusion: Although type I sulcus has been regarded as a non-pathologic lesion, it can result in some degree of voice change and discomfort, and thus needs active management. In this study, voice therapy, percutaneous steroid injection, and injection laryngoplasty showed limited effect on both types of sulcus vocalis. Further studies on the management of sulcus vocalis are needed.

Effects of Lax Vox voice therapy in a patient with spasmodic dysphonia: A case report (연축성 발성장애 환자의 Lax Vox 음성치료 효과)

  • Lim, Hye Jin;Choi, Seong Hee;Kim, Jeong Kyu;Choi, Chul-Hee
    • Phonetics and Speech Sciences / v.8 no.2 / pp.57-63 / 2016
  • Recently, Lax Vox voice therapy has been used as one of the SOVTE (Semi-Occluded Vocal Tract Exercise) techniques. The purpose of this study was to explore the effect of Lax Vox voice therapy on voice improvement in a patient with spasmodic dysphonia. One female patient with spasmodic dysphonia (age 27), diagnosed by a laryngologist, received Lax Vox voice therapy. In this study, the Lax Vox protocol consisted of 5 steps (1 warm-up followed by 4 steps: bubbling without phonation, bubbling with phonation, gliding with phonation, and generalization). A total of 11 sessions were conducted by a certified speech-language pathologist. Acoustic, aerodynamic, auditory-perceptual, and patient self-rating measures were compared across pre-, mid-, and post-therapy. All objective and subjective parameters improved after voice therapy: reduced frequency variation, increased maximum phonation time, an enlarged voice range, improved 'G' and 'S' in GRBAS and USDRS, and reduced VHI were observed. In particular, decreased $f_0$ and markedly reduced voice tremor were also demonstrated following Lax Vox voice therapy. Accordingly, the Lax Vox voice therapy technique can be useful for improving voice and quality of life in patients with spasmodic dysphonia.

The Effect of Noise on the Normal and Pathological Voice (소음환경이 정상 및 병적음성에 미치는 영향)

  • Hong, Ki-Hwan;Yang, Yoon-Soo;Kim, Hyun-Gi
    • Speech Sciences / v.9 no.4 / pp.27-38 / 2002
  • The purpose of this article is to present the acoustic parameters (VOT, jitter, shimmer, vF0, vAm, NHR, SPI, VTI, DVB, DSH) for consonants (/pipi/, /$p^{h}ip^{h}i$/, /p'ip'i/) and sustained vowels (/a/, /e/, /i/) produced by normal subjects and dysphonia patients at two levels of vocal effort (normal and high), induced via the Lombard effect using 60 dB white noise. The Lombard effect refers to the increase in vocal effort in noisy conditions. At normal vocal effort, the acoustic parameter values of patients are generally greater than those of normal subjects. In noise, the acoustic values decrease significantly in normal subjects compared with dysphonia patients. The clinical implication of this finding is that vocal quality in dysphonia is not compensated by vocal effort as well as it is in normal subjects, because of the inefficiency caused by abnormal vocal fold appearance and function. Based on this result, patients can be counseled that voice quality may not improve as much as they expect.

Performance Comparison of AMR Codec Mode Allocations in Downlink WCDMA System (순방향 WCDMA 채널에서 AMR 음성 코덱 모드 할당방식에 대한 성능 비교)

  • Jeong, S.H.;Hong, J.W.;Lee, S.C.;Lie, C.H.
    • Journal of Korean Institute of Industrial Engineers / v.31 no.4 / pp.349-357 / 2005
  • The Adaptive Multi-Rate (AMR) speech codec is mandatory for voice service in WCDMA systems. The AMR codec can be used efficiently to provide a balanced trade-off between capacity and voice quality by adjusting the service rate. In this paper, three AMR mode allocation schemes on the downlink of a WCDMA system are evaluated. To evaluate user satisfaction efficiently, a new system performance measure and analytic models are proposed. The proposed analytic models can be applied to obtain optimal mode allocations while considering both system capacity and voice quality. Numerical examples illustrate how to find optimal parameters for given traffic loads and compare the performance of the three mode allocation schemes.

Spectral and Cepstral Analyses of Esophageal Speakers (식도발성화자 음성의 spectral & cepstral 분석)

  • Shim, Hee-Jeong;Jang, Hyo-Ryung;Shin, Hee-Baek;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.6 no.2 / pp.47-54 / 2014
  • The purpose of this study was to analyze spectral versus cepstral measurements in esophageal speakers. Measurements from thirteen male esophageal speakers were compared with those of a control group of thirteen normal speakers using the sustained vowel /a/. The main results can be summarized as follows: (a) the CPP and L/H ratio of the esophageal group were significantly lower than those of the control group; (b) the CPP was significantly correlated with spectral parameters such as jitter, shimmer, NHR, and VTI; and (c) the ROC analysis showed that a CPP threshold of 10.25 dB classified esophageal speakers well, with 100% sensitivity and specificity. Thus, cepstral-based acoustic measures such as CPP may be more reliable predictors than spectral-based acoustic measures such as jitter and shimmer. Cepstral-based acoustic measures were also found to be effective in distinguishing esophageal voice quality from normal voice quality. This research will contribute to establishing a baseline for speech characteristics in the voice rehabilitation of laryngectomees.

Quantitative Analysis of Voice Quality after Radiation Therapy for Stage T1a Glottic Carcinoma (T1a 병기 성문암의 방사선 치료 후 음성에 관한 연구)

  • Lee Joon-Kyoo;Chung Woong-Gi
    • Radiation Oncology Journal / v.23 no.1 / pp.17-21 / 2005
  • Purpose: To evaluate the voices of irradiated patients with early glottic carcinoma and to compare them with the voices of healthy volunteers. Materials and Methods: Voice samples (sustained vowel) of seventeen male patients who had been irradiated for T1a glottic squamous carcinoma at least 1 year prior to the study were analyzed with an objective voice analyzer (acoustic voice analysis, aerodynamic test, and videostroboscopic analysis) and compared with those of a normal group of twenty age- and sex-matched volunteers. Average fundamental frequency, jitter, shimmer, and noise-to-harmonic ratio were obtained for the acoustic voice analysis. Maximal phonation time, mean flow rate, intensity, subglottic pressure, glottal resistance, glottal efficiency, and glottal power were obtained for the aerodynamic test. Results: The irradiated group presented higher shimmer values in the acoustic voice analysis. There was no significant difference between the two groups in the other parameters. Conclusion: In this study, all the objective voice parameters except shimmer were not significantly different between the irradiated group and the control group. These results suggest that voice quality is minimally affected by radiation therapy for T1a glottic carcinoma.

Performance Comparison of Automatic Detection of Laryngeal Diseases by Voice (후두질환 음성의 자동 식별 성능 비교)

  • Kang Hyun Min;Kim Soo Mi;Kim Yoo Shin;Kim Hyung Soon;Jo Cheol-Woo;Yang Byunggon;Wang Soo-Geun
    • MALSORI / no.45 / pp.35-45 / 2003
  • Laryngeal diseases cause significant changes in the quality of speech production. Automatic detection of laryngeal diseases by voice is attractive because of its nonintrusive nature. In this paper, we apply speech recognition techniques to the detection of laryngeal cancer and investigate which feature parameters and classification methods are appropriate for this purpose. Linear Predictive Cepstral Coefficients (LPCC) and Mel-Frequency Cepstral Coefficients (MFCC) are examined as feature parameters, and parameters reflecting the periodicity of speech and its perturbation are also considered. As classifiers, multilayer perceptron neural networks and Gaussian Mixture Models (GMM) are employed. According to our experiments, higher-order LPCC combined with the periodicity parameters yields the best performance.

The Utility of Perturbation, Non-linear dynamic, and Cepstrum measures of dysphonia according to Signal Typing (음성 신호 분류에 따른 장애 음성의 변동률 분석, 비선형 동적 분석, 캡스트럼 분석의 유용성)

  • Choi, Seong Hee;Choi, Chul-Hee
    • Phonetics and Speech Sciences / v.6 no.3 / pp.63-72 / 2014
  • The current study assessed the utility of the acoustic analyses most commonly used in routine clinical voice assessment, including perturbation analysis, nonlinear dynamic analysis, and spectral/cepstral analysis, based on signal typing of dysphonic voices, and investigated the applicability of these clinical acoustic analysis methods. A total of 70 dysphonic voice samples were classified by signal type using narrowband spectrograms. The traditional parameters %jitter, %shimmer, and signal-to-noise ratio (SNR) were calculated using TF32, along with the correlation dimension (D2) as a nonlinear dynamic parameter. Spectral/cepstral measures, including mean CPP, CPP_sd, CPPf0, CPPf0_sd, L/H ratio, and L/H ratio_sd, were calculated with ADSV (Analysis of Dysphonia in Speech and Voice™). Auditory-perceptual analysis was performed by two blinded speech-language pathologists using GRBAS. The results showed that the nearly periodic Type 1 signals were all from functional dysphonia, while Type 4 signals consisted of neurogenic and organic voice disorders. Only Type 1 voice signals were reliable for perturbation analysis in this study. Significant differences related to signal type were found in all acoustic and auditory-perceptual measures. SNR, CPP, and L/H ratio values for Type 4 were significantly lower than those of the other signal types, and significantly higher %jitter and %shimmer were observed in Type 4 voice signals (p<.001). Additionally, as signal type increased, D2 values increased significantly, representing more complex and nonlinear patterns. Nevertheless, D2 could not be obtained for voice signals with a high noise component associated with breathiness. In particular, CPP was more sensitive to the voice qualities 'G', 'R', and 'B' than any other acoustic measure. Thus, spectral and cepstral analyses may be applied to more severely dysphonic voices such as Type 4 signals, and CPP can be a more accurate and predictive acoustic marker for measuring voice quality and severity in dysphonia.
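
The %jitter and %shimmer measures named in the abstract above can be illustrated with a minimal sketch. It assumes the common "local" perturbation definition (mean absolute difference between consecutive cycles, divided by the mean, times 100); the abstract does not state which formula TF32 uses, so this is not a reproduction of that tool. The function name `percent_perturbation` and the cycle values are hypothetical, chosen only for illustration.

```python
from statistics import mean


def percent_perturbation(values):
    """Local perturbation in percent (assumed definition, not TF32's).

    Mean absolute difference between consecutive cycle values,
    divided by the mean cycle value, times 100.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


# Hypothetical per-cycle measurements, for illustration only.
periods_ms = [5.0, 5.1, 4.9, 5.0]   # glottal cycle durations
amplitudes = [1.0, 0.9, 1.1, 1.0]   # peak amplitude of each cycle

jitter_pct = percent_perturbation(periods_ms)    # period perturbation
shimmer_pct = percent_perturbation(amplitudes)   # amplitude perturbation
print(f"%jitter = {jitter_pct:.2f}, %shimmer = {shimmer_pct:.2f}")
```

Applying the same function to cycle periods gives %jitter and to cycle amplitudes gives %shimmer. Both depend on reliable cycle-by-cycle extraction, which is consistent with the abstract's finding that only nearly periodic Type 1 signals were reliable for perturbation analysis.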

Voice conversion using low dimensional vector mapping (낮은 차원의 벡터 변환을 통한 음성 변환)

  • Lee, Kee-Seung;Doh, Won;Youn, Dae-Hee
    • Journal of the Korean Institute of Telematics and Electronics S / v.35S no.4 / pp.118-127 / 1998
  • In this paper, we propose a voice personality transformation method that makes one person's voice sound like another person's voice. To transform the voice personality, the vocal tract transfer function is used as the transformation parameter. Compared with previous methods, the proposed method can obtain high-quality transformed speech with low computational complexity. Conversion between the vocal tract transfer functions is implemented by a linear mapping based on soft clustering. In this process, the mean LPC cepstrum coefficients and the mean-removed LPC cepstrum, modeled by a low-dimensional vector, are used as transformation parameters. To evaluate the performance of the proposed method, mapping rules are generated from 61 Korean words uttered by two male speakers and one female speaker. These rules are then applied to 9 sentences uttered by the same speakers, and objective evaluation and subjective listening tests are performed on the transformed speech.