• Title/Summary/Keyword: voice parameter

Speech Intelligibility and Vowel Space Characteristics of Alaryngeal Speech (무후두음성의 말 명료도와 모음 공간 특성)

  • Shim, Hee-Jeong;Jang, Hyo-Ryung;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.5 no.4 / pp.17-24 / 2013
  • This study aimed to identify the speech characteristics associated with different voice rehabilitation techniques in twenty-six male patients who had undergone total or partial laryngectomy. The speech intelligibility of standard esophageal (SE) speech, tracheoesophageal (TE) speech, and electrolarynx (EL) speech was measured using the CSL, and eleven listeners rated the speech on a 5-point scale. Vowel space parameters (vowel space, VAI, FCR, and F2 ratio) were obtained by averaging five repetitions of each vowel (/a/, /e/, /i/, /u/) and entering the results into the formula for each parameter. The results showed statistically significant differences in speech intelligibility and vowel space between SE and TE. The speech intelligibility and vowel space of TE were higher than those of SE or EL, and speech intelligibility correlated highly with some parameters (vowel space, VAI, F2 ratio). The results also showed that, compared with SE and EL, the speech characteristics of TE were closest to those of the normal group, but still deviated markedly from laryngeal speech. This was attributed to insufficient airflow intake into the esophagus during sound production and to differences in articulatory movement among the groups. These findings will contribute to establishing a baseline for speech characteristics in voice rehabilitation for patients with alaryngeal speech.
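
The abstract above names the vowel space parameters but does not give their formulas. As a rough sketch, the following uses the commonly cited definitions (FCR as a ratio of corner-vowel formants, VAI as its reciprocal, F2 ratio as F2 of /i/ over F2 of /u/, and vowel space as the polygon area in the F1-F2 plane). These definitions, the function names, and the formant values are assumptions for illustration and may differ from what the authors used.

```python
# Illustrative sketch: formulas are the commonly cited definitions, not
# taken from the paper; the formant values below are invented examples.

def mean_formants(repeats):
    """Average (F1, F2) pairs over repeated productions of one vowel."""
    n = len(repeats)
    return (sum(r[0] for r in repeats) / n, sum(r[1] for r in repeats) / n)

def fcr(formants):
    """Formant centralization ratio: (F2u + F2a + F1i + F1u) / (F2i + F1a)."""
    f1 = {v: formants[v][0] for v in formants}
    f2 = {v: formants[v][1] for v in formants}
    return (f2["u"] + f2["a"] + f1["i"] + f1["u"]) / (f2["i"] + f1["a"])

def vai(formants):
    """Vowel articulation index, the reciprocal of FCR."""
    return 1.0 / fcr(formants)

def f2_ratio(formants):
    """Ratio of the second formant of /i/ to that of /u/."""
    return formants["i"][1] / formants["u"][1]

def vowel_space_area(formants, order=("i", "e", "a", "u")):
    """Polygon area in the F1-F2 plane (shoelace formula), vowels in boundary order."""
    pts = [formants[v] for v in order]
    twice_area = 0.0
    for k in range(len(pts)):
        x1, y1 = pts[k]
        x2, y2 = pts[(k + 1) % len(pts)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

# Hypothetical mean (F1, F2) values in Hz for one speaker.
example = {
    "a": (800.0, 1300.0),
    "e": (500.0, 1900.0),
    "i": (300.0, 2300.0),
    "u": (350.0, 900.0),
}
print(fcr(example), vai(example), f2_ratio(example), vowel_space_area(example))
```

A more centralized (less distinct) vowel system raises FCR and lowers VAI and the vowel space area, which is why these measures can track intelligibility.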

Implementation of a Robust Speech Recognizer in Noisy Car Environment Using a DSP (DSP를 이용한 자동차 소음에 강인한 음성인식기 구현)

  • Chung, Ik-Joo
    • Speech Sciences / v.15 no.2 / pp.67-77 / 2008
  • In this paper, we implemented a robust speech recognizer using the TMS320VC33 DSP. For this implementation, we built speech and noise databases suited to a recognizer that uses spectral subtraction for noise removal. The recognizer has an explicit structure in that the speech signal is enhanced by spectral subtraction before endpoint detection and feature extraction. This makes the operation of the recognizer clear and helps build HMMs with minimum model mismatch. Since the recognizer was developed for controlling car facilities and for voice dialing, it has two recognition engines: a speaker-independent one for controlling car facilities and a speaker-dependent one for voice dialing. We adopted a continuous HMM for the former and a conventional DTW algorithm for the latter. Through various off-line recognition tests, we selected optimal values of several recognition parameters for a resource-limited embedded recognizer, which led to HMMs with three mixtures per state. The speech database with added car noise is enhanced by spectral subtraction before HMM parameter estimation, to reduce the model mismatch caused by the nonlinear distortion that spectral subtraction introduces. The hardware module includes a microcontroller for the host interface, which processes the protocol between the DSP and the host.
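
The speaker-dependent engine described above relies on conventional DTW. A minimal sketch of the textbook algorithm follows, assuming one scalar feature per frame and an absolute-difference local cost; a real recognizer would compare feature vectors, and the template names are hypothetical. This is not the paper's implementation.

```python
# Textbook dynamic time warping on scalar sequences (illustrative only).

def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allow a match, an insertion, or a deletion step.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(utterance, templates):
    """Return the name of the stored template closest to the utterance."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))

# Hypothetical speaker-dependent templates for voice dialing.
templates = {"home": [1.0, 2.0, 2.0, 3.0], "office": [5.0, 5.0, 5.0]}
print(recognize([1.0, 2.0, 3.0], templates))
```

Because the warping path absorbs differences in speaking rate, a short utterance can still match a longer stored template of the same word.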

Performance Evaluation of IDS on MANET under Grayhole Attack (그레이홀 공격이 있는 MANET에서 IDS 성능 분석)

  • Kim, Young-Dong
    • The Journal of the Korea institute of electronic communication sciences / v.11 no.11 / pp.1077-1082 / 2016
  • An intrusion detection system (IDS) can serve as a countermeasure against malicious attacks that degrade network transmission performance by disturbing the MANET routing function. This paper examines the effect of an IDS on transmission performance in a MANET under grayhole attacks, which target only a portion of the transmitted packets, and offers some suggestions for an effective IDS. Performance is analyzed by NS-2-based computer simulation and measured with VoIP (Voice over Internet Protocol) as the application service. MOS (Mean Opinion Score), CCR (Call Connection Rate), and end-to-end delay are used as performance parameters, since they are standard transmission quality factors for voice transmission.
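
The abstract does not say how MOS is obtained from the simulation. One common approach in simulation studies is to estimate a transmission rating factor R and map it to MOS with the ITU-T G.107 E-model conversion; the sketch below shows only that mapping, and its use here is an assumption rather than the paper's stated method.

```python
# R-factor to MOS conversion from the ITU-T G.107 E-model (mapping only).

def r_to_mos(r):
    """Map a transmission rating factor R to an estimated MOS."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6

for r in (50.0, 70.0, 93.2):
    print(r, round(r_to_mos(r), 2))
```

Packet loss and delay caused by an attack lower R, so the effect of an IDS would appear as a recovery of the estimated MOS.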

Quality of Service Assurance Model for AMR Voice Traffic in Downlink WCDMA System (순방향 WCDMA 채널에서 AMR 음성 트래픽의 품질 보증 모델)

  • Jung, Sung Hwan;Hong, Jung Wan;Lie, Chang Hoon
    • Journal of Korean Institute of Industrial Engineers / v.33 no.2 / pp.191-200 / 2007
  • We propose a QoS (Quality of Service) assurance model for AMR (Adaptive Multi-Rate) voice users that considers capacity and service quality jointly in a downlink WCDMA system. For this purpose, we introduce a new system performance measure and a number-based AMR mode allocation scheme. The proposed number-based AMR mode allocation requires only the total number of ongoing users, so it is simpler to implement than the existing power-based allocation. The proposed system performance measure accounts for the stochastic variation of the AMR modes of ongoing users and can be obtained analytically using CTMC (Continuous-Time Markov Chain) modeling. To validate the proposed analytical model, a discrete-event simulation model is also developed. The performance measure obtained from the analytical model agrees with the simulation results and is expected to be useful for parameter optimization.
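
The abstract does not specify the Markov chain. As a loose illustration of the idea, the sketch below assumes the number of ongoing users follows a simple birth-death CTMC (Poisson arrivals, exponential call durations, a hard user limit) and that the AMR mode is picked from the user count via thresholds. The chain structure, thresholds, and all names are hypothetical.

```python
# Hypothetical birth-death CTMC for the number of ongoing voice users.

def stationary_user_distribution(arrival_rate, departure_rate, max_users):
    """Steady-state probabilities of 0..max_users ongoing users."""
    weights = [1.0]
    for n in range(1, max_users + 1):
        # Detailed balance: pi[n] * n * mu = pi[n-1] * lambda
        weights.append(weights[-1] * arrival_rate / (n * departure_rate))
    total = sum(weights)
    return [w / total for w in weights]

def amr_mode(num_users, thresholds):
    """Number-based allocation: move to the next mode at each user-count threshold."""
    return sum(1 for t in thresholds if num_users >= t)

def mode_probabilities(distribution, thresholds):
    """Long-run fraction of time the system spends in each AMR mode."""
    probs = [0.0] * (len(thresholds) + 1)
    for n, p in enumerate(distribution):
        probs[amr_mode(n, thresholds)] += p
    return probs

dist = stationary_user_distribution(arrival_rate=8.0, departure_rate=1.0, max_users=20)
print(mode_probabilities(dist, thresholds=[6, 12]))
```

Only the user count enters the allocation rule, which reflects the simplicity the abstract claims for the number-based scheme.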

Computation of Laryngeal Flow and Sound through a Dynamic Model of the Vocal Folds (동적 성대 모델을 이용한 후두 내 유동 및 음향장에 대한 수치 연구)

  • Bae, Young-Min;Moon, Young-J.
    • 한국전산유체공학회:학술대회논문집 / 2008.03b / pp.21-24 / 2008
  • The present study numerically investigates glottal airflow characteristics and the acoustic features of phonation, fully coupled with the dynamic behavior of the vocal folds. The vocal folds are described by a low-dimensional body-cover model characterized by biomechanical parameters such as glottal width, vocal fold stiffness, and subglottal pressure. The flow in the vocal tract is modeled by an incompressible, axisymmetric form of the Navier-Stokes equations (INS), while the acoustic field is predicted by the linearized perturbed compressible equations (LPCE). The computed results show that a two-mass model of the vocal folds is sufficient to reproduce the temporal variations in oral airflow and glottal motion produced by female speakers. It is also found that i) the glottal width has a significant effect on the amplitude of glottal flow, and thus on the amplitude of the acoustic wave in the vocal tract, ii) the vocal fold tension is the main control parameter for the fundamental frequency of phonation, iii) the subglottal pressure plays an appreciable role in reproducing the self-sustained oscillation of the vocal folds, and iv) the strength of the pulsating airflow and the vortical structures are primarily affected by glottal width and subglottal pressure, and are closely related to pitch, loudness, and voice quality. Finally, a more comprehensive explanation of the difference between one- and two-mass models is presented, with a discussion of the effectiveness of vocal fold oscillation and of voice quality.

A Study on the Acoustical Characteristics of Pistol Impluse and MLS Source Measurements in Room Types (음향측정시 실의 종류와 음원에 따르는 음향인자 측정분석에 관한 연구)

  • Kim, Jeong-Jung;Son, Jang-Ryeol
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2004.11a / pp.1028-1031 / 2004
  • The ultimate goal of architectural acoustics is to convey speech effectively within a space according to the purpose for which the building is used. To quantify how accurately a sound source is transmitted through an indoor space, sound evaluation methods propose various physical parameters for estimating speech definition (speech articulation) and reverberation. In this paper, an MLS signal and a pistol impulse signal were evaluated as sound sources. For the acoustic parameter reverberation time (RT), the measurements converged with the MLS source, and the extent of the measurement error could be determined. The values of acoustic parameters such as D50 and C80, however, changed irregularly, which appears to be a problem. Finally, in a diffuse sound field, measurements of the change in sound pressure level showed no difference in effect on the acoustic parameters.
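
D50 and C80 are energy ratios computed from a measured room impulse response. The sketch below uses the standard definitions (early energy within 50 ms over total energy, and the early-to-late energy ratio at 80 ms in decibels), assuming the impulse response has already been trimmed so that sample 0 is the arrival of the direct sound. The test signal is artificial, not data from the paper.

```python
import math

# D50 (definition) and C80 (clarity) from a room impulse response.

def d50(impulse_response, sample_rate):
    """Fraction of total energy arriving within the first 50 ms."""
    split = int(round(0.050 * sample_rate))
    energy = [s * s for s in impulse_response]
    return sum(energy[:split]) / sum(energy)

def c80(impulse_response, sample_rate):
    """Early (0-80 ms) to late (after 80 ms) energy ratio in dB."""
    split = int(round(0.080 * sample_rate))
    energy = [s * s for s in impulse_response]
    return 10.0 * math.log10(sum(energy[:split]) / sum(energy[split:]))

# Artificial 100 ms response with constant amplitude, sampled at 1 kHz.
flat = [1.0] * 100
print(d50(flat, 1000), c80(flat, 1000))
```

Both values depend heavily on where the early/late boundary falls relative to strong reflections, which is one plausible reason such parameters vary more between sources than reverberation time does.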

The Effect of Cardiac and Voice of Tomatoes Based on the Voice Analysis Parameter (음성 분석요소를 기반으로 한 토마토가 심장과 목소리에 미치는 영향)

  • Kim, Bong-Hyun;Lim, Sung-Su;Lim, Soon-Yong;Yoo, Hwang-Jun;Yeon, Yong-Heum;Min, Ji-Seon;Han, Sang-Hyo;Ka, Min-Kyoung;Cho, Dong-Uk
    • Proceedings of the Korea Information Processing Society Conference / 2011.11a / pp.1039-1042 / 2011
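  • Tomatoes are among the most commonly encountered vegetables and are used as an ingredient in many dishes. Tomatoes suppress the generation of reactive oxygen species, have a good anticancer effect, and strengthen blood vessels by lowering cholesterol levels. Among the organs of the human body, the heart is the driving force that circulates the blood and plays a central role in the circulatory system. In this paper, therefore, we used voice, one of the important elements employed in alternative and preventive medicine, to analyze the changes in the voice analysis parameters Jitter and second-formant bandwidth after tomato intake.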

Acoustic Outcomes After Laryngomicrosurgery for Reinke's Edema (라인케 부종에서 후두미세수술 후의 음성 결과)

  • Kim, Min Song;Song, Chang Myeon;Kim, Keon Ho;Jung, Seon Min;Ji, Yong Bae;Tae, Kyung
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.28 no.2 / pp.96-99 / 2017
  • Background and Objectives : The management of Reinke's edema usually includes medical treatment and voice therapy. Laryngomicrosurgery (LMS) is also necessary, especially to relieve airway obstruction. However, voice outcomes after LMS have not been well established. The aim of this study was to evaluate the effectiveness of LMS for Reinke's edema and to analyze voice outcomes after LMS. Materials and Methods : Twenty-five patients with Reinke's edema who underwent LMS from September 2007 to December 2016 were enrolled in this study. We analyzed the reflux finding score (RFS), the reflux symptom index (RSI), and acoustic parameters before and after surgery. Results : Fifteen patients (60%) were male and 10 (40%) were female, and the mean age was 49.6 years. The mean RFS decreased significantly by 3 months after LMS ($18.3{\pm}2.2$ preoperatively and $10.0{\pm}2.2$ at 3 months postoperatively). The mean Jitter decreased significantly after surgery ($2.71{\pm}2.81\%$ before and $1.06{\pm}1.21\%$ after LMS, p=0.041). The mean Shimmer also decreased significantly after surgery ($7.97{\pm}3.63\%$ and $4.83{\pm}1.85\%$, respectively, p=0.006). Conclusion : LMS is effective in the treatment of Reinke's edema. It results in favorable acoustic outcomes and laryngoscopic findings in properly selected patients.
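
Jitter and Shimmer percentages such as those reported above are cycle-to-cycle perturbation measures. The sketch below uses the widely used "local" definition (mean absolute difference between consecutive cycles divided by the mean, times 100); the abstract does not state which analysis software or variant was used, so this definition and the sample values are assumptions.

```python
# Local jitter and shimmer as relative cycle-to-cycle perturbation (percent).

def relative_perturbation_percent(values):
    """Mean absolute consecutive difference divided by the mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

def jitter_percent(periods):
    """Perturbation of glottal cycle durations (for example, in seconds)."""
    return relative_perturbation_percent(periods)

def shimmer_percent(amplitudes):
    """Perturbation of per-cycle peak amplitudes."""
    return relative_perturbation_percent(amplitudes)

# Invented cycle measurements for a sustained vowel.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.76, 0.82, 0.79]
print(jitter_percent(periods), shimmer_percent(amplitudes))
```

A perfectly periodic voice gives 0% for both measures, so the postoperative decreases reported above indicate more regular vocal fold vibration.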

Voice Personality Transformation Using a Probabilistic Method (확률적 방법을 이용한 음성 개성 변환)

  • Lee Ki-Seung
    • The Journal of the Acoustical Society of Korea / v.24 no.3 / pp.150-159 / 2005
  • This paper addresses a voice personality transformation algorithm that makes one person's voice sound like another person's. In the proposed method, a person's voice is represented by LPC cepstrum, pitch period, and speaking rate, and an appropriate transformation rule is constructed for each parameter. A Gaussian Mixture Model (GMM) is used to model one speaker's LPC cepstra, and conditional probability is used to model the relationship between the two speakers' LPC cepstra. The parameters of each probabilistic model are obtained by Maximum Likelihood (ML) estimation. The transformed LPC cepstra are obtained using a Minimum Mean Square Error (MMSE) criterion. Pitch period and speaking rate serve as the parameters for prosody transformation, which is implemented using the ratio of the average values. The proposed method outperforms the previous VQ-based method in objective measures, including average cepstrum distance reduction ratio and likelihood increasing ratio. In subjective tests, we obtained almost the same correct identification ratio as the previous method, and we confirmed that high-quality transformed speech is obtained, owing to spectral contours that evolve smoothly over time.
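
The prosody step described above, scaling by the ratio of average values, can be sketched in a few lines. The function names and the sample pitch periods are illustrative assumptions; the abstract gives no further detail on how the averages are computed.

```python
# Prosody transformation by the ratio of speaker averages (illustrative).

def mean(values):
    return sum(values) / len(values)

def scale_by_mean_ratio(source_values, source_training, target_training):
    """Scale source values so their average moves toward the target speaker's."""
    ratio = mean(target_training) / mean(source_training)
    return [v * ratio for v in source_values]

# Invented pitch periods (ms) from training speech of each speaker.
source_training = [10.0, 12.0]
target_training = [5.0, 6.0]
print(scale_by_mean_ratio([10.0, 11.0, 12.0], source_training, target_training))
```

The same function applies to speaking rate, since that transformation is also described as a ratio of averages.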

Analysis of Vocal Cord Function by Humidity Change Based on Voice Signal Analysis (음성신호 분석 기반의 습도 변화에 따른 성대 기능 분석)

  • Kim, Bong-Hyun;Cho, Dong-Uk
    • The Journal of Korean Institute of Communications and Information Sciences / v.37A no.9 / pp.792-798 / 2012
  • With Network Quotient now an important measure in modern society, speech intelligibility that maximizes the favorable impression made on a conversation partner has become an important issue. The humidity of the air has a considerable influence on speech intelligibility. In this paper, therefore, we applied voice signal analysis techniques to analyze how the vocal cords are affected in environments held at constant humidity levels of 30%, 50%, and 80%. To this end, we carried out experiments on the intensity and pitch of the voice signals of twenty men in their 20s at each of the three humidity levels. Finally, we measured characteristic parameters of vocal cord function under changing humidity and assessed their significance through statistical analysis.