• Title/Summary/Keyword: Speech Analysis

Acoustic Characteristics of 'Short Rushes of Speech' using Alternate Motion Rates in Patients with Parkinson's Disease (파킨슨병 환자의 교대운동속도 과제에서 관찰된 '말 뭉침'의 음향학적 특성)

  • Kim, Sun Woo;Yoon, Ji Hye;Lee, Seung Jin
    • Phonetics and Speech Sciences
    • /
    • v.7 no.2
    • /
    • pp.55-62
    • /
    • 2015
  • It is widely accepted that Parkinson's disease (PD) is the most common cause of hypokinetic dysarthria, and that its characteristic 'short rushes of speech' become more evident as the motor disorder becomes more severe. Speech alternate motion rates (AMRs) are particularly useful for observing not only rate abnormalities but also deviant speech. However, relatively little is known about 'short rushes of speech' in the AMRs of patients with PD beyond their perceptual characteristics. The purpose of this study was to examine which acoustic features of 'short rushes of speech' in AMRs are robust indicators of Parkinsonian speech. Numbers of syllabic repetitions (/pə/, /tə/, /kə/) in AMR tasks were analyzed acoustically in 9 patients with PD by inspecting spectrograms from the Computerized Speech Lab. Acoustically, we found three characteristics of 'short rushes of speech': 1) vocalized consonants without closure duration (VC), 76.3%; 2) no consonant segmentation (NC), 18.6%; 3) no vowel formant frequency (NV), 5.1%. Based on these results, 'short rushes of speech' may be related to a failure to reach and maintain the phonatory targets. To best achieve the therapeutic goals and make treatment most efficacious, it is important to incorporate training methods based on both phonation and articulation.

Design and Implementation of Korean Text-to-Speech System (다이폰을 이용한 한국어 문자-음성 변환 시스템의 설계 및 구현)

  • 정준구
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06c
    • /
    • pp.91-94
    • /
    • 1994
  • This paper describes the design and implementation of a Korean text-to-speech system. A parametric synthesis method is used for speech synthesis, with PARCOR coefficients, obtained from LPC analysis, as the acoustic parameters. The diphone is used as the synthesis unit because it retains the basic naturalness of human speech. The diphone DB consists of 1228 PCM files. The LPC synthesis method has the drawback that the clarity of the synthesized speech declines when synthesizing unvoiced sounds. In this paper, we improve the clarity of the synthesized speech by using the residual signal as the excitation signal for unvoiced sounds. In addition, to improve naturalness, we control the prosody of the synthesized speech by controlling the energy and pitch patterns. The synthesis system is implemented on a PC/486 and uses a 70 Hz to 4.5 kHz band-pass filter for speech input/output, an amplifier, and a TMS320C30 DSP board.
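
As an illustration of the PARCOR parameters mentioned in this abstract, the following is a minimal sketch of the Levinson-Durbin recursion, which yields PARCOR (reflection) coefficients as a by-product of LPC analysis. It is not the authors' implementation: the function names, the autocorrelation method, and the sign convention (prediction-error filter A(z) = 1 + Σ a_j z^-j) are assumptions made for this example, and other texts use the opposite sign for the reflection coefficients.

```python
import numpy as np


def autocorrelation(frame, order):
    """Autocorrelation r[0..order] of one (already windowed) speech frame."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, k, err):
      a   -- prediction-error filter coefficients, a[0] == 1
      k   -- PARCOR (reflection) coefficients, one per recursion step
      err -- final prediction error energy
    Assumes r[0] > 0.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    k = np.zeros(order)
    err = float(r[0])
    for i in range(1, order + 1):
        # Correlation between the current predictor's error and lag i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        ki = -acc / err
        k[i - 1] = ki
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + ki * a_prev[i - j]
        a[i] = ki
        err *= 1.0 - ki * ki
    return a, k, err
```

For a stable model every PARCOR coefficient has magnitude below 1, which is one reason they are a convenient parameter set for storing and interpolating synthesis units.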

Speech Intelligibility Analysis on the Vibration Sound of the Window Glass of a Conference Room (회의실 유리창 진동음의 명료도 분석)

  • Kim, Yoon-Ho;Kim, Hee-Dong;Kim, Seock-Hyun
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2006.11a
    • /
    • pp.150-155
    • /
    • 2006
  • Speech intelligibility is investigated for a coupled conference room-window glass system. Using a maximum length sequence (MLS) signal as the sound source, the acceleration and velocity responses of the window glass are measured with an accelerometer and a laser Doppler vibrometer. The modulation transfer function (MTF) is used to identify the speech transmission characteristics of the room and window system. The speech transmission index (STI) is calculated from the MTF, and the speech intelligibility of the room and the window glass is estimated. The speech intelligibilities obtained from the acceleration signal and the velocity signal are compared, and the possibility of wiretapping is investigated. Finally, the intelligibility of the conversation sound is examined in a subjective test.

Speech Intelligibility Analysis on the Vibration Sound of the Glass Window of a Conference Room (회의실 유리창 진동음의 음성 명료도 분석)

  • Kim, Hee-Dong;Kim, Yoon-Ho;Kim, Seock-Hyun
    • Transactions of the Korean Society for Noise and Vibration Engineering
    • /
    • v.17 no.4 s.121
    • /
    • pp.363-369
    • /
    • 2007
  • The purpose of the study is to obtain acoustical information for preventing eavesdropping through a glass window. Speech intelligibility was investigated for the vibration sound detected from the glass window of a conference room. An objective test using the speech transmission index (STI) was performed to estimate the speech intelligibility quantitatively. The STI was determined from the modulation transfer function (MTF) of the room-glass window system. Using a maximum length sequence (MLS) signal as the sound source, the impulse responses of the glass window and the MTF were determined from the signals of accelerometers and a laser Doppler vibrometer. Finally, the speech intelligibility of the interior sound and of the window vibration were compared under different sound pressure levels and amplifier gains to confirm the effect of the measurement conditions on speech intelligibility.
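
The chain from impulse response to MTF to STI described in this abstract can be sketched as follows. This is a simplified illustration, not the procedure used in the paper: it applies Schroeder's relation to a broadband impulse response and averages the transmission indices without the octave-band filtering and band weighting that the full STI standard (IEC 60268-16) prescribes. The function names are hypothetical.

```python
import numpy as np

# The 14 modulation frequencies (Hz) used in the STI method.
MOD_FREQS = np.array([0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                      3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5])


def mtf_from_impulse_response(h, fs, mod_freqs=MOD_FREQS):
    """Schroeder's relation: m(F) = |sum h^2(t) exp(-j 2 pi F t)| / sum h^2(t)."""
    h = np.asarray(h, dtype=float)
    t = np.arange(len(h)) / fs
    energy = h ** 2
    phases = np.exp(-2j * np.pi * np.outer(mod_freqs, t))
    return np.abs(phases @ energy) / energy.sum()


def sti_from_mtf(m):
    """Unweighted, single-band STI estimate from modulation indices m(F)."""
    m = np.clip(np.asarray(m, dtype=float), 1e-6, 1.0 - 1e-6)
    snr_db = 10.0 * np.log10(m / (1.0 - m))   # apparent signal-to-noise ratio
    snr_db = np.clip(snr_db, -15.0, 15.0)     # limit to the +/-15 dB range
    return float(np.mean((snr_db + 15.0) / 30.0))
```

With a measured impulse response `h` sampled at `fs`, `sti_from_mtf(mtf_from_impulse_response(h, fs))` gives a value between 0 (unintelligible) and 1 (perfect transmission). Background noise, which also lowers the MTF, is ignored in this sketch.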

Speech training aids for deafs (청각 장애자용 발음 훈련 기기의 개발)

  • 김동준;윤태성;박상희
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1991.10a
    • /
    • pp.746-751
    • /
    • 1991
  • Deaf people train articulation by observing the mouth of a tutor, by sensing the motions of the vocal organs through touch, or by using speech training aids. Present speech training aids for the deaf can measure only a single speech parameter, or display only frequency spectra as a histogram or in pseudo-color. This study aims to develop a speech training aid that displays the subject's articulation as a cross section of the vocal organs, together with other speech parameters, in a single system, so that the subject can see what to correct. To this end, the speech production mechanism is first assumed to be an AR model in order to estimate the articulatory motions of the vocal tract from the speech signal. Next, a vocal tract profile model based on LPC analysis is constructed. Using this model, the articulatory motions for Korean vowels are estimated and displayed in the vocal tract profile graphics.

Distorted Speech Rejection For Automatic Speech Recognition under CDMA Wireless Communication (CDMA이동통신환경에서의 음성인식을 위한 왜곡음성신호 거부방법)

  • Kim Nam Soo;Chang Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.8
    • /
    • pp.597-601
    • /
    • 2004
  • This paper introduces a pre-rejection technique for speech distorted by a wireless channel, with application to automatic speech recognition (ASR). Based on an analysis of speech signals distorted over a wireless communication channel, we propose a method to reject channel-distorted speech with a small computational load. A number of simulation results show that the pre-rejection algorithm enhances the robustness of speech recognition.

A Comparative Study on the Voices of Clergymen: Ministers vs. Priests (성직자 음성의 음향학적인 비교 연구)

  • Lee, Eun-Seon;Park, Sang-Hee;Jo, Sung-Mi;Jeong, Ok-Ran;Seok, Dong-Il
    • Speech Sciences
    • /
    • v.10 no.3
    • /
    • pp.79-86
    • /
    • 2003
  • This study compared the voices of ministers and priests. There has been a common notion that ministers are more passionate than priests in delivering their speech. Therefore, it can be assumed that ministers abuse or misuse their voices more than priests do. This study performed an acoustic analysis of the voices of 6 ministers and 5 priests before and after their speech. We measured F0, jitter, shimmer, NNE and HNR using Dr. Speech (Version 4.0, Tiger DRS). A t-test was performed to determine any objective differences between their voices. The results showed no significant differences between the voices of ministers and priests before and after their speech. However, there appeared to be an interesting reversed tendency between ministers and priests, although it did not reach statistical significance. That is, F0 tended to increase after the speech in ministers, whereas it tended to decrease in priests. In addition, HNR tended to decrease after the speech in priests, while it tended to increase in ministers.
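
For readers unfamiliar with the jitter and shimmer measures named in this abstract, a minimal sketch of one common definition (local, cycle-to-cycle perturbation as a percentage of the mean) follows. Dr. Speech may compute these measures differently; the function names are hypothetical, and the inputs are assumed to be pitch periods and peak amplitudes already extracted cycle by cycle.

```python
import numpy as np


def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles, relative to the mean."""
    values = np.asarray(values, dtype=float)
    return 100.0 * np.mean(np.abs(np.diff(values))) / np.mean(values)


def local_jitter_percent(periods):
    """Cycle-to-cycle variation of the pitch period (e.g. in ms)."""
    return local_perturbation_percent(periods)


def local_shimmer_percent(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude."""
    return local_perturbation_percent(amplitudes)
```

A perfectly periodic voice gives 0% for both measures; larger values indicate less stable vocal fold vibration.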

Detection of Glottal Closure Instant for Voiced Speech Using Wavelet Transform (웨이브렛 변환을 이용한 음성신호의 성문폐쇄시점 검출)

  • Bae, Keun-Sung
    • Speech Sciences
    • /
    • v.7 no.3
    • /
    • pp.153-165
    • /
    • 2000
  • During the phonation of voiced sounds, there are instants at which the glottis opens or closes due to the periodic vibration of the vocal cords. The instant at which it closes is called the glottal closure instant (GCI) or epoch. The correct detection of the GCI is one of the important problems in speech processing for pitch detection, pitch-synchronous analysis, and so on. Recently, it has been shown that the local maxima of the wavelet-transformed speech signal correspond to the GCIs of the speech signal. In this paper, we investigate the accuracy of the GCIs estimated from the wavelet-transformed speech signal. For this purpose, we compare them with the negative peaks of the differentiated EGG signal, which represent the actual GCIs of the speech signal.

CASA-based Front-end Using Two-channel Speech for the Performance Improvement of Speech Recognition in Noisy Environments (잡음환경에서의 음성인식 성능 향상을 위한 이중채널 음성의 CASA 기반 전처리 방법)

  • Park, Ji-Hun;Yoon, Jae-Sam;Kim, Hong-Kook
    • Proceedings of the IEEK Conference
    • /
    • 2007.07a
    • /
    • pp.289-290
    • /
    • 2007
  • In order to improve the performance of a speech recognition system in the presence of noise, we propose a noise-robust front-end that uses two-channel speech signals and separates speech from noise based on computational auditory scene analysis (CASA). The main cues for the separation are the interaural time difference (ITD) and the interaural level difference (ILD) between the two channel signals. As a result, 39 cepstral coefficients are extracted from the separated speech components. Speech recognition experiments show that the proposed front-end outperforms the ETSI front-end with single-channel speech.
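
The two separation cues named in this abstract can be illustrated with a minimal sketch. It estimates a single broadband ITD (from the peak of the cross-correlation) and ILD (from the energy ratio) for one frame. A CASA front-end would typically compute these cues per time-frequency unit after an auditory filterbank, so this is not the authors' method; the function name and the `max_lag` parameter are assumptions for the example.

```python
import numpy as np


def estimate_itd_ild(left, right, fs, max_lag):
    """Return (itd_seconds, ild_db) for two equal-length channel frames.

    A positive ITD means the right channel lags the left channel.
    A positive ILD means the left channel is louder.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    lags = np.arange(-max_lag, max_lag + 1)
    # Cross-correlation c(l) = sum_n left[n] * right[n + l] over the valid overlap.
    xcorr = np.array([
        np.dot(left[max(0, -l): n - max(0, l)], right[max(0, l): n - max(0, -l)])
        for l in lags
    ])
    best_lag = lags[np.argmax(xcorr)]
    itd = best_lag / fs
    ild = 10.0 * np.log10(np.sum(left ** 2) / np.sum(right ** 2))
    return itd, ild
```

For ear-spaced microphones the physically plausible ITD range is under about one millisecond, so `max_lag` can be kept small (on the order of `fs / 1000` samples).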

A study on the recognition performance of connected digit telephone speech for MFCC feature parameters obtained from the filter bank adapted to training speech database (훈련음성 데이터에 적응시킨 필터뱅크 기반의 MFCC 특징파라미터를 이용한 전화음성 연속숫자음의 인식성능 향상에 관한 연구)

  • Jung Sung Yun;Kim Min Sung;Son Jong Mok;Bae Keun Sung;Kang Jeom Ja
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.119-122
    • /
    • 2003
  • In general, triangular filters are used in the filter bank when MFCCs are computed from the spectrum of a speech signal. In [1], a new feature extraction approach is proposed that uses specific filter shapes in the filter bank, obtained from the spectrum of the training speech data. In this approach, principal component analysis is applied to the spectrum of the training data to obtain the filter coefficients. In this paper, we carry out speech recognition experiments using the approach given in [1] on a large amount of telephone speech data, namely the Korean connected-digit telephone speech database released by SITEC. The experimental results are discussed along with our findings.
