• Title/Summary/Keyword: Speech rate

Search Result 1,246, Processing Time 0.023 seconds

The influences of speech rate, utterance length and sentence complexity of disfluency in preschool children who stutter and children who do not stutter (문장 따라말하기에서 말속도, 발화길이 및 통사적 복잡성에 따른 말더듬 아동과 일반아동의 비유창성 비교)

  • Kim, Yesul;Sim, Hyunsub
    • Phonetics and Speech Sciences
    • /
    • v.13 no.1
    • /
    • pp.53-64
    • /
    • 2021
  • According to Demand and Capacity Model (DCM), external and internal environments influence the disfluency of children who stutter (CWS). This study investigated the effects of simultaneous changes in motoric and linguistic demands on CWS and children who do not stutter (CWNS). Participants were 4-6 years old CWS and CWNS. A sentence imitation task with changes in speech rate, utterance length, and sentence complexity was used to examine their effects on children's disfluency. When the utterance length changed, CWS showed more disfluency regardless of utterance length and as the speech rate changed, CWS showed more disfluency at fast speech rate than CWNS. When the utterance length and speech rate changed, at fast speech rate, CWS showed more disfluency in both utterances than CWNS. When sentence complexity changed, CWS showed more disfluency than CWNS in complex sentences. Changes in linguistic elements such as speech rate, utterance length, and sentence complexity affect disfluency in CWS, especially when they were exposed to faster, longer, and more complex sentences. This indicates that CWS are vulnerable to fast and complex speech motor control and language processing ability than CWNS. Thus, this study suggests that parents and therapists consider both the speech rate and the utterance length when talking with CWS.

Implementation of Automatic Microphone Volume Controller and Recognition Rate Improvement (자동 입력레벨 조절기의 구현 및 인식 성능 향상)

  • 김상진;한민수
    • Proceedings of the IEEK Conference
    • /
    • 2001.09a
    • /
    • pp.503-506
    • /
    • 2001
  • In this paper, we describe the implementation of a microphone input level control algorithm and the speech improvement with this level controller in personal computer environment. The volume of speech obtained through a microphone affects the speech recognition rate directly. Therefore, proper input volume level control is desired fur better recognition. We considered some conditions for the successful volume controller implementation firstly, then checked its usefulness on our speech recognition system with common office environment speech database. Cepstral mean subtraction is also utilized far the channel-effect compensation of the database. Our implemented controller achieved approximately 50% reduction, i.e., improvement in speech recognition error rate.

  • PDF

On Detecting the Transition Regions of Phonemes by Using the Asymmetrical Rate of Speech Waveforms (음성파형의 비대칭율을 이용한 음소의 전이구간 검출)

  • Bae, Myung-Jin;Lee, Eul-jae;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.9 no.4
    • /
    • pp.55-65
    • /
    • 1990
  • To recognize continued speech, it is necessary to segment the connected acoustic signal into phonetic units, In this paper, as a parameter to detect transition regions in continued speech, we propose a new asymmetrical rate. The suggested rate represents a change rate of magnitude of speech signals. As comparing this rate with other rate in adjacent frame, the state of the frame can be distinguished between steady state and transient state.

  • PDF

An Improvement of Korean Speech Recognition Using a Compensation of the Speaking Rate by the Ratio of a Vowel length (모음길이 비율에 따른 발화속도 보상을 이용한 한국어 음성인식 성능향상)

  • 박준배;김태준;최성용;이정현
    • Proceedings of the IEEK Conference
    • /
    • 2003.11b
    • /
    • pp.195-198
    • /
    • 2003
  • The accuracy of automatic speech recognition system depends on the presence of background noise and speaker variability such as sex, intonation of speech, and speaking rate. Specially, the speaking rate of both inter-speaker and intra-speaker is a serious cause of mis-recognition. In this paper, we propose the compensation method of the speaking rate by the ratio of each vowel's length in a phrase. First the number of feature vectors in a phrase is estimated by the information of speaking rate. Second, the estimated number of feature vectors is assigned to each syllable of the phrase according to the ratio of its vowel length. Finally, the process of feature vector extraction is operated by the number that assigned to each syllable in the phrase. As a result the accuracy of automatic speech recognition was improved using the proposed compensation method of the speaking rate.

  • PDF

A New Speech Quality Measure for Speech Database Verification System (음성 인식용 데이터베이스 검증시스템을 위한 새로운 음성 인식 성능 지표)

  • Ji, Seung-eun;Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.3
    • /
    • pp.464-470
    • /
    • 2016
  • This paper presents a speech recognition database verification system using speech measures, and describes a speech measure extraction algorithm which is applied to this system. In our previous study, to produce an effective speech quality measure for the system, we propose a combination of various speech measures which are highly correlated with WER (Word Error Rate). The new combination of various types of speech quality measures in this study is more effective to predict the speech recognition performance compared to each speech measure alone. In this paper, we increase the system independency by employing GMM acoustic score instead of HMM score which is obtained by a secondary speech recognition system. The combination with GMM score shows a slightly lower correlation with WER compared to the combination with HMM score, however it presents a higher relative improvement in correlation with WER, which is calculated compared to the correlation of each speech measure alone.

An acoustical analysis of synchronous English speech using automatic intonation contour extraction (영어 동시발화의 자동 억양궤적 추출을 통한 음향 분석)

  • Yi, So Pae
    • Phonetics and Speech Sciences
    • /
    • v.7 no.1
    • /
    • pp.97-105
    • /
    • 2015
  • This research mainly focuses on intonational characteristics of synchronous English speech. Intonation contours were extracted from 1,848 utterances produced in two different speaking modes (solo vs. synchronous) by 28 (12 women and 16 men) native speakers of English. Synchronous speech is found to be slower than solo speech. Women are found to speak slower than men. The effect size of speech rate caused by different speaking modes is greater than gender differences. However, there is no interaction between the two factors (speaking modes vs. gender differences) in terms of speech rate. Analysis of pitch point features has it that synchronous speech has smaller Pt (pitch point movement time), Pr (pitch point pitch range), Ps (pitch point slope) and Pd (pitch point distance) than solo speech. There is no interaction between the two factors (speaking modes vs. gender differences) in terms of pitch point features. Analysis of sentence level features reveals that synchronous speech has smaller Sr (sentence level pitch range), Ss (sentence slope), MaxNr (normalized maximum pitch) and MinNr (normalized minimum pitch) but greater Min (minimum pitch) and Sd (sentence duration) than solo speech. It is also shown that the higher the Mid (median pitch), the MaxNr and the MinNr in solo speaking mode, the more they are reduced in synchronous speaking mode. Max, Min and Mid show greater speaker discriminability than other features.

Comparison of Adult and Child's Speech Recognition of Korean (한국어에서의 성인과 유아의 음성 인식 비교)

  • Yoo, Jae-Kwon;Lee, Kyoung-Mi
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.5
    • /
    • pp.138-147
    • /
    • 2011
  • While most Korean speech databases are developed for adults' speech, not for children's speech, there are various children's speech databases based on other languages. Because there are wide differences between children's and adults' speech in acoustic and linguistic characteristics, the children's speech database needs to be developed. In this paper, to find the differences between them in Korean, we built speech recognizers using HMM and tested them according to gender, age, and the presence of VTLN(Vocal Tract Length Normalization). This paper shows the speech recognizer made by children's speech has a much higher recognition rate than that made by adults' speech and using VTLN helps to improve the recognition rate in Korean.

Speech signal processing in the auditory system (청각 계통에서의 음성신호처리)

  • 이재혁;심재성;백승화;박상희
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1987.10b
    • /
    • pp.680-683
    • /
    • 1987
  • The speech signal processing in the auditory system can be analysized based on two representations : Average discharge rate and Temporal discharge pattern. But the average discharge rate representation is restricted by the narrow dynamic range because of the rate saturation and the two tone suppression phenomena, and the temporal discharge pattern representation needs a sophisticate frequency analysis and synchrony measure. In this paper, a simple representation is proposed : using a model considering the interaction of Cochlear fluid-BM movement and a haircell model, the feature of speech signals (formant frequency and pitch of vowels) is easily estimated in the Average Synchronized Rate.

  • PDF

Experiment of VoIP Transmission with AMR Speech Codec in Wireless LAN (무선랜 환경에서 AMR 음성부호화기를 적용한 VoIP 전송 실험)

  • Shin, Hye-Jung;Bae, Keun-Sung
    • Speech Sciences
    • /
    • v.11 no.4
    • /
    • pp.67-73
    • /
    • 2004
  • Packet loss, jitter, and delay in the Internet are caused mainly by the shortage of network bandwidth. It is due to queuing and routing process in the intermediate nodes of the packet network. In the Internet whose bandwidth is changing very rapidly in time depending on the number of users and data traffic, controlling the peak transmission bit-rate of a VoIP. system depending on the channel condition could be very helpful for making use of the available network bandwidth. Adapting packet size to the channel condition can reduce packet loss to improve the speech quality. It has been shown in [1] that a VoIP system with an AMR speech codec provides better speech quality than VoIP systems with fixed rate speech codecs. With the adaptive codec mode assignment. algorithm proposed in [1], in this paper, we performed the voice transmission experiments using the wireless LAN through the real Internet environment. Experimental results are analyzed and discussed with our findings.

  • PDF

Intelligibility Improvement of Low Bit-Rate Speech Coder Using Stochastic Spectral Equalizer (통계적 스펙트럼 이퀄라이저를 이용한 저 비트율 음성부호화기의 명료도 향상)

  • Lee, Jeong Hun;Yun, Deokgyu;Choi, Seung Ho
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.41 no.10
    • /
    • pp.1183-1185
    • /
    • 2016
  • Low bit-rate speech coder in digital speech communications synthesizes speech using vocal tract model parameters. In this case, the spectra of the synthesized speech can be much distorted since the allocated bits for the parameters are considerably limited, which results in the degradation of speech intelligibility. In this paper, we propose a speech intelligibility improvement method using stochastic spectral equalizer. This method stochastically obtains the weight vector of each speech coder using spectral ratios between original and synthesized speech, then applies this weight vector to synthesized speech. From the experiments of objective speech intelligibility tests, we found that the performance of the proposed method is better than that of the conventional method.