• Title/Summary/Keyword: Speech rate

Excitation Enhancement Based on a Selective-Band Harmonic Model for Low-Bit-Rate Code-Excited Linear Prediction Coders (저전송률 코드여기 선형 예측 부호화기를 위한 선택적 대역 하모닉 모델 기반 여기신호 개선 알고리즘)

  • Lee, Mi-Suk;Kim, Hong-Kook;Choi, Seung-Ho;Kim, Do-Young
    • Speech Sciences / v.11 no.2 / pp.259-269 / 2004
  • In this paper, we propose a new excitation enhancement technique to improve the speech quality of low bit-rate code-excited linear prediction (CELP) coders. The proposed technique is based on a harmonic model and it is employed only in the decoding process of speech coders without any additional bits. We develop the procedure of harmonic model parameter estimation and harmonic generation, and apply this technique to a current state-of-the-art low bit rate speech coder, ITU-T G.729 Annex D. Also, its performance is measured by using the ITU-T P.862 PESQ score and compared to those of the phase dispersion filter and the long-term postfilter applied to the decoded excitation. It is shown that the proposed excitation enhancement technique can improve the quality of decoded speech and provide better quality for male speech than other techniques.

Very Low Bit Rate Speech Coder of Analysis by Synthesis Structure Using ZINC Function Excitation (ZINC 함수 여기신호를 이용한 분석-합성 구조의 초 저속 음성 부호화기)

  • Seo, Sang-Won;Kim, Young-Jun;Kim, Jong-Hak;Kim, Young-Ju;Lee, In-Sung
    • Proceedings of the IEEK Conference / 2006.06a / pp.349-350 / 2006
  • This paper presents a very low bit rate speech coder, ZFE-CELP (ZINC Function Excitation-Code Excited Linear Prediction). The ZFE-CELP speech codec models the excitation signal with a ZINC function or with CELP, respectively, according to whether a frame is voiced or unvoiced speech. The paper also suggests strategies to improve the speech quality of the very low bit rate speech coder.

Real-time Implementation of Variable Transmission Bit Rate Vocoder Improved Speech Quality in SOLA-B Algorithm & G.729A Vocoder Using on the TMS320C5416 (TMS320C5416을 이용한 SOLA-B 알고리즘과 G.729A 보코더의 음질 향상된 가변 전송률 보코더의 실시간 구현)

  • Ham, Myung-Kyu;Bae, Myung-Jin
    • Speech Sciences / v.10 no.3 / pp.241-250 / 2003
  • In this paper, we implemented a variable-rate vocoder in real time on the TMS320C5416 by applying the SOLA-B algorithm to G.729A. With the SOLA-B algorithm, the duration of the speech is reduced at encoding, and the speech is played back at normal speed by extending its duration at decoding. However, simply combining the existing G.729A with the SOLA-B algorithm causes a loss of speech quality, because G.729A does not account for the change in speech length. The proposed method therefore encodes with an LSP quantization table whose structure is modified for speech shortened by the SOLA-B algorithm. The variable-rate vocoder combining G.729A and the SOLA-B algorithm has a maximum complexity of 10.2 MIPS for the encoder and 2.8 MIPS for the decoder at a transmission rate of 8 kbps. It requires 17.3 MIPS for the encoder and 9.9 MIPS for the decoder at 6 kbps, and 18.5 MIPS for the encoder and 11.1 MIPS for the decoder at 4 kbps. The memory used is about 9.7 kwords of program ROM, 4.69 kwords of table ROM, and 5.2 kwords of RAM. The output waveform is shown to be bit-exact with the result of the C simulator. In a MOS test evaluating the speech quality of the real-time variable-rate vocoder, the score was about 3.68 at 4 kbps.

Speech Rate and Pauses in the Speech of Migrant Women from Multicultural Families (다문화가정 이주여성의 발화속도와 쉼)

  • Hwang, Ji-Sung;Lee, Sook-Hyang
    • The Journal of the Acoustical Society of Korea / v.31 no.2 / pp.63-72 / 2012
  • The purpose of this paper is to provide basic data for development of Korean teaching programs for immigrant women from multicultural families through the acoustic analysis of their speech rate and pauses. They showed slower speech rate, longer pause duration, and higher frequency of pauses compared to a Korean women's group. Philippine women, whose residence duration in Korea is relatively longer than that of Vietnamese women, were more similar to Korean women. The slower speech rate of the immigrant women seems to be due to their slower articulation rate and their reading habit of inserting a pause after almost every word in a sentence.

Would Wernicke's Aphasic Speech Vary with Auditory Comprehension Abilities and/or Lesion Loci?

  • Kim, Hyang-Hee;Lee, Young-Mi;Na, Duk-L.;Chung, Chin-Sang;Lee, Kwang-Ho
    • Speech Sciences / v.13 no.1 / pp.69-83 / 2006
  • The speech of Wernicke's aphasia is characterized by errors such as empty speech, jargon, paraphasia, fillers, and others. However, not all of these errors are observed in every patient, presumably because of diverse auditory comprehension (AC) abilities and/or lesion loci. The purpose of this study was, thus, to clarify the speech characteristics of Wernicke's aphasics according to AC level (i.e., better vs. worse) and lesion locus (i.e., Wernicke's area, WA, vs. non-Wernicke's area, NWA). The authors divided 21 Wernicke's aphasic patients into four groups based on their AC levels and lesion loci. The results showed that the four groups differed only in CIU (Correct Information Unit) rate. The groups with better AC ability had higher CIU rates than the groups with worse AC, regardless of lesion locus (i.e., WA or NWA). Therefore, it was concluded that CIU rate, the differentiating speech variable, was most likely related to AC level but not to lesion locus.

A Study on the Real Time Processing Technique of speech Signal (음성신호의 실시간 처리기법에 관한 연구)

  • Lee, Taek-Soo;Rhn, Chang;Kim, Sung-Nak;Rhee, Sang-Burm
    • Proceedings of the KIEE Conference / 1987.07b / pp.1094-1096 / 1987
  • Zero-crossing analysis techniques have been applied to speech recognition. The zero-crossing rate, level-crossing rate, and differentiated zero-crossing rate in the time domain were used to analyze speech signals. Speech samples could be stored in a memory buffer in real time.
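
The three time-domain measures named in this abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so the definitions below are assumptions based on common usage: a crossing is counted whenever two adjacent samples lie on opposite sides of a threshold, and the "differentiated" rate is taken to be the zero-crossing rate of the first difference of the signal. The function names are hypothetical.

```python
import numpy as np

def crossing_rate(signal, level=0.0):
    """Fraction of adjacent sample pairs that lie on opposite sides of `level`.

    With level=0.0 this is the zero-crossing rate; any other value gives a
    level-crossing rate. Samples exactly equal to `level` are treated as
    "not above" it (an assumption of this sketch).
    """
    x = np.asarray(signal, dtype=float)
    if x.size < 2:
        return 0.0
    above = x > level
    return float(np.mean(above[1:] != above[:-1]))

def differentiated_zero_crossing_rate(signal):
    """Zero-crossing rate of the first difference of the signal (assumed definition)."""
    return crossing_rate(np.diff(np.asarray(signal, dtype=float)), level=0.0)
```

For example, `crossing_rate([1, -1, 1, -1])` counts a crossing between every pair of neighbours, while a monotonically rising signal has no zero crossings in either the signal or its first difference.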

Prosody Control of the Synthetic Speech using Sampling Rate Conversion (표본화율 변환을 이용한 합성음의 운율제어)

  • 이현구;홍광석
    • Proceedings of the IEEK Conference / 1999.11a / pp.676-679 / 1999
  • In this paper, we present a method to control the prosody of synthetic speech using a sampling rate conversion technique. Conventional methods of prosody control perform overlap-and-add, so the synthetic speech is distorted and the voice quality is unsatisfactory. Using the sampling rate conversion technique, we can obtain high-quality synthetic speech. We can also control various talking speeds according to the speaker's patterns.

Study on the Improvement of Speech Recognizer by Using Time Scale Modification (시간축 변환을 이용한 음성 인식기의 성능 향상에 관한 연구)

  • 이기승
    • The Journal of the Acoustical Society of Korea / v.23 no.6 / pp.462-472 / 2004
  • In this paper, a method is proposed for compensating for the performance degradation of automatic speech recognition (ASR) that is mainly caused by speaking rate variation. Before the new method is proposed, a quantitative analysis of the performance of an HMM-based ASR system according to speaking rate is first performed. From this analysis, significant performance degradation was often observed for rapidly spoken speech signals. A quantitative measure is then introduced that is able to represent speaking rate. Time scale modification (TSM) is employed to compensate for the speaking rate difference between input speech signals and training speech signals. Finally, a method for compensating for the performance degradation caused by speaking rate variation is proposed, in which TSM is selectively employed according to speaking rate. ASR experiments on 10-digit mobile phone numbers confirmed that the error rate was reduced by 15.5% when the proposed method was applied to speech signals with a high speaking rate.

Hybrid Companding Delta Modulation with Silence Detection (묵음 검출 기능을 사용한 하이브리드 압신 델타 변조기)

  • 조동호;은종관
    • Journal of the Korean Institute of Telematics and Electronics / v.19 no.6 / pp.84-90 / 1982
  • In this paper, we exploit the intermittent property of speech to reduce the transmission rate or to increase the signal-to-quantization noise ratio (SQNR) in coding speech by hybrid companding delta modulation (HCDM). In this scheme, we detect silence in speech with a speech/silence discriminator. HCDM coding is done only for the speech portion. For silence, which is detected in every block of 5 ms, only the information indicating that the block is silent is transmitted. Since the HCDM coder transmits a binary signal synchronously at a fixed rate, the use of a buffer and its efficient control is essential. By using HCDM with silence detection in coding speech, we could improve SQNR by as much as 6 dB over the conventional HCDM or reduce the transmission rate by one third of the HCDM rate.

The realization of English rhythm by Busan Korean speakers

  • Choe, Wook Kyung
    • Phonetics and Speech Sciences / v.11 no.4 / pp.81-87 / 2019
  • The purpose of the current study is to investigate the realization of speech rhythm in English as spoken by Korean learners of English. The study particularly aims to examine the rhythm metrics of English read speech by learners who speak Busan or the South Kyungsang dialect of Korean. Twenty-four learners whose L1 is Busan Korean and eight native speakers of English read a passage wherein five sentences were segmented and labeled as vocalic and intervocalic intervals. Various rhythm metrics such as %V, Varcos, and Pairwise Variability Indexes (PVIs) were calculated. The results show that Korean learners read English sentences with significantly more vocalic and consonantal intervals at a slower speech rate than native English speakers. The analyses of rhythm metrics revealed that when the speech rate was not normalized, Korean learners' English showed more variability in the length of consonantal and vocalic intervals. However, speech-rate-normalized rhythm metrics for vocalic intervals indicated that Korean learners transferred their L1 rhythmic structures (a syllable-timed language) into their L2 speech (a stress-timed language). Overall, the results suggest that Korean learners' English reflects the rhythmic characteristics of their L1. The effect of the learners' L1 dialect on the realization of L2 speech rhythm is also speculated.
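
The rhythm metrics named in this abstract can be sketched from lists of interval durations. The abstract does not state the exact formulas used, so the definitions below are assumptions that follow commonly cited forms: %V as the vocalic share of total duration, Varco as the standard deviation divided by the mean (times 100), and raw and normalized PVIs over successive interval pairs. The use of the population standard deviation and all function names are choices of this sketch.

```python
from statistics import mean, pstdev

def percent_v(vocalic, intervocalic):
    """%V: share of total duration taken up by vocalic intervals, in percent."""
    total = sum(vocalic) + sum(intervocalic)
    return 100.0 * sum(vocalic) / total

def varco(durations):
    """Varco: standard deviation divided by the mean, times 100.

    Dividing by the mean makes the measure insensitive to overall speech rate.
    The population standard deviation is an assumption of this sketch.
    """
    return 100.0 * pstdev(durations) / mean(durations)

def raw_pvi(durations):
    """rPVI: mean absolute difference between successive intervals (not rate-normalized)."""
    return mean(abs(a - b) for a, b in zip(durations, durations[1:]))

def normalized_pvi(durations):
    """nPVI: each successive difference is divided by the pair's mean duration, times 100."""
    return 100.0 * mean(
        abs(a - b) / ((a + b) / 2.0) for a, b in zip(durations, durations[1:])
    )
```

Scaling every duration by the same factor, as a uniformly slower speaker would, changes `raw_pvi` but leaves `varco` and `normalized_pvi` unchanged, which is the sense in which the latter two are speech-rate-normalized.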