• Title/Summary/Keyword: Speech rate

Search results: 1,242 items (processing time 0.034 s)

Excitation Enhancement Based on a Selective-Band Harmonic Model for Low-Bit-Rate Code-Excited Linear Prediction Coders

  • 이미숙;김홍국;최승호;김도영
    • Speech Sciences
    • /
    • Vol. 11, No. 2
    • /
    • pp.259-269
    • /
    • 2004
  • In this paper, we propose a new excitation enhancement technique to improve the speech quality of low bit-rate code-excited linear prediction (CELP) coders. The proposed technique is based on a harmonic model and it is employed only in the decoding process of speech coders without any additional bits. We develop the procedure of harmonic model parameter estimation and harmonic generation, and apply this technique to a current state-of-the-art low bit rate speech coder, ITU-T G.729 Annex D. Also, its performance is measured by using the ITU-T P.862 PESQ score and compared to those of the phase dispersion filter and the long-term postfilter applied to the decoded excitation. It is shown that the proposed excitation enhancement technique can improve the quality of decoded speech and provide better quality for male speech than other techniques.


Very Low Bit Rate Speech Coder of Analysis by Synthesis Structure Using ZINC Function Excitation

  • 서상원;김영준;김종학;김영주;이인성
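    • The Institute of Electronics Engineers of Korea: Conference Proceedings
    • /
    • The Institute of Electronics Engineers of Korea 2006 Summer Conference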
    • /
    • pp.349-350
    • /
    • 2006
  • This paper presents a very low bit rate speech coder, ZFE-CELP (ZINC Function Excitation-Code Excited Linear Prediction). The ZFE-CELP codec models the excitation signal with a ZINC function or with CELP, respectively, according to whether the frame is voiced or unvoiced. The paper also suggests strategies to improve the speech quality of the very low bit rate speech coder.


Real-time Implementation of Variable Transmission Bit Rate Vocoder Improved Speech Quality in SOLA-B Algorithm & G.729A Vocoder Using on the TMS320C5416

  • 함명규;배명진
    • Speech Sciences
    • /
    • Vol. 10, No. 3
    • /
    • pp.241-250
    • /
    • 2003
  • In this paper, we implement a variable-rate vocoder in real time on the TMS320C5416 by applying the SOLA-B algorithm to G.729A. With SOLA-B, the duration of the speech is shortened before encoding and extended again after decoding so that it plays back at normal speed. However, simply combining the existing G.729A with SOLA-B degrades speech quality, because G.729A does not account for the change in speech length. The proposed method therefore encodes the speech with an LSP quantization table whose structure is modified to match speech shortened by SOLA-B. The resulting variable-rate vocoder has a maximum complexity of 10.2 MIPS for the encoder and 2.8 MIPS for the decoder at 8 kbps, 17.3 MIPS and 9.9 MIPS at 6 kbps, and 18.5 MIPS and 11.1 MIPS at 4 kbps. Memory usage is about 9.7 kwords of program ROM, 4.69 kwords of table ROM, and 5.2 kwords of RAM. The output waveform is shown to be bit-exact with the result of the C simulator. In a MOS test of the real-time vocoder, speech quality was rated at about 3.68 at 4 kbps.


Speech Rate and Pauses in the Speech of Migrant Women from Multicultural Families

  • 황지성;이숙향
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 31, No. 2
    • /
    • pp.63-72
    • /
    • 2012
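  • This study aimed to provide baseline data for developing Korean-language education programs for migrant women, through an acoustic analysis of the speech rate and pause characteristics of Vietnamese and Filipino migrant women in multicultural families. Compared with Korean women, the migrant women showed a slower speech rate, longer pause durations, and a higher pause frequency. The Filipino group, who had lived in Korea longer than the Vietnamese group, showed characteristics closer to those of the Korean group. The migrant women's slower speech rate appears to stem from a slower articulation rate and a habit of pausing after nearly every eojeol (word phrase) when reading.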
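
The abstract above distinguishes speech rate from articulation rate. As a minimal sketch of how the two are commonly computed (an assumption about the usual definitions, not the authors' procedure), speech rate divides the syllable count by the total duration including pauses, while articulation rate excludes pause time. The function name, the syllables-per-second unit, and the sample numbers are hypothetical.

```python
def rate_measures(n_syllables, total_dur_s, pause_durs_s):
    """Return common rate and pause measures for one reading.

    Assumed definitions (not taken from the paper):
      speech rate       = syllables / total duration (pauses included)
      articulation rate = syllables / (total duration - pause time)
    """
    pause_time = sum(pause_durs_s)
    n_pauses = len(pause_durs_s)
    return {
        "speech_rate": n_syllables / total_dur_s,
        "articulation_rate": n_syllables / (total_dur_s - pause_time),
        "pause_count": n_pauses,
        "mean_pause_dur": pause_time / n_pauses if n_pauses else 0.0,
    }

# Hypothetical reading: 40 syllables in 12 s with three pauses.
m = rate_measures(40, 12.0, [0.5, 0.5, 1.0])
print(m)
```

With frequent pauses, the speech rate falls further below the articulation rate, which is why the two are reported separately.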

Would Wernicke's Aphasic Speech Vary with Auditory Comprehension Abilities and/or Lesion Loci?

  • Kim, Hyang-Hee;Lee, Young-Mi;Na, Duk-L.;Chung, Chin-Sang;Lee, Kwang-Ho
    • Speech Sciences
    • /
    • Vol. 13, No. 1
    • /
    • pp.69-83
    • /
    • 2006
  • Speech characteristics of Wernicke's aphasia are characterized by such errors as empty speech, jargon, paraphasia, filler and others. However, not all the errors can be observed in each patient presumably due to diverse auditory comprehension (AC) abilities and/or lesion loci. The purpose of this study was, thus, to clarify the speech characteristics of Wernicke's aphasics according to the AC levels (i.e., better vs. worse) and lesion loci (i.e., Wernicke's area, WA vs. non-Wernicke's area, NWA). The authors divided 21 Wernicke's aphasic patients into four patient groups based on their AC levels and the lesion loci. The results showed that the four groups differed only in CIU (Correct Information Unit) rate. The patient groups with a better AC ability had higher CIU rates than the groups with a worse AC regardless of the lesion loci (e.g., WA or NWA). Therefore, it was concluded that CIU rate, the differentiating speech variable, was most likely related to the AC levels, but not to lesion loci.


A Study on the Real Time Processing Technique of speech Signal

  • 이택수;안창;김성락;이상범
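    • The Korean Institute of Electrical Engineers: Conference Proceedings
    • /
    • The Korean Institute of Electrical Engineers 1987 Electrical and Electronic Engineering Conference Proceedings (II)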
    • /
    • pp.1094-1096
    • /
    • 1987
  • Zero-crossing analysis techniques have been applied to speech recognition. The zero-crossing rate, level-crossing rate, and differentiated zero-crossing rate in the time domain were used in analyzing speech signals. Speech samples could be stored in a memory buffer in real time.
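
As a rough illustration of the three time-domain measures named in this abstract, the sketch below counts how often adjacent samples fall on opposite sides of a level. The abstract does not give the paper's exact definitions, so the function names, the threshold `level`, the use of a first difference for the "differentiated" variant, and the test signal are all assumptions.

```python
import math

def crossing_rate(frame, level=0.0):
    """Fraction of adjacent sample pairs on opposite sides of `level`.

    level=0.0 gives the zero-crossing rate; any other value gives a
    level-crossing rate.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:])
        if (a - level) * (b - level) < 0
    )
    return crossings / (len(frame) - 1)

def differentiate(frame):
    """First difference of the frame (assumed form of 'differentiated')."""
    return [b - a for a, b in zip(frame, frame[1:])]

# Hypothetical 20 ms frame: 100 Hz sine at 8 kHz, with a small phase offset
# so that no sample lands exactly on zero.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs + 0.1) for n in range(160)]

zcr = crossing_rate(frame)                  # zero-crossing rate
lcr = crossing_rate(frame, level=0.5)       # level-crossing rate
dzcr = crossing_rate(differentiate(frame))  # differentiated zero-crossing rate
print(zcr, lcr, dzcr)
```

All three measures need only comparisons and a counter per sample, which is what makes them attractive for real-time processing.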


Prosody Control of the Synthetic Speech using Sampling Rate Conversion

  • 이현구;홍광석
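    • The Institute of Electronics Engineers of Korea: Conference Proceedings
    • /
    • The Institute of Electronics Engineers of Korea 1999 Fall Conference Proceedings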
    • /
    • pp.676-679
    • /
    • 1999
  • In this paper, we present a method to control the prosody of synthetic speech using a sampling rate conversion technique. Conventional prosody control methods perform overlap-and-add, which distorts the synthetic speech and yields unsatisfactory voice quality. Using the sampling rate conversion technique, we can obtain high-quality synthetic speech. We can also control the talking speed according to the speaker's patterns.


Study on the Improvement of Speech Recognizer by Using Time Scale Modification

  • 이기승
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 23, No. 6
    • /
    • pp.462-472
    • /
    • 2004
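  • In this paper, we propose a technique to compensate for the degradation in automatic speech recognizer performance caused by variations in speaking rate. Before proposing the new technique, we first quantitatively analyzed how a conventional hidden Markov model-based speech recognizer performs as the speaking rate changes. This analysis revealed significant performance degradation depending on speaking rate, and we introduced a variable that quantifies the speaking rate of a given utterance. To make the speaking rate similar to that of the speech used in training, we applied time-scale modification to the speech signal. Finally, we proposed a technique that applies time-scale modification selectively according to speaking rate, compensating for the degradation in recognition performance caused by speaking-rate variation. In speech recognition experiments using 10-digit mobile phone numbers, the proposed technique reduced the error rate by 15.5% for fast speech.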

Hybrid Companding Delta Modulation with Silence Detection

  • 조동호;은종관
    • Journal of the Institute of Electronics Engineers of Korea
    • /
    • Vol. 19, No. 6
    • /
    • pp.84-90
    • /
    • 1982
  • In this paper, we exploit the intermittent nature of speech to reduce the transmission rate or to increase the signal-to-quantization-noise ratio (SQNR) when coding speech by hybrid companding delta modulation (HCDM). In this scheme, a speech/silence discriminator detects silence in the speech, and HCDM coding is performed only on the speech portions. Speech is examined in blocks of 5 ms; for each block detected as silence, only the information indicating that the block is silent is transmitted, and the receiver uses this information to regenerate the silence. Since the HCDM coder transmits a binary signal synchronously at a fixed rate, the use of a buffer and its efficient control are essential. By using HCDM with silence detection in coding speech, we could improve the SQNR by as much as 6 dB over conventional HCDM, or reduce the transmission rate by about one third.
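
The abstract does not say how the speech/silence discriminator decides. A minimal sketch, assuming a simple mean-energy threshold per 5 ms block (the threshold value, the 8 kHz sampling rate, and the function names are hypothetical, and this is not the authors' discriminator), shows how the blocks that need full coding can be separated from those that only need a silence flag:

```python
def classify_blocks(samples, fs=8000, block_ms=5, threshold=1e-3):
    """Label each block True (speech) or False (silence) by mean energy.

    The energy threshold is an assumed stand-in for the paper's
    speech/silence discriminator, which the abstract does not describe.
    """
    block_len = int(fs * block_ms / 1000)
    flags = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        energy = sum(x * x for x in block) / block_len
        flags.append(energy > threshold)
    return flags

def fraction_coded(flags):
    """Share of blocks needing full coding; the rest send only a silence flag."""
    return sum(flags) / len(flags) if flags else 0.0

# Hypothetical input: 10 ms of silence followed by 5 ms of signal at 8 kHz.
samples = [0.0] * 80 + [0.5] * 40
flags = classify_blocks(samples)
print(flags, fraction_coded(flags))
```

Because the silent blocks produce far fewer bits than the coded ones while the channel runs at a fixed rate, a scheme like this needs the buffering that the abstract mentions.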


The realization of English rhythm by Busan Korean speakers

  • Choe, Wook Kyung
    • Phonetics and Speech Sciences
    • /
    • Vol. 11, No. 4
    • /
    • pp.81-87
    • /
    • 2019
  • The purpose of the current study is to investigate the realization of speech rhythm in English as spoken by Korean learners of English. The study particularly aims to examine the rhythm metrics of English read speech by learners who speak Busan or the South Kyungsang dialect of Korean. Twenty-four learners whose L1 is Busan Korean and eight native speakers of English read a passage wherein five sentences were segmented and labeled as vocalic and intervocalic intervals. Various rhythm metrics such as %V, Varcos, and Pairwise Variability Indexes (PVIs) were calculated. The results show that Korean learners read English sentences with significantly more vocalic and consonantal intervals at a slower speech rate than native English speakers. The analyses of rhythm metrics revealed that when the speech rate was not normalized, Korean learners' English showed more variability in the length of consonantal and vocalic intervals. However, speech-rate-normalized rhythm metrics for vocalic intervals indicated that Korean learners transferred their L1 rhythmic structures (a syllable-timed language) into their L2 speech (a stress-timed language). Overall, the results suggest that Korean learners' English reflects the rhythmic characteristics of their L1. The effect of the learners' L1 dialect on the realization of L2 speech rhythm is also speculated.
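
The rhythm metrics named in this abstract can be sketched from lists of interval durations. The formulas below follow the commonly used definitions of %V, Varco, and the normalized PVI; the abstract does not state which variants the study used, so the choice of population standard deviation, the function names, and the sample durations are assumptions.

```python
from statistics import mean, pstdev

def percent_v(vocalic, consonantal):
    """%V: share of total duration taken up by vocalic intervals."""
    return 100 * sum(vocalic) / (sum(vocalic) + sum(consonantal))

def varco(durations):
    """Varco: standard deviation of durations normalized by their mean.

    Population standard deviation is an assumption; some studies use
    the sample standard deviation instead.
    """
    return 100 * pstdev(durations) / mean(durations)

def npvi(durations):
    """Normalized PVI: mean difference between successive intervals,
    each normalized by the mean of the pair."""
    terms = [abs(a - b) / ((a + b) / 2)
             for a, b in zip(durations, durations[1:])]
    return 100 * mean(terms)

# Hypothetical interval durations in seconds.
vocalic = [0.10, 0.20, 0.10, 0.20]
consonantal = [0.10, 0.10, 0.10, 0.10]
print(percent_v(vocalic, consonantal), varco(vocalic), npvi(vocalic))
```

Varco and the normalized PVI divide by a mean duration, which is what makes them less sensitive to overall speech rate than raw standard deviations or the raw PVI.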