• Title/Summary/Keyword: vocoding

On Speech Digitization and Bandwidth Compression Techniques[II]-Vocoding (음성신호의 디지탈화와 대역폭축소의 방법에 관하여 [II]-Vocoding)

  • 은종관
    • Journal of the Korean Institute of Telematics and Electronics / v.15 no.5 / pp.1-6 / 1978
  • This paper is a sequel to the previous paper [1] on speech digitization and bandwidth compression techniques. Several recently developed vocoding techniques are discussed: linear predictive coding (LPC), formant vocoding, residual excited linear prediction (RELP) vocoding, and adaptive predictive coding (APC). Throughout the paper, emphasis is placed on the LPC approach, which is presently the most promising technique in speech compression. In addition, current problems and possible solutions are discussed.

  • PDF

Korean ESL Learners' Perception of English Segments: a Cochlear Implant Simulation Study (인공와우 시뮬레이션에서 나타난 건청인 영어학습자의 영어 말소리 지각)

  • Yim, Ae-Ri;Kim, Dahee;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.6 no.3 / pp.91-99 / 2014
  • Although it is well documented that cochlear implant patients experience hearing difficulties when processing their first language, very little is known about whether, and to what extent, they recognize segments in a second language. This preliminary study examines how Korean learners of English identify English segments in normal-hearing and cochlear implant simulation conditions. Participants heard English vowels and consonants in three conditions: normal hearing, 12-channel noise vocoding with 0 mm spectral shift, and 12-channel noise vocoding with 3 mm spectral shift. The results confirmed that nonnative listeners could also retrieve spectral information from the vocoded speech signal, as they recognized vowel features fairly accurately despite the vocoding. In contrast, the intelligibility of manner and place features of consonants was significantly decreased by vocoding. In addition, spectral shift affected listeners' vowel recognition, probably because information regarding F1 is diminished by spectral shifting. The results suggest that cochlear implant patients and normal-hearing second language learners would experience different patterns of listening errors when processing their second language(s).

On Speech Digitization and Bandwidth Compression Techniques[II]-Vocoding (음성신호의 디지탈화와 대역폭축소의 방법에 관하여[II]-Vocoding)

  • 은종관
    • Journal of the Korean Institute of Telematics and Electronics / v.15 no.6 / pp.1-7 / 1978
  • This paper deals with speech digitization and bandwidth compression techniques, particularly two predictive coding methods, namely adaptive differential pulse code modulation (ADPCM) and adaptive delta modulation (ADM). The principle of a typical adaptive quantizer used in ADPCM is explained and discussed. Three companding methods used in ADM (instantaneous, syllabic, and hybrid companding) are also explained in detail, and their performances are compared. In addition, the performances of ADPCM and ADM as speech coders are compared, and the merits of each coder are discussed.

  • PDF
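
As a rough illustration of the adaptive delta modulation idea mentioned in the abstract above, the following minimal sketch implements a 1-bit coder with instantaneous step-size adaptation. It is not the scheme from the paper: the function names, the multipliers `grow` and `shrink`, and the step limits are all assumptions chosen for illustration. The step grows while successive bits repeat (the coder is lagging behind the input) and shrinks when they alternate (the coder is hunting around the input).

```python
import math


def _next_step(step, bit, prev_bit, step_min, step_max, grow, shrink):
    """Adapt the step size from the last two bits (hypothetical rule)."""
    if prev_bit is None:
        return step
    step = step * grow if bit == prev_bit else step * shrink
    return min(max(step, step_min), step_max)


def adm_encode(samples, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.66):
    """Encode samples to one bit each: 1 means 'go up', 0 means 'go down'."""
    bits = []
    estimate, step, prev_bit = 0.0, step_min, None
    for x in samples:
        bit = 1 if x >= estimate else 0
        step = _next_step(step, bit, prev_bit, step_min, step_max, grow, shrink)
        estimate += step if bit else -step  # track the decoder's state
        bits.append(bit)
        prev_bit = bit
    return bits


def adm_decode(bits, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.66):
    """Rebuild the staircase approximation using the same adaptation rule."""
    out = []
    estimate, step, prev_bit = 0.0, step_min, None
    for bit in bits:
        step = _next_step(step, bit, prev_bit, step_min, step_max, grow, shrink)
        estimate += step if bit else -step
        out.append(estimate)
        prev_bit = bit
    return out


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * n / 64) for n in range(256)]
    rebuilt = adm_decode(adm_encode(tone))
    print(sum(abs(a - b) for a, b in zip(tone, rebuilt)) / len(tone))
```

Because the decoder derives the step size from the bit stream alone, no side information needs to be transmitted. Syllabic or hybrid companding would replace `_next_step` with a slower-acting rule.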

A PERFORMANCE STUDY OF SPEECH CODERS FOR TELEPHONE CONFERENCING IN DIGITAL MOBILE COMMUNICATION NETWORKS

  • Lee, M.S.;Lee, G.C.;Kim, K.C.;Lee, H.S.;Lyu, D.S.;Shin, D.J.;Lee, Hun
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.899-903 / 1994
  • This paper describes two methods for assessing the output speech quality of vocoders for telephone conferencing in digital mobile communication networks. The proposed methods are the sentence discrimination method and the modified degraded mean opinion score (MDMOS) test. We apply these two methods to Qualcomm code excited linear prediction (QCELP), vector sum excited linear prediction (VSELP), and regular pulse excited-long term prediction (RPE-LTP) vocoders to evaluate which vocoding algorithm better processes a mixed voice signal from two speakers for telephone conferencing. The experiments show that the VSELP vocoding algorithm yields output speech quality superior to that of the other two.

  • PDF

ON IMPROVING THE QUALITY OF RELP VOCODER

  • Oh, S.K.
    • Proceedings of the Acoustical Society of Korea Conference / 1985.10a / pp.79-86 / 1985
  • Residual-excited linear prediction (RELP) vocoding is known to be one of the best approaches to speech coding in the range of 4.8 to 9.6 kbits/s. One problem associated with the RELP vocoder is that it often produces some roughness and tonal noise as the transmission rate becomes lower. In this paper, we investigate three methods to improve its quality: the multiband spectral folding method, the method of using both the spectrally folded signal and the pulsed excitation signal, and the method of using both the multiband spectrally folded signal and the pulsed excitation signal. Among the three methods, the last one was found to yield the best performance; it produces no roughness and little tonal noise.

  • PDF
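
The spectral folding named in the abstract above can be sketched in its basic single-band form, which is a simplification and not the multiband variant the paper studies. Inserting zeros between the samples of a decimated baseband residual mirrors the baseband spectrum into the empty upper band. The function names below are hypothetical, and the naive DFT is only there so the mirror property can be checked without external libraries.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def spectral_fold(baseband, factor=2):
    """Upsample by zero insertion; the baseband spectrum repeats and mirrors."""
    out = []
    for sample in baseband:
        out.append(sample)
        out.extend([0.0] * (factor - 1))
    return out


if __name__ == "__main__":
    base = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(16)]
    spectrum = dft(spectral_fold(base))
    # Magnitudes are symmetric about bin 8, a quarter of the new sampling rate.
    for m in range(1, 8):
        print(m, abs(spectrum[8 - m]), abs(spectrum[8 + m]))
```

For a real baseband of length N and a factor of 2, bin k of the 2N-point spectrum equals bin k mod N of the baseband spectrum, so the upper half of the band is a mirror image of the lower half. That harmonic-free mirror image is one commonly cited source of the tonal noise the abstract mentions.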

Speech Quality of a Sinusoidal Model Depending on the Number of Sinusoids

  • Seo, Jeong-Wook;Kim, Ki-Hong;Seok, Jong-Won;Bae, Keun-Sung
    • Speech Sciences / v.7 no.1 / pp.17-29 / 2000
  • Sinusoidal transform coding (STC) is a vocoding technique that uses a sinusoidal speech model to obtain high-quality speech at a low data rate. It models and synthesizes the speech signal from the fundamental frequency and its harmonics in the frequency domain. To reduce the data rate, the sinusoidal amplitudes and phases must be represented with as few peaks as possible while maintaining the speech quality. As basic research toward a low-rate speech coding algorithm using the sinusoidal model, this paper investigates how speech quality depends on the number of sinusoids. Speech signals are reconstructed while varying the number of spectral peaks from 5 to 40, and their quality is evaluated using a spectral envelope distortion measure and the mean opinion score (MOS). Two approaches are used to obtain the spectral peaks: a conventional short-time Fourier transform (STFT) and a multiresolution analysis method.

  • PDF
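
The trade-off described in the abstract above, fewer sinusoids against reconstruction quality, can be sketched as follows. This is a simplified stand-in and not the paper's method: instead of picking peaks from an STFT or a multiresolution analysis, it keeps the strongest DFT bins of a single frame and resynthesizes the frame as a sum of cosines. All function names are hypothetical, and RMS error replaces the spectral envelope distortion and MOS measures used in the paper.

```python
import cmath
import math


def analyze(frame, num_sinusoids):
    """Return (bin, amplitude, phase) for the strongest positive-frequency bins."""
    n = len(frame)
    components = []
    for k in range(1, n // 2):  # DC and Nyquist are ignored in this sketch
        c = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        components.append((k, 2 * abs(c) / n, cmath.phase(c)))
    components.sort(key=lambda comp: comp[1], reverse=True)
    return components[:num_sinusoids]


def synthesize(components, n):
    """Rebuild a frame of length n as a sum of cosines."""
    return [
        sum(a * math.cos(2 * math.pi * k * t / n + ph) for k, a, ph in components)
        for t in range(n)
    ]


def rms_error(a, b):
    """Root-mean-square difference between two equal-length frames."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


if __name__ == "__main__":
    n = 64
    # Toy "voiced" frame: eight harmonics of bin 2 with 1/h amplitudes.
    frame = [
        sum(math.cos(2 * math.pi * 2 * h * t / n + 0.3 * h) / h for h in range(1, 9))
        for t in range(n)
    ]
    for count in (2, 4, 8):
        print(count, rms_error(frame, synthesize(analyze(frame, count), n)))
```

Because DFT bins are orthogonal, each added sinusoid can only lower the RMS error in this toy setting. Perceived quality on real speech need not follow the same curve, which is why the paper relies on listening tests.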

Improvement of Bit Rate applying the Speaking Rate and PSOLA Technique of Speech in CELP Vocoder (음성신호의 발성율과 PSOLA기법을 적용한 음성 보코더 전송률 개선에 관한 연구)

  • 장경아;서지호;배명진
    • Proceedings of the IEEK Conference / 2003.11a / pp.45-48 / 2003
  • In general, speech coding methods are classified into three categories: waveform coding, source coding, and hybrid coding. Speech at a fast speaking rate can be encoded with less information than speech at a slow speaking rate. With respect to speaking rate, the low-frequency band is more important to the listener than the high-frequency band. Speech vocoding techniques are developing toward low bit rate, low complexity, and high sound quality. CELP-type vocoders provide very good sound quality at a low bit rate, but they do not take the speaking rate into account. When the speaking rate is considered and each frame is encoded according to it, the bit rate can be reduced below that of the conventional vocoder. We propose a technique that estimates the speaking rate and applies the PSOLA technique to frames with a slow speaking rate. Simulation results show that the bit rate can be reduced by about 300 bps.

  • PDF

Fixed Point Implementation of the QCELP Speech Coder

  • Yoon, Byung-Sik;Kim, Jae-Won;Lee, Won-Myoung;Jang, Seok-Jin;Choi, Song_in;Lim, Myoung-Seon
    • ETRI Journal / v.19 no.3 / pp.242-258 / 1997
  • The Qualcomm code excited linear prediction (QCELP) speech coder was adopted to increase the capacity of the CDMA Mobile System (CMS). In this paper, we implemented the QCELP speech coding algorithm on the TMS320C50 fixed-point DSP chip. The fixed-point simulation was also carried out in C. The computation complexity of QCELP on the TMS320C50 was 10k words and the data memory was 4k words. In the normal call test on the CMS, where a mobile-to-mobile call test was done in bypass mode without double vocoding, the mean opinion score for speech quality was 3.11.

  • PDF

Implementation of Voice Codec using APC Algorithm for INMARSAT-B (APC(Adaptive Predictive Coder) 알고리즘을 응용한 INMARSAT-B Voice Codec구현)

  • Lee, Chae-Ho;Hwang, Yun-Ho;Kim, Jeong-Hun;Lim, Jong-Kun;Bae, Jung-Chul;Choi, Woo-Jin;Lee, Joon-Tark
    • Proceedings of the KIEE Conference / 1999.07g / pp.3246-3248 / 1999
  • The APC is a coding algorithm whose properties lie between those of waveform coding (e.g., ADPCM) and vocoding (e.g., CELP). It can decode sound of adequate quality at a low computational cost by using a scalar quantizer instead of a vector quantizer. The APC required for the INMARSAT-B voice codec was therefore successfully implemented in full duplex using a TMS320C30 DSP.

  • PDF

A New EGG System Design and Speech Analysis for Quantitative Analysis of Human Glottal Vibration Patterns (성문진동 패턴의 정량적인 해석을 위한 새로운 시스템 설계와 음성분석)

  • 김종찬;이재천;김덕원;오명환;윤대희;차일환
    • Journal of Biomedical Engineering Research / v.20 no.4 / pp.427-433 / 1999
  • The purpose of this study is to develop an improved pitch extraction method that can be used in a variety of speech applications, such as high-quality compression and vocoding, and the recognition and synthesis of speech. To do so, we develop a new electroglottograph (EGG) measurement system based on four modulation-demodulation type spot electrodes for detecting the EGG signals. The glottal closure instant (GCI) is then determined from the EGG signals in real time, and the pitch contour is obtained from the GCI information. The new pitch contour algorithm (PCA) turns out to operate more reliably than the conventional speech-only algorithm. In addition, we study the speech source models and glottal vibratory patterns of Koreans by measuring and analyzing the diverse vibration patterns of the vocal folds from the EGG signals.

  • PDF