• Title/Summary/Keyword: Speech coding

Search Result 303, Processing Time 0.028 seconds

Channel Coder Implementation and Performance Analysis for Speech Coding: Considering bit Importance of Speech Information-part III (음성 부호기용 채널 부호화기의 구현 및 성능 분석)

  • 강법주;김선영;김상천;김영식
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.4
    • /
    • pp.484-490
    • /
    • 1990
  • In speech coding scheme, because information bits have different error sensitivities over channel errors, the channel coder for combining with speech coding should be realized by the variable coding rate considering the bit importance of speech information bits. In realizing the 4 kbps channel coder for 12kbps speech, this paper have chosen the channel coding method by analyzing the hard-decision post-decoding error rate of RCPC(Rate Compatible Punctured Convolutional) codes and bit error sensitivity of 12 kbps speech. Under the coherent QPSK and Rayleigh fading channel, the performance analysis has showed that 10dB gain was obtained in speech SEGSNR by 4-level uneuqal error protection, which was compared with the caseof no channel coding at 7dB channel SNR.

  • PDF

Implementation of Quad Variable Rates ADPCM Speech CODEC on C6000 DSP considering the Environmental Noise (배경잡음을 고려한 4배 가변 압축률을 갖는 ADPCM의 C6000 DSP 실시간 구현)

  • Kim Dae-Sung;Han Kyong-ho
    • Proceedings of the KIPE Conference
    • /
    • 2002.07a
    • /
    • pp.727-729
    • /
    • 2002
  • In this paper, we proposed quad variable rates ADPCM coding method and its implementation on C6000 DSP, which is modified from the standard ADPCM of ITU G.726 for speech quality improvement considering the environmental noise Four coding rates, 16Kbps, 24Kbps, 32Kbps and 40Kbps are used for speech window samples and the rate decision threshold is decided by the environmental noise level. The object of the proposed method is to reduce the coding rate while retaining the speech quality and the speech quality is considerably close to 40Kbps single rate coder with the coding rate close to 16Kbps single rate coder under the environmental noise. The environmental noise level affects the coding rate and the noise level is calculated per every speech window samples. At high noise level, more samples are coded at higher rates to enhance the quality, but at low noise level, only the big speech signals are coded at higher rates and more speech samples are coded at lower coding rates to reduce the coding rates. The influence of the noise on tile speech signal is considerably high for small signals and the small signal has the higher ZCR (zero crossing rate). The method is simulated in PC and to be implemented on C6000 floating point DSP board in real time operations.

  • PDF

Coding Method of Variable Threshold Dual Rate ADPCM Speech Considering the Background Noise (배경 잡음환경에서 가변 임계값에 의한 Dual Rate ADPCM 음성 부호화 기법)

  • 한경호
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.17 no.6
    • /
    • pp.154-159
    • /
    • 2003
  • In this paper, we proposed variable threshold dual rate ADPCM coding method which adapts two coding rates of the standard ADPCM of ITU G.726 for speech quality improvement at a comparably low coding rates. The ZCR(Zero Crossing Rate) is computed for speecd data and under the noisy environment, noise data dominant region showed higher ZCR and speech data dominant region showed lower ZCR. The speech data with the higher ZCR is encoded by low coding rate for reduced coded data and the speech data with the lower ZCR is encoded by high coding rate for speech quality improvements. For coded data, 2 bits are assigned for low coding rate of 16[Kbps] and 5 bits are is assigned for high coding rate of 40[Kbps]. Through the simulation, the proposed idea is evaluated and shown that the variable dual rate ADPCM coding technique shows the qood speech quality at low coding rate.

Detection and Synthesis of Transition Parts of The Speech Signal

  • Kim, Moo-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.33 no.3C
    • /
    • pp.234-239
    • /
    • 2008
  • For the efficient coding and transmission, the speech signal can be classified into three distinctive classes: voiced, unvoiced, and transition classes. At low bit rate coding below 4 kbit/s, conventional sinusoidal transform coders synthesize speech of high quality for the purely voiced and unvoiced classes, whereas not for the transition class. The transition class including plosive sound and abrupt voiced-onset has the lack of periodicity, thus it is often classified and synthesized as the unvoiced class. In this paper, the efficient algorithm for the transition class detection is proposed, which demonstrates superior detection performance not only for clean speech but for noisy speech. For the detected transition frame, phase information is transmitted instead of magnitude information for speech synthesis. From the listening test, it was shown that the proposed algorithm produces better speech quality than the conventional one.

An Embedded ACELP Speech Coding Based on the AMR-WB Codec

  • Byun, Kyung-Jin;Eo, Ik-Soo;Jeong, Hee-Bum;Hahn, Min-Soo
    • ETRI Journal
    • /
    • v.27 no.2
    • /
    • pp.231-234
    • /
    • 2005
  • This letter proposes a new embedded speech coding structure based on the Adaptive Multi-Rate Wideband (AMR-WB) standard codec. The proposed coding scheme consists of three different bitrates where the two lower bitrates are embedded into the highest one. The embedded bitstream was achieved by modifying the algebraic codebook search procedure adopted for the AMR-WB codec. The proposed method provides the advantage of scalability due to the embedded bitstream, while it inevitably requires some additional computational complexity for obtaining two different code vectors of the higher bitrate modes. Compared to the AMR-WB codec, the embedded coder shows improved speech qualities for two higher bitrate modes with a slightly increased bitrate caused by the decreased coding efficiency of the algebraic codebook.

  • PDF

On a Pitch Alteration Technique by Cepstrum Analysis of Flattened Excitation Spectrum (평탄화된 여기 스펙트럼에서 켑스트럼 피치 변경법에 관한 연구)

  • 조왕래
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.06c
    • /
    • pp.159-162
    • /
    • 1998
  • Speech synthesis coding is classified into three categories: waveform coding, source coding and hybrid coding. To obtain the synthetic speech with high quality, the synthesis by waveform coding is desired. However, it is difficult to apply waveform coding to synthesis by syllable or phoneme unit, because it does not divide the speech into excitation and formant component. Thus it is required to alter the excitation in waveform coding for applying waveform coding to synthesis by rule. In this paper we propose a new pitch alteration method that minimizes the spectrum distortion by using the behavior of cepstrum. This method splits the spectrum of speech signal into excitation spectrum and formant spectrum and transforms the excitation spectrum into cepstrum domain. The pitch of excitation cepstrum is altered by zero insertion or zero deletion and the pitch altered spectrum is reconstructed in spectrum domain. As a result of performance test, the average spectrum distortion was below 2.29%.

  • PDF

Simulation of speech processing and coding strategy for cochlear implants (인공 청각 장치의 음성신호 처리와 자극방법의 시뮬레이션)

  • Kim, Young-Hoon;Park, Kwang-Suk
    • Proceedings of the KOSOMBE Conference
    • /
    • v.1991 no.11
    • /
    • pp.30-33
    • /
    • 1991
  • The object of speech processor for cochlear implants is to deliver speech information to the central nerve system. In this study we have presented the method which simulate speech processing and coding strategy for cochlear implants and simulated two different processing methods to the 12 adults with normal ears. The formant sinusoidal coding was better than the formant pulse coding In the consonant perception test and learning effects.(p < 0.05)

  • PDF

Single-Mode-Based Unified Speech and Audio Coding by Extending the Linear Prediction Domain Coding Mode

  • Beack, Seungkwon;Seong, Jongmo;Lee, Misuk;Lee, Taejin
    • ETRI Journal
    • /
    • v.39 no.3
    • /
    • pp.310-318
    • /
    • 2017
  • Unified speech and audio coding (USAC) is one of the latest coding technologies. It is based on a switchable coding structure, and has demonstrated the highest levels of performance for both speech and music contents. In this paper, we propose an extended version of USAC with a single-mode of operation-which does not require a switching system-by extending the linear prediction-coding mode. The main concept of this extension is the adoption of the advantages of frequency-domain coding schemes, such as windowing and transition control. Subjective test results indicate that the proposed scheme covers speech, music, and mixed streams with adequate levels of performance. The obtained quality levels are comparable with those of USAC.

Variable Rate CELP Coding with Phonetic Segmentation using LPC Vector Quantization (LPC 벡터 양자화를 이용한 가변률 CELP 음성코딩에 관한 연구)

  • 정영호
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06c
    • /
    • pp.205-209
    • /
    • 1994
  • This paper presents a variable rate speech coding method with phonetic segmentation, called for PSVXC. Multiple access techniques that require efficient encoding of speech to achieve capacity improvements are currently emerging in the cellular telephone system. The variable rate speech coder have the reduced average data rate required to transmit conversational speech. Each frame of active speech is classified into one of four phonetic classes. A distinct coding configuration and bit-rate is applied to each category. And also a split vector quantization is used to accurately quantize the LPC information using LSP parameters.

  • PDF

Speech Quality of a Sinusoidal Model Depending on the Number of Sinusoids

  • Seo, Jeong-Wook;Kim, Ki-Hong;Seok, Jong-Won;Bae, Keun-Sung
    • Speech Sciences
    • /
    • v.7 no.1
    • /
    • pp.17-29
    • /
    • 2000
  • The STC(Sinusoidal Transform Coding) is a vocoding technique that uses a sinusoidal speech model to obtain high- quality speech at low data rate. It models and synthesizes the speech signal with fundamental frequency and its harmonic elements in frequency domain. To reduce the data rate, it is necessary to represent the sinusoidal amplitudes and phases with as small number of peaks as possible while maintaining the speech quality. As a basic research to develop a low-rate speech coding algorithm using the sinusoidal model, in this paper, we investigate the speech quality depending on the number of sinusoids. By varying the number of spectral peaks from 5 to 40 speech signals are reconstructed, and then their qualities are evaluated using spectral envelope distortion measure and MOS(Mean Opinion Score). Two approaches are used to obtain the spectral peaks: one is a conventional STFT (Short-Time Fourier Transform), and the other is a multiresolutional analysis method.

  • PDF