• Title/Summary/Keyword: Speech coder

Wideband Speech Coding Algorithm with Application of Wavelet Transform (웨이브렛 변환을 적용한 광대역 음성부호화 알고리즘)

  • 이승원;배건성
    • The Journal of the Acoustical Society of Korea / v.21 no.5 / pp.462-470 / 2002
  • Wideband speech, with a bandwidth of 50 to 7000 Hz, sounds more natural and intelligible and is less tiring to listen to than narrowband speech, which has a bandwidth of 300 to 3400 Hz. Wideband speech coders, however, have not been as successful as narrowband speech coders because of their higher bit rate. In this paper, we propose a new wideband speech coder that combines the European narrowband speech coding standard GSM-EFR with a transform coder based on the discrete wavelet transform. The proposed coder operates as follows: the input speech is first split into two subbands of equal bandwidth, and each subband signal is encoded and decoded by its own subband coder. GSM-EFR is adopted as the lower subband coder, and a coder operating on wavelet-transformed speech is designed as the upper subband coder. The total bit rate of the proposed coder is 18.9 kbps (12.2 kbps for the lower band and 6.7 kbps for the upper band), and informal listening tests show that its speech quality is comparable to that of G.722 at 56 kbps.
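
The two-band structure described in this abstract can be illustrated with a minimal sketch. The abstract does not say which filter bank performs the band split, so the one-level Haar analysis and synthesis pair below is purely an assumption, and the function names are hypothetical. The sketch only shows that an equal-bandwidth split yields two half-rate signals that can be recombined, and that the two quoted subband rates add up to the quoted total.

```python
import math

# Bit rates quoted in the abstract (kbps).
LOWER_BAND_KBPS = 12.2   # GSM-EFR on the lower subband
UPPER_BAND_KBPS = 6.7    # wavelet-based coder on the upper subband
TOTAL_KBPS = LOWER_BAND_KBPS + UPPER_BAND_KBPS

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_split(x):
    """One-level Haar analysis (assumed filter bank, not from the paper).

    Returns (low, high): two subband signals at half the input rate.
    """
    if len(x) % 2:
        raise ValueError("need an even number of samples")
    low = [(x[i] + x[i + 1]) * _S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * _S for i in range(0, len(x), 2)]
    return low, high


def haar_merge(low, high):
    """One-level Haar synthesis: inverse of haar_split."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) * _S)
        out.append((l - h) * _S)
    return out


if __name__ == "__main__":
    frame = [1.0, 2.0, 3.0, 5.0]
    low, high = haar_split(frame)
    print(haar_merge(low, high), round(TOTAL_KBPS, 1))
```

In a real coder each subband would pass through its own encoder and decoder between the split and the merge; here the subbands are passed through unchanged, so the reconstruction is exact up to rounding.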

A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments (네트워크 환경에서 서버용 음성 인식을 위한 MFCC 기반 음성 부호화기 설계)

  • Lee, Gil-Ho;Yoon, Jae-Sam;Oh, Yoo-Rhee;Kim, Hong-Kook
    • MALSORI / no.54 / pp.27-43 / 2005
  • Existing standard speech coders provide high-quality speech communication, but they degrade the performance of speech recognition systems that operate on the reconstructed speech. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized for speech quality rather than for recognition performance. For example, mel-frequency cepstral coefficients (MFCCs) are generally known to give better recognition performance than linear prediction coefficients (LPCs), the typical parameter set in speech coding. In this paper, we propose a speech coder that uses MFCCs instead of LPCs to improve the performance of a server-based speech recognition system in network environments. The main difficulty in using MFCCs is developing an efficient MFCC quantizer at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, we propose a safety-net scheme to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. A PESQ test shows that the proposed coder has speech quality comparable to that of 8 kbps G.729, and speech recognition using the proposed coder performs better than recognition using G.729.
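
The combination of predictive quantization and a safety net can be sketched as follows. This is a generic illustration of the idea, not the quantizer from the paper: the prediction coefficient, step sizes, index range, and function names are all hypothetical, and scalar quantizers stand in for whatever codebooks the authors used. Each frame is coded twice, once as a residual against a prediction from the previous reconstructed frame and once without memory, and the encoder keeps whichever reconstruction is closer. Frames coded in the memoryless mode do not depend on earlier frames, which is what limits error propagation after a channel error.

```python
def quantize(value, step, max_index=8):
    """Uniform scalar quantizer with a limited index range (hypothetical sizes)."""
    index = max(-max_index, min(max_index, round(value / step)))
    return index * step


def encode_frame(frame, prev_recon, rho=0.7, step_pred=0.25, step_direct=2.0):
    """Code one coefficient vector in both modes and keep the better one.

    rho, step_pred and step_direct are illustrative assumptions.
    Returns (mode, reconstruction).
    """
    # Predictive mode: quantize the residual against a first-order prediction.
    pred = [rho * p for p in prev_recon]
    recon_pred = [p + quantize(c - p, step_pred) for c, p in zip(frame, pred)]
    # Safety-net mode: memoryless quantization of the coefficients themselves.
    recon_direct = [quantize(c, step_direct) for c in frame]

    err_pred = sum((c - r) ** 2 for c, r in zip(frame, recon_pred))
    err_direct = sum((c - r) ** 2 for c, r in zip(frame, recon_direct))
    if err_pred <= err_direct:
        return "predictive", recon_pred
    return "safety_net", recon_direct


if __name__ == "__main__":
    print(encode_frame([0.8, 1.5], [1.0, 2.0]))     # slowly varying frame
    print(encode_frame([10.0, -12.0], [0.0, 0.0]))  # abrupt change
```

Both quantizers here use the same number of levels, so the predictive mode has fine resolution over a narrow range and the safety-net mode has coarse resolution over a wide range. That trade-off is why the predictive mode tends to win on slowly varying frames and the safety net on abrupt changes.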

Evaluation Performance of Speech Coder in Speech Signal Processing

  • Lee, Kwang-Seok
    • Journal of information and communication convergence engineering / v.5 no.2 / pp.177-180 / 2007
  • We compared the CS-ACELP speech coder with the QCELP speech coder used in CDMA cellular systems under channel error conditions and evaluated their performance from measured values. We also identified an effective coding scheme for overcoming channel errors. The CS-ACELP speech coder, which uses an LSP vector quantizer, shows transparent speech quality: the SD is 0.92 dB and outlier frames above 2 dB account for 2.9% at a BER of 0.10%. The CS-ACELP speech coder, which uses an MA predictor, gives better SVR and SEGSNR results than the QCELP speech coder (IS-96), which adopts a DPCM-type predictor, when bit errors occur at BERs from 0.01% to 0.50%.
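
The SD and outlier figures quoted in this abstract are conventionally computed as the root-mean-square difference, in dB, between the log power spectra of the original and quantized LPC synthesis filters. The abstract does not give the exact measurement setup, so the sketch below assumes the full band from 0 to the Nyquist frequency, 256 frequency points, and hypothetical function names.

```python
import cmath
import math


def lpc_log_spectrum_db(a, n_points=256):
    """Log power spectrum (dB) of 1/A(z), with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    spectrum = []
    for k in range(n_points):
        w = math.pi * k / n_points  # frequency in radians/sample, 0 up to (not including) pi
        a_of_w = 1.0 + sum(c * cmath.exp(-1j * w * (i + 1)) for i, c in enumerate(a))
        spectrum.append(-20.0 * math.log10(abs(a_of_w)))  # 10*log10(1/|A|^2)
    return spectrum


def spectral_distortion_db(a_ref, a_quant, n_points=256):
    """RMS log-spectral difference (dB) between two LPC filters for one frame."""
    ref = lpc_log_spectrum_db(a_ref, n_points)
    qnt = lpc_log_spectrum_db(a_quant, n_points)
    return math.sqrt(sum((r - q) ** 2 for r, q in zip(ref, qnt)) / n_points)


def outlier_percentage(sd_values, threshold_db=2.0):
    """Percentage of frames whose spectral distortion exceeds threshold_db."""
    return 100.0 * sum(sd > threshold_db for sd in sd_values) / len(sd_values)


if __name__ == "__main__":
    print(spectral_distortion_db([-0.9], [-0.85]))
    print(outlier_percentage([0.5, 1.0, 2.5, 3.0]))
```

Averaging `spectral_distortion_db` over all frames gives a single SD figure of the kind reported above, and `outlier_percentage` gives the share of frames over the 2 dB threshold.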

Complexity Reduction Algorithm of Speech Coder(EVRC) for CDMA Digital Cellular System

  • Min, So-Yeon
    • Journal of Korea Multimedia Society / v.10 no.12 / pp.1551-1558 / 2007
  • The criteria for evaluating a speech coder for mobile telecommunication are, broadly, channel capacity, noise immunity, encryption, complexity, and encoding delay. This study presents an algorithm that reduces the complexity of the speech coder used in CDMA (Code Division Multiple Access) mobile telecommunication systems while keeping its existing advantages of speech quality and low transmission rate. The aim is to reduce the computational complexity by controlling the frequency band nonuniformly during the conversion of LPC (Linear Predictive Coding) coefficients to LSP (Line Spectrum Pairs) parameters in the EVRC (Enhanced Variable-Rate Coder, IS-127) speech coder. Experimental results show that, compared with the existing EVRC speech coder, the coder using the proposed algorithm reduces complexity by 45% on average. The LSP parameter values, the synthesized speech signal, and the spectrogram were the same as those obtained with the existing method.

Design and implementation of a speech coder for CDMA cellular system (CDMA 이동통신 시스템용 음성부호화기 설계 및 구현)

  • 장석진;윤병식;김재원;이원명;윤병우;이인성;최송인;임명섭;한기철
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.10 / pp.72-79 / 1996
  • We developed a speech coder that can transfer data as well as speech for the CDMA digital cellular system. We describe the design of the speech coder, which uses the QCELP algorithm for speech coding. The speech coder is implemented on a single fixed-point DSP chip (TMS320C50). The coder requires 4K words of RAM, 10K words of ROM, and 33 MIPS of processing. The developed speech coder has been fully tested and works successfully on the CDMA base station system.

Real-time Implementation of a GSM-EFR Speech Coder on a 16 Bit Fixed-point DSP (16 비트 고정 소수점 DSP를 이용한 GSM-EFR 음성 부호화기의 실시간 구현)

  • 최민석;변경진;김경수
    • The Journal of the Acoustical Society of Korea / v.19 no.7 / pp.42-47 / 2000
  • This paper describes a real-time implementation of a GSM-EFR (Global System for Mobile communications Enhanced Full Rate) speech coder on the OakDSP core, a 16-bit fixed-point digital signal processor (DSP) by DSP Group, Inc. The real-time implementation requires about 24 MIPS of computation, 7.06K words of code memory, and 12.19K words of data memory. The implemented GSM-EFR speech coder passes all of the test vectors provided by ETSI (European Telecommunications Standards Institute), and perceptual speech quality measurement using the MNB algorithm shows that its quality is similar to that of 32 kbps ADPCM. The GSM-EFR speech coder, which is the highest bit-rate mode of the GSM-AMR speech coder, will be used as the basic structure of the GSM-AMR speech coder to be embedded in the modem ASIC of an IMT-2000 asynchronous-mode mobile station.

A Half Rate Speech Coder using Trellis Excitation (Trellis excitation을 이용한 half rate 음성부호화기)

  • 강상원;이형수;김영수;정진욱
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.2 / pp.88-94 / 1996
  • In this paper, we present a half rate speech coder using trellis excitation. The coder combines a code-excited linear prediction (CELP) system with a trellis quantization method that uses codebook expansion, and it produces higher speech quality than a typical CELP coder at the same transmission rate. A subjective comparison with 3- to 8-bit $\mu$-law PCM indicates that the half rate coder provides speech quality between that of 5-bit and 6-bit $\mu$-law PCM.
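
Reference conditions of the kind named in this abstract, n-bit $\mu$-law PCM, can be generated with a short sketch. The abstract does not state the value of $\mu$ or the quantizer layout, so $\mu = 255$, the continuous companding formula (rather than the piecewise-linear G.711 approximation), a mid-rise uniform quantizer, and the function names are all assumptions.

```python
import math

MU = 255.0  # assumed; the abstract does not give the value of mu


def mu_compress(x, mu=MU):
    """Continuous mu-law compression of x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def mu_law_pcm(x, bits, mu=MU):
    """Quantize x with bits-bit mu-law PCM and return the reconstructed sample."""
    levels = 2 ** bits
    y = mu_compress(max(-1.0, min(1.0, x)), mu)
    # Mid-rise uniform quantizer on the compressed value in [-1, 1].
    index = min(levels - 1, int((y + 1.0) / 2.0 * levels))
    y_hat = (index + 0.5) / levels * 2.0 - 1.0
    return mu_expand(y_hat, mu)


def snr_db(signal, bits):
    """Signal-to-quantization-noise ratio (dB) of bits-bit mu-law PCM."""
    noise = sum((s - mu_law_pcm(s, bits)) ** 2 for s in signal)
    power = sum(s * s for s in signal)
    return 10.0 * math.log10(power / noise)


if __name__ == "__main__":
    tone = [0.5 * math.sin(2.0 * math.pi * k / 97.0) for k in range(970)]
    for bits in range(3, 9):
        print(bits, round(snr_db(tone, bits), 1))
```

Running it on a test tone prints one SNR value per bit depth from 3 to 8. The subjective ranking reported in the abstract comes from listening tests, which an objective SNR figure does not replace.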

Digital Speech Coding Technologies for Wire and Wireless Communication (유무선망에서 사용되는 디지털 음성 부호화 기술 동향)

  • Yoon, Byungsik;Choi, Songin;Kang, Sangwon
    • Journal of Broadcast Engineering / v.10 no.3 / pp.261-269 / 2005
  • Throughout the history of digital communication, the digital speech coder has been used as a speech compression tool. In recent years, speech coders have developed rapidly in mobile communication systems to cope with severe channel errors and limited radio frequency resources. With the development of high-performance communication systems, high-quality speech coders are needed. Such speech coders can be used not only in communication services but also in digital multimedia services. In this paper, we describe the digital speech coding technologies used in wired and wireless communication. We also summarize recent speech coding standards for narrowband and wideband applications. Finally, we introduce the technical trends of next-generation speech coders.

Design of a Low Bit-rate Speech Coder Based on Mixed Multi-band Excitation Model (혼합 다중대역 여기모델에 기반한 저 전송률 음성 부호화기의 설계)

  • 한우진;오영환
    • The Journal of the Acoustical Society of Korea / v.21 no.6 / pp.510-521 / 2002
  • An MBE (multi-band excitation) coder can achieve high-quality synthetic speech below 4.0 kbps. There are, however, significant differences in fine structure between the original spectrum and the synthetic spectrum. They are mainly due to the exclusive partition of voiced and unvoiced regions in the frequency domain and to a decision procedure based on an experimental threshold. This paper proposes an MMBE (mixed multi-band excitation) speech model to overcome the drawbacks of the MBE coder. In addition, two analysis methods that do not need any threshold-based decision procedure are presented. In the MMBE speech model, voiced and unvoiced components can be mixed over the whole frequency axis. To illustrate the potential of the proposed speech model, we develop a 2.6 kbps MMBE coder and compare it with a 2.9 kbps MBE coder using both objective and subjective methods. The results show that the proposed coder performs better than the MBE coder even at a lower bit rate.

Performance Analysis of A Variable Bit Rate Speech Coder (가변 비트율 음성 부호화기의 성능분석)

  • Iem, Byeong-Gwan
    • The Transactions of The Korean Institute of Electrical Engineers / v.62 no.12 / pp.1750-1754 / 2013
  • A variable bit rate speech coder is presented. The coder is based on the observation that, over a short time period, a speech signal can be viewed as a combination of piecewise linear signals. The encoder detects the sample points where the slope of the signal changes, which are called inflection points in this paper. The coder transmits the location and value of each detected inflection sample, but only the location information for the noninflection samples. In the decoder, the noninflection samples are estimated by interpolating the received information. Several factors affecting the performance of the coder have been tested through simulation. Simulation results show that linear interpolation gives a 1 to 5 dB improvement over cubic spline interpolation, and that $\mu$-law companding provides no benefit when applied before inflection detection. With low threshold values in the inflection point detection, the coder shows a better MOS and an SNR improvement of more than 16 dB compared with continuously variable slope delta modulation (CVSDM).
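
The encoder and decoder described in this abstract can be sketched in a few lines. The abstract does not give the exact detection rule or the bitstream format, so the slope-change test against a threshold, the (index, value) pairs, and the function names below are assumptions. The decoder uses linear interpolation, the variant the abstract reports as performing better.

```python
def encode(samples, threshold=0.0):
    """Keep only samples where the slope changes by more than threshold.

    Returns a list of (index, value) pairs; the first and last samples are
    always kept so the decoder can interpolate the whole frame.
    """
    last = len(samples) - 1
    kept = [(0, samples[0])]
    for n in range(1, last):
        slope_change = (samples[n + 1] - samples[n]) - (samples[n] - samples[n - 1])
        if abs(slope_change) > threshold:
            kept.append((n, samples[n]))
    if last > 0:
        kept.append((last, samples[last]))
    return kept


def decode(kept):
    """Rebuild every sample by linear interpolation between kept samples."""
    out = []
    for (n0, v0), (n1, v1) in zip(kept, kept[1:]):
        for n in range(n0, n1):
            out.append(v0 + (v1 - v0) * (n - n0) / (n1 - n0))
    out.append(kept[-1][1])
    return out


if __name__ == "__main__":
    frame = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
    kept = encode(frame)
    print(kept)
    print(decode(kept))
```

Raising `threshold` drops more samples, which lowers the bit rate at the cost of reconstruction error. This is the trade-off behind the abstract's remark about low threshold values.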