• Title/Summary/Keyword: Speech coding

Search Result 303, Processing Time 0.024 seconds

On a Performance Evaluation of the Pitch Alteration Techniques of speech waveform coding (피치 변경법의 성능평가)

  • Kim, Hong;Bae, Seong-Gyun;Jo, Wang-Rae;Bae, Myung-Jin
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06c
    • /
    • pp.103-106
    • /
    • 1994
  • Generally we are used to apply waveform coding method obtaining the high quality synthesized speech. But we have to solve the problems, memory capacity and pitch alteration, for applying the waveform coding method to speech synthesis by rule. The former problem is conquered by improving the integrated semiconductor technology, but the latter problem remains. In this paper, we compare the methods that have proposed for pitch alteration in our laboratory until now. These methods are not change properties of vocal tract formants and only altered the pitch halving method, 1.14% for cepstrum analysis method, and 2.36% for hamonics compensated with the phase method.

  • PDF

Voice Packet Conversion from 13kbps QCELP to 8kbps QCELP Speech Codecs (13kbps QCELP에서 8kbps QCELP로의 음성 패킷 변환 기술)

  • 박호종;권상철
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.6
    • /
    • pp.71-76
    • /
    • 1999
  • In digital cellular communication systems, tandem coding occurs in communications between mobile phones with different speech codecs, resulting in poor voice quality, high computational load, and long transmission delay. In this paper, voice packet conversion technique is proposed to solve the tandem coding problems, and packet conversion algorithm from 13kbps QCELP to 8kbps QCELP is developed. Simulations using various speech data show that the proposed packet conversion method produces voice quality which is equivalent to that by the conventional tandem coding method with shorter transmission delay using about 33% computational load.

  • PDF

An ACLMS-MPC Coding Method Integrated with ACFBD-MPC and LMS-MPC at 8kbps bit rate. (8kbps 비트율을 갖는 ACFBD-MPC와 LMS-MPC를 통합한 ACLMS-MPC 부호화 방식)

  • Lee, See-woo
    • Journal of Internet Computing and Services
    • /
    • v.19 no.6
    • /
    • pp.1-7
    • /
    • 2018
  • This paper present an 8kbps ACLMS-MPC(Amplitude Compensation and Least Mean Square - Multi Pulse Coding) coding method integrated with ACFBD-MPC(Amplitude Compensation Frequency Band Division - Multi Pulse Coding) and LMS-MPC(Least Mean Square - Multi Pulse Coding) used V/UV/S(Voiced / Unvoiced / Silence) switching, compensation in a multi-pulses each pitch interval and Unvoiced approximate-synthesis by using specific frequency in order to reduce distortion of synthesis waveform. In integrating several methods, it is important to adjust the bit rate of voiced and unvoiced sound source to 8kbps while reducing the distortion of the speech waveform. In adjusting the bit rate of voiced and unvoiced sound source to 8 kbps, the speech waveform can be synthesized efficiently by restoring the individual pitch intervals using multi pulse in the representative interval. I was implemented that the ACLMS-MPC method and evaluate the SNR of APC-LMS in coding condition in 8kbps. As a result, SNR of ACLMS-MPC was 15.0dB for female voice and 14.3dB for male voice respectively. Therefore, I found that ACLMS-MPC was improved by 0.3dB~1.8dB for male voice and 0.3dB~1.6dB for female voice compared to existing MPC, ACFBD-MPC and LMS-MPC. These methods are expected to be applied to a method of speech coding using sound source in a low bit rate such as a cellular phone or internet phone. In the future, I will study the evaluation of the sound quality of 6.9kbps speech coding method that simultaneously compensation the amplitude and position of multi-pulse source.

Variable Quad Rate ADPCM for Efficient Speech Transmission and Real Time Implementation on DSP (효율적인 음성신호의 전송을 위한 4배속 가변 변환율 ADPCM기법 및 DSP를 이용한 실시간 구현)

  • 한경호
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.18 no.1
    • /
    • pp.129-136
    • /
    • 2004
  • In this paper, we proposed quad variable rates ADPCM coding method for efficient speech transmission and real time porcessing is implemented on TMS320C6711-DSP. The modified ADPCM with four variable coding rates, 16[kbps], 24[kbps], 32[kbps] and 40[kbps] are used for speech window samples for good quality speech transmission at a small data bits and real time encoding and decoding is implemented using DSP. ZCR is used to identify the influence of the noise on the speech signal and to decide the rate change threshold. For noise superior signals, low coding rates are applied to minimize data bit and for noise inferior signals, high coding rates are applied to enhance the speech quality. In most speech telecommunications, silent period takes more than half of the signals, speech quality close to 40[kbps] can be obtained at comparabley low data bits and this is shown by simulation and experiments. TMS320C6711-DSK board has 128K flash memory and performance of 1333MIPS and has meets the requirements for real time implementation of proposed coding algorithm.

The Korean Text-to-speech Using Syllable Units (음절 단위를 이용한 한국어 음성 합성)

  • 김병수;윤기선;박성한
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.1
    • /
    • pp.143-150
    • /
    • 1990
  • In this paper, a rule-based method for improving the intelligibility of synthetic speech is proposed. A 12-pole linear prediction coding method is used to model syllable speech signals. A syllable concatenation rule for pause and frame rejection between syllables is developed to improve the naturalness of the synthetic speech. In addition, phonoligical structure transform rule and prosody rule are applied to the synthetic speech by LPC. The illustrative results demonstrate that the synthetic speech obtained by applying these rules has better naturalness than the synthetic speech by LPC.

  • PDF

A Study on 8kbps FBD-MPC Method Considering Low Bit Rate (Low Bit Rate을 고려한 8kbps FBD-MPC 방식에 관한 연구)

  • Lee, See-Woo
    • Journal of Digital Convergence
    • /
    • v.12 no.6
    • /
    • pp.271-276
    • /
    • 2014
  • In a speech coding system using excitation source of voiced and unvoiced, it would be involved a distortion of speech quality in case coexist with a voiced and unvoiced consonants in a frame. In this paper, I propose a method of 8kbps Multi-Pulse Speech Coding(FBD-MPC: Frequency Band Division MPC) by using TSIUVC(Transition Segment Including Unvoiced Consonant) searching, extraction and approximation-synthesis method in a frequency domain. I evaluate the 8kbps MPC and FBD-MPC. As a result, SNRseg of FBD-MPC was improved 0.5dB for female voice and 0.2dB for male voice respectively. Compared to the MPC, SNRseg of FBD-MPC has been improved that I was able to control the distortion of the speech waveform finally. And so, I expect to be able to this method for cellular phone and smart phone using excitation source of low bit rate.

Research on Open Source Encoding Technology for MPEG Unified Speech and Audio Coding (MPEG 통합 음성/오디오 코덱을 위한 오픈 소스 부호화 기술에 관한 연구)

  • Song, Jeongook;Lee, Joonil;Kang, Hong-Goo
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.1
    • /
    • pp.86-96
    • /
    • 2013
  • Unified Speech and Audio Coding (USAC) is the speech/audio codec with the best quality, approved on Final Draft International Standard (FDIS) at MPEG meeting in 2011. Since MPEG conventionally standardizes only the decoder, it is not easy to study on the encoder technologies. Furthermore, Reference Model(RM) shows extremely poor performance. To solve these problems, the open source project(JAME) proposes the methods to make the improved performance of main encoder technologies in USAC. Especially, this paper introduces the encoder modules: the signal classifier for selective operation between two coders, the psychoacoustic model in frequency domain, and window transition technology. Finally, the results of verification test for FDIS and the performance of Common Encoder are appended.

Real-Time Implementation of the EHSX Speech Coder Using a Floating Point DSP (부동 소수점 DSP를 이용한 4kbps EHSX 음성 부호화기의 실시간 구현)

  • 이인성;박동원;김정호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.5
    • /
    • pp.420-427
    • /
    • 2004
  • This paper presents real time implementation of 4kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder that combines the harmonic vector excitation coding with time-separated transition coding. The harmonic vector excitation coding uses the harmonic excitation coding for voiced frames and used the vector excitation coding with the structure of analysis-by-synthesis for unvoiced frames, respectively. For transition frames mixed with voiced and unvoiced signal, we use the time-separated transition coding. In this paper. we present the optimization methods of implementation speech coder on the EMS320C6701/sup (R)/ DSP. To reduce the complex for real-time implementation. we perform the optimization method in algorithm by replacing the complex sinusoidal synthesis method with IFFT. and we apply fully pipelines hand assembly coding after converting it from floating source to fixed source. To generate a more efficient code. we also make use or the available EMS320C6701/sup (R)/ resources such as Fastest67x library and memory organization.

Robust Tree Coding Combined with Harmonic Scaling of Speech at 4.8 Kbps (견실한 배음 축척과 결합된 4.8KBPS 트리 음성부호기)

  • 강상원;이인성;한경호
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.12
    • /
    • pp.1806-1814
    • /
    • 1993
  • Efficient speech coders using tree coding combined with harmonic scaling are designed at the rate of 4.8 kilobitts/sec (kbps). A time domain harmonic scaling algorithm (TDHS) is used to compress input speech by a factor of two. This process allows the tree coder have 1.5 bits/sample for 4.8 kbps in the case of a 6.4 kHz sampling rate. In the backward adaptive tree coder, there are three components of the code generator, including a hybrid adaptive quantizer, a short-term predictor and a pitch predictor. The robustness of the tree coder is achieved by carefully choosing the input of the short term predictor adaptation. Also, inclusion of a smoother in the pitch predictor improves the error performance of tree coder in the noisy channel. Subjectively, tree coding combined with TDHS provides good quality speech at 4.8 kbps.

  • PDF

Design of Wideband Speech Coder Compatible with CS-ACELP (CS-ACELP와 호환성을 갖는 광대역 음성 부호화기 설계)

  • 김동주;이인성
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.4
    • /
    • pp.52-57
    • /
    • 2000
  • In this paper, we designed the 16 Kbps speech coder that has compatibility with CS-ACELP algorithm(G.729). The speech signal is sampled at rate of 16 KHz, divided into two narrowband signal by QMF filterbank, and decimated to rate of 8 KHz. The lower-band signal is encoded by CS-ACELP and the upper-band signal is encoded by Adaptive Transform Coding(ATC) algorithm. At the receiver, two band signals are synthesized by decoder of CS-ACELP and ATC, respectively. The reconstructed output is obtained by passing the QMF synthesis bank. The proposed wideband coder is evaluated with ITU-T G.722 coder through the Mean Opinion Score(MOS) test.

  • PDF