Search | Korea Science

Low Rate Speech Coding Using the Harmonic Coding Combined with CELP Coding (하모닉 코딩과 CELP방법을 이용한 저 전송률 음성 부호화 방법)

김종학;이인성
- The Journal of the Acoustical Society of Korea / v.19 no.3 / pp.26-34 / 2000
In this paper, we propose a 4 kbps speech coder that combines harmonic vector excitation coding with time-separated transition coding. Harmonic vector excitation coding uses harmonic excitation coding in voiced frames and vector excitation coding with an analysis-by-synthesis structure in unvoiced frames. However, this two-mode method is not effective for transition frames, which mix voiced and unvoiced signals, so a method that goes beyond unvoiced/voiced mode coding is needed. We therefore designed a time-separated transition coding method for transition frames, in which a voiced/unvoiced decision algorithm separates the unvoiced and voiced durations within a frame, and harmonic-harmonic excitation coding or vector-harmonic excitation coding is selected depending on the U/V decision of the previous frame. In the decoder, the voiced excitation signals are generated efficiently through the inverse FFT of the harmonic magnitudes, and the unvoiced excitation signals are produced by inverse vector quantization. The reconstructed speech signal is synthesized by the overlap/add method.
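The overlap/add synthesis named at the end of this abstract can be illustrated with a minimal sketch. The abstract does not give the window shape or hop size, so the triangular window, the 50% overlap, and the function names `triangular_window` and `overlap_add` are assumptions made here for illustration only.

```python
def triangular_window(length):
    """Triangular window; copies shifted by length // 2 sum to 1 (assumed shape)."""
    half = length // 2
    return [n / half if n < half else (length - n) / half for n in range(length)]


def overlap_add(frames, hop):
    """Window each decoded frame and sum the frames at intervals of `hop` samples."""
    length = len(frames[0])
    window = triangular_window(length)
    out = [0.0] * (hop * (len(frames) - 1) + length)
    for i, frame in enumerate(frames):
        start = i * hop
        for n in range(length):
            out[start + n] += window[n] * frame[n]
    return out


# Three constant frames of length 8 with a hop of 4 samples (50% overlap).
frames = [[1.0] * 8 for _ in range(3)]
signal = overlap_add(frames, hop=4)
```

With this window, every sample covered by two overlapping frames keeps unit gain, so adjacent frames cross-fade without a level change at the frame boundary.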

Multi Mode Harmonic Transform Coding for Speech and Music

Kim, Jonghark;Shin, Jae-Hyun;Lee, Insung
- The Journal of the Acoustical Society of Korea / v.22 no.3E / pp.101-109 / 2003
A multi-mode harmonic transform coding (MMHTC) scheme for speech and music signals is proposed. It is structured as a linear prediction model driven by harmonic and transform-based excitation. The proposed coder also uses harmonic prediction and an improved quantizer for the excitation signal. To quantize the excitation of music signals efficiently, the modulated lapped transform (MLT) is introduced. In other words, the coder combines time-domain (linear prediction) and frequency-domain techniques to achieve the best perceptual quality. At a bit rate of 4 kbps, the proposed coder showed better speech quality than the 8 kbps QCELP coder.

Real-Time Implementation of the EHSX Speech Coder Using a Floating Point DSP (부동 소수점 DSP를 이용한 4kbps EHSX 음성 부호화기의 실시간 구현)

이인성;박동원;김정호
- The Journal of the Acoustical Society of Korea / v.23 no.5 / pp.420-427 / 2004
This paper presents a real-time implementation of the 4 kbps EHSX (Enhanced Harmonic Stochastic Excitation) speech coder, which combines harmonic vector excitation coding with time-separated transition coding. Harmonic vector excitation coding uses harmonic excitation coding for voiced frames and vector excitation coding with an analysis-by-synthesis structure for unvoiced frames. For transition frames, which mix voiced and unvoiced signals, we use time-separated transition coding. We present methods for optimizing the implementation of the speech coder on the TMS320C6701® DSP. To reduce the complexity for real-time operation, we optimize the algorithm by replacing the complex sinusoidal synthesis method with an IFFT, and we apply fully pipelined hand-assembly coding after converting the source from floating point to fixed point. To generate more efficient code, we also make use of the available TMS320C6701® resources, such as the Fastest67x library and memory organization.
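The substitution of an inverse transform for sinusoidal synthesis can be checked numerically. The sketch below is not the coder's implementation: it assumes each harmonic falls exactly on a DFT bin, strictly below the Nyquist bin, and it uses a naive O(N²) inverse DFT where a real-time coder would use an FFT. The names `synth_direct` and `synth_inverse_dft` are hypothetical.

```python
import cmath
import math


def synth_direct(mags, phases, k0, n_samples):
    """Sum of cosines; harmonic h is assumed to sit on DFT bin h * k0."""
    out = []
    for n in range(n_samples):
        s = 0.0
        for h, (a, p) in enumerate(zip(mags, phases), start=1):
            s += a * math.cos(2 * math.pi * h * k0 * n / n_samples + p)
        out.append(s)
    return out


def synth_inverse_dft(mags, phases, k0, n_samples):
    """Place each harmonic in a conjugate-symmetric spectrum, then invert it."""
    spectrum = [0j] * n_samples
    for h, (a, p) in enumerate(zip(mags, phases), start=1):
        k = h * k0
        half = 0.5 * a * n_samples * cmath.exp(1j * p)
        spectrum[k] += half               # positive-frequency bin
        spectrum[-k] += half.conjugate()  # mirrored negative-frequency bin
    # Naive inverse DFT for clarity; an FFT computes the same values faster.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_samples)
             for k in range(n_samples)) / n_samples).real
        for n in range(n_samples)
    ]


mags = [1.0, 0.5, 0.25]
phases = [0.0, 0.3, -1.0]
direct = synth_direct(mags, phases, k0=2, n_samples=32)
via_idft = synth_inverse_dft(mags, phases, k0=2, n_samples=32)
```

Both routes produce the same samples. The saving in practice comes from the FFT's O(N log N) cost replacing one cosine evaluation per harmonic per sample.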

A Voice/Unvoice Decomposition in Noisy Background (이중 여진 음성모델을 이용한 음질개선)

유창동
- Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.175-178 / 1998
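One approach to speech enhancement applies the dual-excitation speech model. This method decomposes speech into voiced and unvoiced components and exploits the distinct properties of each component to remove the wideband noise that degrades speech quality. In an informal comparison between a speech enhancement system based on the dual-excitation speech model and the conventional spectral subtraction algorithm, the dual-excitation method performed better.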
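For context, the spectral subtraction baseline mentioned in this abstract can be sketched in its simplest magnitude-domain form. This illustrates only the baseline, not the dual-excitation method. The spectral floor and the function name `spectral_subtraction` are assumptions introduced here.

```python
def spectral_subtraction(noisy_mag, noise_mag, floor=0.01):
    """Subtract a noise magnitude estimate per bin, clamped to a small floor.

    `floor` (an assumed parameter) keeps a fraction of the noisy magnitude
    so that no bin becomes negative after subtraction.
    """
    return [max(y - d, floor * y) for y, d in zip(noisy_mag, noise_mag)]


# Per-bin magnitudes of one noisy frame and of the estimated noise.
cleaned = spectral_subtraction([1.0, 0.5, 0.2], [0.3, 0.6, 0.1])
```

The second bin shows the clamp at work: the noise estimate exceeds the observed magnitude, so the output falls back to the floor rather than going negative.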

Spectrum Based Excitation Extraction for HMM Based Speech Synthesis System (스펙트럼 기반 여기신호 추출을 통한 HMM기반 음성합성기의 음질 개선 방법)

Lee, Bong-Jin;Kim, Seong-Woo;Baek, Soon-Ho;Kim, Jong-Jin;Kang, Hong-Goo
- The Journal of the Acoustical Society of Korea / v.29 no.1 / pp.82-90 / 2010
This paper proposes an efficient method to enhance the quality of synthesized speech in an HMM-based speech synthesis system. The proposed method trains spectral parameters and excitation signals using a Gaussian mixture model and estimates appropriate excitation signals from the spectral parameters during the synthesis stage. Both WB-PESQ and MUSHRA results show that the proposed method provides better speech quality than a conventional HMM-based speech synthesis system.
https://doi.org/10.7776/ASK.2010.29.1.082

Design of a Low Bit-rate Speech Coder Based on Mixed Multi-band Excitation Model (혼합 다중대역 여기모델에 기반한 저 전송률 음성 부호화기의 설계)

한우진;오영환
- The Journal of the Acoustical Society of Korea / v.21 no.6 / pp.510-521 / 2002
The MBE (multi-band excitation) coder can achieve high-quality synthetic speech below 4.0 kbps. There are, however, significant differences in fine structure between the original spectrum and the synthetic spectrum. They are mainly due to the exclusive partition of voiced and unvoiced regions in the frequency domain and to a decision procedure based on an experimental threshold. This paper proposes the MMBE (mixed multi-band excitation) speech model to overcome these drawbacks of the MBE coder. In addition, two analysis methods that do not need any threshold-based decision procedure are presented. In the MMBE speech model, voiced and unvoiced components can be mixed over the entire frequency axis. To illustrate the potential of the proposed speech model, we develop a 2.6 kbps MMBE coder and compare it with a 2.9 kbps MBE coder by both objective and subjective methods. The results show that the proposed coder performs better than the MBE coder even at a lower bit rate.

Enhanced 2.4 kbps Harmonic Stochastic Excitation Coding for Time/Frequency Transitional Speech (시간/주파수 전이신호를 위한 향상된 2.4 kbps 하모닉 스토케스틱 여기 음성 부호화 방법)

김종학;이인성
- The Journal of the Acoustical Society of Korea / v.19 no.7 / pp.53-58 / 2000
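This paper proposes a 2.4 kbps speech coding method that applies a harmonic noise excitation method and a time-separated excitation method to frequency-transitional and time-transitional signals. The mixed excitation coding method uses a harmonic-plus-noise model to represent periodic and aperiodic signals effectively. The noise component of a mixed signal is extracted by cepstral analysis and represented by an AR (autoregressive) model. A further method is proposed to effectively remove the ambiguous speech in time-transition segments. The proposed time-separation method detects the transition point by observing the degree of change in temporal energy, and splits the frame into two blocks of different lengths for analysis. The time-separation method includes an asymmetric window for analysis and a phase synthesis method for synthesis. In subjective listening tests, the 2.4 kbps speech coder using the proposed methods showed improved perceptual quality in transition segments, and it reduces the buzzy, mechanical noise caused by harmonic mismatch with the original speech spectrum.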
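The energy-based time separation described in this abstract can be sketched as follows. The abstract does not specify the detection rule, so the block length, the energy-ratio criterion, and the names `find_transition` and `split_frame` are assumptions chosen for illustration.

```python
def find_transition(frame, block=4):
    """Return the sample index where short-time energy changes most sharply.

    Energy is measured over non-overlapping blocks; the transition is placed
    at the boundary between the two adjacent blocks with the largest energy
    ratio. Returns None if the energy is constant across all blocks.
    """
    energies = [
        sum(x * x for x in frame[i:i + block])
        for i in range(0, len(frame) - block + 1, block)
    ]
    eps = 1e-12  # avoids division by zero on silent blocks
    best_index, best_ratio = None, 1.0
    for b in range(1, len(energies)):
        high = max(energies[b], energies[b - 1]) + eps
        low = min(energies[b], energies[b - 1]) + eps
        if high / low > best_ratio:
            best_index, best_ratio = b * block, high / low
    return best_index


def split_frame(frame, block=4):
    """Split a frame at the detected transition into two unequal blocks."""
    t = find_transition(frame, block)
    if t is None:
        return frame, []
    return frame[:t], frame[t:]


# A 32-sample frame: 12 low-energy samples followed by 20 high-energy samples.
frame = [0.01] * 12 + [1.0] * 20
first, second = split_frame(frame)
```

The two resulting blocks have different lengths (12 and 20 samples here), matching the abstract's description of analyzing two blocks of unequal duration.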

On Improving the Quality of RELP Vocoder (RELP Vocoder의 음질 향상에 관한 연구)

오성근;은종관
- The Journal of the Acoustical Society of Korea / v.5 no.1 / pp.11-16 / 1986
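Among the speech coding schemes known to date, residual-excited linear prediction (RELP) performs best at transmission rates between 4.8 and 9.6 kbits/s. At low transmission rates, however, the RELP coder has the drawback that the synthesized speech sounds rough or carries metallic noise. This paper proposes three methods that compensate for this drawback and improve speech quality. The first is spectral folding using multiple basebands; the second combines spectral folding with pulsed excitation; the third combines spectral folding using multiple basebands with pulsed excitation. These methods substantially improve the quality of the RELP vocoder. With the first and third methods, intended for rates near 9.6 kbits/s, the roughness and tonal noise that are problematic in spectral folding and nonlinear distortion methods are almost imperceptible, and the third method is superior to the first. The second method is suited to rates near 4.8 kbits/s and yields a large quality improvement over existing RELP schemes. When the three proposed methods are compared under the same conditions, the third is the best, and in that case the synthesized speech is nearly identical to the original.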
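Basic spectral folding, the building block these three methods share, regenerates the high band by inserting zeros between the samples of the decimated baseband residual. The sketch below shows only this single-baseband form, not the multiple-baseband or pulsed-excitation variants the paper proposes. The names `spectral_fold` and `dft_magnitudes` are hypothetical.

```python
import cmath
import math


def spectral_fold(baseband, factor):
    """Upsample by zero insertion; the baseband spectrum is imaged upward."""
    out = [0.0] * (len(baseband) * factor)
    out[::factor] = baseband
    return out


def dft_magnitudes(x):
    """Naive DFT magnitude spectrum, used here only to inspect the result."""
    n = len(x)
    return [
        abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)))
        for k in range(n)
    ]


baseband = [1.0, -0.5, 0.25, 0.75]
folded = spectral_fold(baseband, factor=3)
base_spec = dft_magnitudes(baseband)
fold_spec = dft_magnitudes(folded)
```

The spectrum of the zero-inserted signal repeats the baseband spectrum with a period equal to the baseband length, which is how the missing high band gets filled. These exact spectral copies are a plausible source of the tonal artifacts the abstract mentions.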

A Speech Coder using the Simplified Multi-mode Method (단순화된 다중 모드 방법을 이용한 음성 부호화기)

강홍구
- Proceedings of the Acoustical Society of Korea Conference / 1995.06a / pp.146-149 / 1995
This paper proposes an SM-CELP speech coder that applies different excitation signals according to the characteristics of each speech segment at bit rates below 4 kbps. The speech signal is classified into two modes, such as stationary voiced speech and others, using the average energy of the short-time speech and the residual signal after long-term prediction. A structured multi-pulse method is used for the excitation in mode A, and a Gaussian or pulse-like codebook in mode B. The 4.8 kbps DoD-CELP coder is used to evaluate the performance of the proposed coder. The proposed method shows a 1~2 dB higher segmental signal-to-noise ratio and better subjective quality without increasing the computational load.
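The segmental signal-to-noise ratio reported in this abstract is conventionally the mean of per-frame SNR values in dB. The sketch below follows that common definition. The frame length, the rule for skipping degenerate frames, and the name `segmental_snr` are assumptions, since the abstract does not give the measurement details.

```python
import math


def segmental_snr(reference, decoded, frame_len):
    """Average the per-frame SNR (in dB) over non-overlapping frames."""
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal == 0.0 or noise == 0.0:
            continue  # skip silent or perfectly reconstructed frames
        snrs.append(10.0 * math.log10(signal / noise))
    return sum(snrs) / len(snrs) if snrs else float("nan")


# Two frames of four samples: about 20 dB in the first, about 40 dB in the second.
reference = [1.0] * 8
decoded = [0.9] * 4 + [0.99] * 4
score = segmental_snr(reference, decoded, frame_len=4)
```

Averaging in the dB domain weights quiet and loud frames equally, which is why this measure tends to track perceived quality more closely than a single SNR over the whole utterance.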

The Analysis of Vehicle Interior Noise by the Powertrain, and Measurement of Noise Transfer Function using Vibro-Acoustic Reciprocity (파워트레인에 의한 차량 실내 소음 특성 및 전달 함수 측정)

Kim, Sung-Jong;Lee, Sang-Kwon
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.05a / pp.501-506 / 2007
Structure-borne noise is the interior noise that results from low-frequency vibrational energy transmitted through the body and joint parts. The relation between the powertrain excitation and the resulting interior sound must be analyzed in order to identify and predict structure-borne noise. Acoustic source excitation is preferred to mechanical force excitation for measuring the NTF (noise transfer function), because the acoustical method is more convenient and reliable. In this paper, vehicle interior noise caused by the powertrain is analyzed and identified, and the vibro-acoustic transfer function is extracted from experimental measurements. These are important steps of TPA (transfer path analysis) for identifying the effect of powertrain running excitation on interior noise.