• Title/Abstract/Keywords: Speech coding

303 search results (processing time: 0.02 s)

On a Study of the Reduction of Bit Rate by the Preprocessing of PSOLA Coding Technique in the G.723.1 Vocoder

  • 장경아;조성현;배명진
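    • The Institute of Electronics Engineers of Korea: Conference Proceedings / Proceedings of the Institute of Electronics Engineers of Korea 2002 Summer Conference (4) / pp.401-404 / 2002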
  • In general, speech coding methods are classified into three categories: waveform coding, source coding, and hybrid coding. In this paper, the pitch period is first searched using NAMDF similarity and a reference waveform is detected; the similarity between the reference waveform and the waveform of each pitch period is then evaluated. Whether a waveform is compressed is decided by comparing this similarity with a threshold. If the waveform is compressed, only the magnitude and pitch information is passed to the input of the G.723.1 vocoder. After processing by the G.723.1 vocoder, the waveform is restored from the magnitude and pitch information by the PSOLA synthesis method. In simulation, the proposed algorithm reduces the bit rate by 31% relative to the standard 5.3 kbps G.723.1 ACELP vocoder.

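The pitch search described in the abstract above relies on an AMDF-style similarity measure. As a minimal sketch, assuming a plain (unnormalized) average magnitude difference function and a hypothetical lag range of 20 to 147 samples, the pitch period of one frame could be estimated as follows. The function names and the synthetic test signal are illustrative and are not taken from the paper.

```python
import math


def amdf(frame, lag):
    """Average magnitude difference between the frame and itself shifted by `lag`."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n


def estimate_pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the smallest AMDF value."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))


# Synthetic input (assumption): a 100 Hz sine sampled at 8 kHz,
# so one pitch period spans 80 samples.
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs) for n in range(400)]
period = estimate_pitch_period(frame, min_lag=20, max_lag=147)
print(period)
```

A deep AMDF valley marks a lag at which successive pitch periods look alike, which is the kind of similarity a threshold test like the one in the abstract could be applied to.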

Tandemless Transcoding for AMR and EVRC Speech Coders

  • 이선일;유창동
    • The Journal of the Acoustical Society of Korea / Vol. 21, No. 6 / pp.531-542 / 2002
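  • This paper proposes a tandemless transcoding method between the AMR and EVRC speech coders. Unlike conventional tandem transcoding, the proposed method does not decode the speech signal and then re-encode it; instead, it directly converts the parameters that CELP-family speech coders have in common. The transcoding consists of LSP conversion, conversion of the pitch delay and adaptive codebook gain for the adaptive codebook, and conversion of the fixed codebook vector and fixed codebook gain. Objective and subjective evaluations confirmed that the proposed method achieves speech quality at least equal to, and in some cases better than, that of the tandem method, with less computation and a shorter delay.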

Statistical Extraction of Speech Features Using Independent Component Analysis and Its Application to Speaker Identification

  • Jang, Gil-Jin;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea / Vol. 21, No. 4E / pp.156-163 / 2002
  • We apply independent component analysis (ICA) for extracting an optimal basis to the problem of finding efficient features for representing speech signals of a given speaker. The speech segments are assumed to be generated by a linear combination of the basis functions; thus the distribution of speech segments of a speaker is modeled by adapting the basis functions so that each source component is statistically independent. The learned basis functions are oriented and localized in both space and frequency, bearing a resemblance to Gabor wavelets. These features are speaker-dependent characteristics, and to assess their efficiency we performed speaker identification experiments and compared our results with the conventional Fourier basis. Our results show that the proposed method is more efficient than the conventional Fourier-based features in that it obtains a higher speaker identification rate.

Fixed Point Implementation of the QCELP Speech Coder

  • Yoon, Byung-Sik;Kim, Jae-Won;Lee, Won-Myoung;Jang, Seok-Jin;Choi, Song-In;Lim, Myoung-Seon
    • ETRI Journal / Vol. 19, No. 3 / pp.242-258 / 1997
  • The Qualcomm code excited linear prediction (QCELP) speech coder was adopted to increase the capacity of the CDMA Mobile System (CMS). In this paper, we implemented the QCELP speech coding algorithm on the TMS320C50 fixed-point DSP chip. The fixed-point simulation was also carried out in C. The computation complexity of QCELP on the TMS320C50 was 10k words and the data memory was 4k words. In the normal call test on the CMS, where a mobile-to-mobile call test was done in bypass mode without double vocoding, the mean opinion score for the speech quality was 3.11.


Statistical Extraction of Speech Features Using Independent Component Analysis and Its Application to Speaker Identification

  • 장길진;오영환
    • The Journal of the Acoustical Society of Korea / Vol. 21, No. 4 / pp.156-156 / 2002
  • We apply independent component analysis (ICA) for extracting an optimal basis to the problem of finding efficient features for representing speech signals of a given speaker. The speech segments are assumed to be generated by a linear combination of the basis functions; thus the distribution of speech segments of a speaker is modeled by adapting the basis functions so that each source component is statistically independent. The learned basis functions are oriented and localized in both space and frequency, bearing a resemblance to Gabor wavelets. These features are speaker-dependent characteristics, and to assess their efficiency we performed speaker identification experiments and compared our results with the conventional Fourier basis. Our results show that the proposed method is more efficient than the conventional Fourier-based features in that it obtains a higher speaker identification rate.

Statistical Error Compensation Techniques for Spectral Quantization

  • Choi, Seung-Ho;Kim, Hong-Kook
    • Speech Sciences / Vol. 11, No. 4 / pp.17-28 / 2004
  • In this paper, we propose a statistical approach to improve the performance of spectral quantization in speech coders. The proposed techniques compensate for the distortion in a decoded line spectrum pairs (LSP) vector based on a statistical mapping function between a decoded LSP vector and its corresponding original LSP vector. We first develop two codebook-based probabilistic matching (CBPM) methods based on linear mapping functions, according to different assumptions about the distribution of LSP vectors. In addition, we propose an iterative procedure for the two CBPMs. We apply the proposed techniques to a predictive vector quantizer used for the IS-641 speech coder. The experimental results show that the proposed techniques reduce average spectral distortion by around 0.064 dB.

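The abstract above reports its result as average spectral distortion without defining it. Assuming the commonly used definition (the root-mean-square difference between two log power spectra in dB, averaged over frames), a minimal sketch of the measure might look like this. The function names and the toy spectra are illustrative, and the paper may evaluate the measure over a different frequency range.

```python
import math


def spectral_distortion_db(ref_power, test_power):
    """RMS difference between two sampled power spectra, measured in dB."""
    diffs = [10 * math.log10(p) - 10 * math.log10(q)
             for p, q in zip(ref_power, test_power)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def average_sd_db(ref_frames, test_frames):
    """Mean of the per-frame spectral distortion values."""
    values = [spectral_distortion_db(r, t) for r, t in zip(ref_frames, test_frames)]
    return sum(values) / len(values)


# Toy data (assumption): two frames with two frequency bins each.
# Frame 1 is reproduced exactly; frame 2 is off by a factor of 10 in power.
ref_frames = [[1.0, 1.0], [1.0, 1.0]]
test_frames = [[1.0, 1.0], [10.0, 10.0]]
avg_sd = average_sd_db(ref_frames, test_frames)
print(avg_sd)
```

In this toy case the per-frame values are 0 dB and 10 dB, so the average is 5 dB; a change of 0.064 dB, as quoted in the abstract, is a small shift on this scale.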

A Study on the Comparison of Digital Speech Coding Performance

  • 배철수
    • The Journal of Korean Institute of Communications and Information Sciences / Vol. 17, No. 8 / pp.881-890 / 1992
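  • As a basic study toward building a speech quality evaluation model for use in speech systems and communication networks, this paper addresses several problems that arise in the subjective evaluation of speech coding. To obtain stable objective evaluation values, we compared a number of objective measures with subjective measures and then examined which objective measures best match the subjective evaluation values.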


The study on the information compression by coding method and its performance

  • 안동순
    • The Acoustical Society of Korea: Conference Proceedings / Proceedings of the Acoustical Society of Korea 1985 Conference / pp.68-71 / 1985
  • In this paper, the sentence "Sip E Il Ka Gi Seo Ul E Gan Da," spoken by 4 men and 3 [...], is used for the experiment. A/D conversion time is 30 sec. Data are obtained using a microcomputer and compressed by ADPCM. The compression rate is 1/8. The data compressed by ADPCM are synthesized and compared with the original sound. The speech identification rate is analysed using sound pressure and white noise. ADPCM coding uses 5 bits. With the starting voltage fixed at 2.6 V, it is ascertained that the variable value increases in the initial speech signal and that processing then proceeds with the minimum value "3". The results show that the synthesized sound is almost equal to the original sound. Minimum values cause distortion. A Dummy Head System is used in this experiment.

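The abstract above compresses speech with 5-bit ADPCM but does not give the predictor or the step-size rule. As a generic sketch only, assuming a first-order predictor (the previous reconstructed sample) and a hypothetical step-size rule that widens after large codes and narrows after small ones, an encoder and matching decoder could look like this. All constants here are assumptions and are not the paper's settings.

```python
def _next_step(step, code, levels, min_step=1.0):
    """Shared step-size rule: widen after large codes, narrow after small ones."""
    if abs(code) >= levels // 2:
        return step * 1.5
    if abs(code) <= 1:
        return max(min_step, step * 0.9)
    return step


def adpcm_encode(samples, bits=5, initial_step=4.0):
    """Quantize the prediction error of each sample to a signed `bits`-bit code."""
    levels = 1 << (bits - 1)  # 16 for 5 bits, so codes span -16..15
    predicted, step, codes = 0.0, initial_step, []
    for sample in samples:
        code = round((sample - predicted) / step)
        code = max(-levels, min(levels - 1, code))  # clamp to the code range
        codes.append(code)
        predicted += code * step  # track what the decoder will reconstruct
        step = _next_step(step, code, levels)
    return codes


def adpcm_decode(codes, bits=5, initial_step=4.0):
    """Rebuild the signal by mirroring the encoder's predictor and step rule."""
    levels = 1 << (bits - 1)
    predicted, step, out = 0.0, initial_step, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _next_step(step, code, levels)
    return out


samples = [0, 8, 16, 20, 18, 10, 0]
codes = adpcm_encode(samples)
decoded = adpcm_decode(codes)
print(codes, decoded)
```

Because the decoder applies the same step-size rule to the same codes, it stays in lockstep with the encoder without any side information, which is the property that makes the scheme adaptive at a fixed number of bits per sample.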

Study on Noise Filling algorithm of Unified Speech and Audio Coding

  • 송정욱;강홍구
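    • The Korean Institute of Broadcast and Media Engineers: Conference Proceedings / Proceedings of the Korean Society of Broadcast Engineers 2012 Summer Conference / pp.260-261 / 2012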
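  • This paper proposes a method of setting the noise level according to the degree of audio quality distortion in the encoding process of the Noise Filling tool used in Unified Speech and Audio Coding (USAC). USAC is the latest unified speech/audio codec standardized by the Moving Picture Experts Group (MPEG) and offers the best performance among existing codecs. However, because only the decoder is standardized, audio quality varies with how the encoder is designed. JAME, a project currently under way on an open-source basis, includes several methods for overcoming this quality gap and maximizing the performance of the core encoder technologies used in USAC. Among them, Noise Filling improves perceptual quality in low-bit-rate coding by inserting a constant noise level into spectral components that are not quantized. The proposed Noise Filling encoding method reflects the degree of quality distortion in the current frame, so that noise-like signal components can be encoded more precisely.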

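The basic noise filling idea in the abstract above can be sketched in a few lines. This is a generic illustration, not the USAC bitstream syntax and not the distortion-dependent level selection the paper proposes: spectral coefficients that quantize to zero are replaced with random-sign noise at a single level, and that level is estimated here (as an assumption) from the RMS of the original coefficients that were zeroed. Both function names are hypothetical.

```python
import math
import random


def estimate_noise_level(original, quantized):
    """RMS of the original coefficients whose quantized value is zero (assumed rule)."""
    zeroed = [x for x, q in zip(original, quantized) if q == 0]
    if not zeroed:
        return 0.0
    return math.sqrt(sum(x * x for x in zeroed) / len(zeroed))


def noise_fill(quantized, noise_level, seed=0):
    """Replace zero-valued coefficients with random-sign noise at `noise_level`."""
    rng = random.Random(seed)
    return [q if q != 0 else noise_level * rng.choice((-1.0, 1.0))
            for q in quantized]


original = [3.2, 0.3, -0.4, -2.1, 0.0]
quantized = [3, 0, 0, -2, 0]
level = estimate_noise_level(original, quantized)
filled = noise_fill(quantized, level)
print(level, filled)
```

Only the level has to be transmitted, since the decoder generates the noise itself; a method like the one in the abstract would adjust that level per frame instead of using a fixed rule.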

A FAST METHOD FOR CODEBOOK SEARCH IN VSELP CODING

  • Sung Joo Kim
    • The Acoustical Society of Korea: Conference Proceedings / Proceedings of the Acoustical Society of Korea 1994 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA / pp.943-948 / 1994
  • Vector sum excited linear prediction (VSELP) coding gives high-quality synthetic speech at bit rates as low as 4.8 kbps, but its computational complexity is prohibitive for real-time applications. In this paper, we propose a method to reduce the computations of the VSELP codebook search procedure. The proposed method reduces the search space efficiently before applying every linear combination of the basis vectors to the codebook search procedure. Using heuristics, it decides whether it can fix the combination coefficient of each basis vector, so that the number of combinations decreases. It has been shown that the proposed method retains good quality of synthetic speech and reduces the computations of the codebook search procedure by more than 40% relative to the original search.

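The search-space reduction described in the abstract above can be illustrated with a small sketch. In a vector-sum codebook each codevector is a sum of basis vectors weighted by +1 or -1, so M basis vectors give 2^M candidates, and every coefficient fixed in advance halves the search. The code below assumes the usual correlation-squared-over-energy match criterion and omits the synthesis filtering of a real coder; the `fixed` argument stands in for the paper's heuristic, which the abstract does not specify.

```python
from itertools import product


def vector_sum(basis, signs):
    """Codevector formed as the signed sum of the basis vectors."""
    return [sum(s * v[i] for s, v in zip(signs, basis))
            for i in range(len(basis[0]))]


def search_codebook(basis, target, fixed=None):
    """Try every sign pattern for the coefficients not listed in `fixed`.

    `fixed` maps basis index -> +1 or -1 (hypothetical pre-decided coefficients).
    Returns the best sign pattern and the number of candidates evaluated.
    """
    fixed = fixed or {}
    free = [i for i in range(len(basis)) if i not in fixed]
    best_signs, best_score, tried = None, -1.0, 0
    for combo in product((1, -1), repeat=len(free)):
        signs = [0] * len(basis)
        for i, s in fixed.items():
            signs[i] = s
        for i, s in zip(free, combo):
            signs[i] = s
        code = vector_sum(basis, signs)
        tried += 1
        energy = sum(c * c for c in code)
        if energy == 0:
            continue
        corr = sum(c * t for c, t in zip(code, target))
        score = corr * corr / energy  # match criterion (assumed)
        if score > best_score:
            best_signs, best_score = tuple(signs), score
    return best_signs, tried


basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
target = [1.0, -2.0, 0.5]
full_signs, full_count = search_codebook(basis, target)
reduced_signs, reduced_count = search_codebook(basis, target, fixed={1: -1})
print(full_signs, full_count, reduced_signs, reduced_count)
```

In this toy case fixing one coefficient halves the number of candidates from 8 to 4 and still finds the same sign pattern; whether quality is retained in general depends on how reliably the heuristic picks the coefficients to fix.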