• Title/Summary/Keyword: codebook

Search Result 346, Processing Time 0.025 seconds

The Research of Reducing the Fixed Codebook Search Time of G.723.1 MP-MLQ (잡음 환경에서의 전송율 감소를 위한 G.723.1 VAD 성능개선에 관한 연구)

  • 김정진;박영호;배명진
    • Proceedings of the IEEK Conference / 2000.06d / pp.98-101 / 2000
  • Among CELP-type vocoders, the G.723.1 6.3 kbps/5.3 kbps dual-rate speech codec, developed for Internet telephony and videoconferencing, uses VAD (Voice Activity Detection) and CNG (Comfort Noise Generator) to reduce the bit rate during silence periods. To reduce the bit rate effectively, this paper first sets a boundary condition on the energy threshold to avoid unnecessary processing time, and then uses three decision rules to detect active frames based on energy, pitch gain, and LSP distance. To evaluate the proposed algorithm, we use silence-inserted speech data at SNRs of 0, 5, 10, and 20 dB. As a result, when the SNR is above 5 dB, the bit rate is reduced by up to about 40% without speech degradation, and the processing time is also reduced.
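
The abstract names an energy boundary check and three decision cues (energy, pitch gain, LSP distance), but gives neither the thresholds nor the way the rules are combined. A minimal sketch in Python, assuming hypothetical threshold values and a simple "any rule fires" combination (both are illustrative assumptions, not the paper's settings), might look like this:

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    energy_db: float     # frame energy in dB
    pitch_gain: float    # pitch (adaptive codebook) gain, roughly 0..1
    lsp_distance: float  # distance of the frame's LSPs from a noise estimate


# All thresholds are hypothetical placeholders, not values from the paper.
ENERGY_FLOOR_DB = 20.0    # below this, declare silence immediately
ENERGY_CEIL_DB = 60.0     # above this, declare speech immediately
ENERGY_THRESH_DB = 35.0
PITCH_GAIN_THRESH = 0.5
LSP_DIST_THRESH = 0.1


def is_active(frame: FrameFeatures) -> bool:
    """Return True if the frame is judged to contain speech."""
    # Boundary condition: clear-cut frames skip the detailed rules.
    if frame.energy_db < ENERGY_FLOOR_DB:
        return False
    if frame.energy_db > ENERGY_CEIL_DB:
        return True
    # Three decision rules; any one of them marks the frame active (assumed).
    return (frame.energy_db > ENERGY_THRESH_DB
            or frame.pitch_gain > PITCH_GAIN_THRESH
            or frame.lsp_distance > LSP_DIST_THRESH)
```

The early returns are what would save processing time in this sketch: frames far below or far above the energy band never reach the pitch-gain and LSP checks.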

Tandemless Transcoding for AMR and EVRC Speech Coders (AMR과 EVRC 음성 부호화기간의 비탠덤 방식을 이용한 상호 부호화)

  • 이선일;유창동
    • The Journal of the Acoustical Society of Korea / v.21 no.6 / pp.531-542 / 2002
  • A novel tandemless transcoding method for the AMR and EVRC speech coders is proposed in this paper. In contrast to the conventional tandem method, the parameters common to CELP-based speech coders are transcoded directly. The proposed algorithm consists of LSP transcoding, pitch delay transcoding, gain transcoding, and fixed codebook vector transcoding. Evaluation results show that the novel algorithm achieves better speech quality than the tandem method while reducing computational complexity and delay.

A Study on the Voice Conversion Algorithm with High Quality (고음질을 갖는 음색변경에 관한 연구)

  • 박형빈;배명진
    • Proceedings of the IEEK Conference / 2000.09a / pp.157-160 / 2000
  • Voice conversion has generally used VQ (Vector Quantization) to partition the spectral features, and has been performed by adding an appropriate offset vector to the source speaker's spectral vector. However, the discrete nature of the transformed parameters prevents the target speaker's varied characteristics from being represented. In this paper, these problems are solved by using LMR (Linear Multivariate Regression) instead of a mapping codebook determined by the relationship between the source and target speakers' vocal tract characteristics. We also propose a method for resolving the discontinuity caused by applying time-aligned parameters, obtained with Dynamic Time Warping, to time- or pitch-scale-modified speech. To overcome these transitional discontinuities, the proposed algorithm first leaves the time and pitch scales unchanged and uses LMR to change the speaker's vocal tract characteristics in the unmodified speech. Compared with existing methods based on VQ and LMR, the proposed algorithm yields much better voice quality.

Video coding using multi-resolution image (다중해상도 영상을 이용한 동영상 압축)

  • 배성호;박길흠
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.2 / pp.33-42 / 1997
  • In this paper, a video coding method for wavelet-transformed multi-resolution images using variable-block-size motion estimation and multiple codebooks is proposed. In the proposed method, the accuracy of motion estimation is increased by a variable block matching algorithm based on the edge type of each block, which is classified according to the magnitude of the wavelet coefficients in the vertical and horizontal subbands of the highest layer. We also increase the flexibility of bit allocation and decrease the vector quantization error in transmitting the motion-compensated error by using the importance of each subband. Experimental results confirm that the proposed method produces fine reconstructed images without blocking effects at low bit rates, and reconstructs especially well the edges to which human eyes are sensitive.

Vector Quantization using Genetic Algorithm (유전자 알고리즘을 이용한 벡터 양자화)

  • 임현택
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.197-200 / 1998
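  • This paper proposes a method for performing vector quantization (VQ) using a genetic algorithm. When a codebook is generated by vector quantization, a quantization error always arises between the codebook and the training vectors, and codebooks generated with the conventional K-means algorithm are limited in how far this error can be reduced. The genetic-algorithm-based vector quantization proposed here was studied in order to reduce this quantization error. To evaluate the proposed method, a codebook was generated for speech data using the Minimax method, one of the ways of selecting cluster centers in the conventional K-means algorithm, and its quantization error was compared with that of the proposed method. The comparison showed that the quantization error was reduced.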
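
The abstract does not describe the genetic operators used, so the following is only a rough Python sketch of the general idea: each individual is a whole codebook, and fitness is the quantization error against the training vectors. The function names, elitist selection, uniform crossover, Gaussian mutation, and all parameter values are assumptions made for illustration.

```python
import random


def quantization_error(codebook, vectors):
    """Mean squared distance from each training vector to its nearest codeword."""
    total = 0.0
    for v in vectors:
        total += min(sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook)
    return total / len(vectors)


def ga_codebook(vectors, size, generations=50, population=20, sigma=0.1, seed=0):
    """Search for a low-error codebook with a simple genetic algorithm."""
    rng = random.Random(seed)
    # Each individual is a full codebook, initialised from random training vectors.
    pop = [[list(rng.choice(vectors)) for _ in range(size)]
           for _ in range(population)]
    for _ in range(generations):
        pop.sort(key=lambda cb: quantization_error(cb, vectors))
        survivors = pop[: population // 2]  # elitist selection keeps the best half
        children = []
        while len(survivors) + len(children) < population:
            p1, p2 = rng.sample(survivors, 2)
            # Uniform crossover: each codeword is copied from one of two parents.
            child = [list(rng.choice((c1, c2))) for c1, c2 in zip(p1, p2)]
            # Gaussian mutation perturbs every component slightly.
            for cw in child:
                for i in range(len(cw)):
                    cw[i] += rng.gauss(0.0, sigma)
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda cb: quantization_error(cb, vectors))
```

Because the best half of each generation is carried over unchanged, the best quantization error in the population never increases from one generation to the next. The same `quantization_error` function could be applied to a K-means codebook to make the kind of comparison the abstract reports.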

Wideband Speech Reconstruction Using Modular Neural Networks (모듈화한 신경 회로망을 이용한 광대역 음성 복원)

  • Woo Dong Hun;Ko Charm Han;Kang Hyun Min;Jeong Jin Hee;Kim Yoo Shin;Kim Hyung Soon
    • MALSORI / no.48 / pp.93-105 / 2003
  • Because the telephone channel has band-limited frequency characteristics, speech transmitted over it suffers degraded quality. In this paper, we propose an algorithm that uses neural networks to reconstruct wideband speech from its narrowband version. Although a single neural network is a good tool for direct mapping, it is difficult to train on vast and complicated data. To alleviate this problem, we modularize the neural networks based on appropriate clustering of the acoustic space. We also introduce fuzzy computing to compensate for probable misclassification at the cluster boundaries. In our simulations, the proposed algorithm outperformed both a single neural network and the conventional codebook mapping method in objective and subjective evaluations.

A Study on Single Vowels Recognition using VQ and Multi-layer Perceptron (VQ와 Multi-layer perceptron을 이용한 단모음 인식에 관한 연구)

  • 안태옥;이상훈;김순협
    • The Journal of the Acoustical Society of Korea / v.12 no.1 / pp.55-60 / 1993
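  • This paper studies speaker-independent recognition of single vowels and proposes a speech recognition method based on VQ (Vector Quantization) and an MLP (multi-layer perceptron). The method builds a VQ codebook, uses it to obtain an observation sequence, computes the probability with which each codeword occurs in the data, and uses these values as the input to the neural network. Korean single vowels were chosen as the recognition targets, with 10 male speakers pronouncing each of 8 single vowels 10 times. To assess the effectiveness of the system, it is compared experimentally with recognition by VQ/HMM (hidden Markov model). The experimental results show that, despite the simplicity of the system, its strong learning ability gives VQ with MLP a higher recognition rate than VQ/HMM.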
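
The feature extraction step described above (codebook, observation sequence, per-codeword probabilities) can be sketched in a few lines of Python. The function names and the squared Euclidean distance are assumptions made for illustration. The abstract does not state the distance measure or the codebook size.

```python
def nearest_codeword(codebook, frame):
    """Index of the codeword closest to the frame (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(frame, codebook[k])))


def codeword_probabilities(codebook, frames):
    """Relative frequency of each codeword in the utterance's observation sequence."""
    sequence = [nearest_codeword(codebook, f) for f in frames]  # observation sequence
    counts = [0] * len(codebook)
    for k in sequence:
        counts[k] += 1
    return [c / len(sequence) for c in counts]


codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
frames = [[0.1, 0.0], [0.9, 1.1], [1.0, 1.0], [2.1, 1.9]]
print(codebook and codeword_probabilities(codebook, frames))  # [0.25, 0.5, 0.25]
```

The resulting vector has one entry per codeword regardless of utterance length, which is what makes it usable as a fixed-size MLP input.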

Speech Recognition using MSHMM based on Fuzzy Concept

  • Ann, Tae-Ock
    • The Journal of the Acoustical Society of Korea / v.16 no.2E / pp.55-61 / 1997
  • This paper proposes an MSHMM (Multi-Section Hidden Markov Model) recognition method based on a fuzzy concept for speaker-independent speech recognition. In this method, the training data are divided into several sections, and for each section multi-observation sequences are obtained by assigning probabilities, via a fuzzy rule, to codewords in order of increasing distance from the MSVQ codebook. An HMM is then generated for each section from these multi-observation sequences, and during recognition the word with the highest probability is selected as the recognized word. For comparison, experiments with various conventional recognition methods (DP, MSVQ, DMS, general HMM) are carried out on the same data. The overall experimental results show that the proposed fuzzy-concept MSHMM is superior to the DP method, the MSVQ method, the DMS model, and the general HMM model in recognition rate and computation time, and that its recognition rate, 92.91%, does not decrease as the number of speakers increases.
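
The abstract does not spell out the fuzzy rule. As a stand-in, the Python sketch below gives each frame several observations instead of one: the nearest few codewords, weighted by normalised inverse distance. The function name, the `top` parameter, and the inverse-distance weighting are assumptions chosen for illustration, not the paper's rule.

```python
def fuzzy_observations(codebook, frame, top=3, eps=1e-9):
    """Return the `top` nearest codeword indices with weights that sum to 1."""
    # Sort codewords by squared distance to the frame and keep the closest ones.
    dists = sorted((sum((a - b) ** 2 for a, b in zip(frame, cw)), k)
                   for k, cw in enumerate(codebook))[:top]
    # Closer codewords get larger weights; eps avoids division by zero.
    inv = [1.0 / (d + eps) for d, _ in dists]
    total = sum(inv)
    return [(k, w / total) for (_, k), w in zip(dists, inv)]
```

With `top=1` this reduces to ordinary hard vector quantization, where each frame maps to a single codeword with weight 1.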

Color Image Vector Quantization Using Enhanced SOM Algorithm

  • Kim, Kwang-Baek
    • Journal of Korea Multimedia Society / v.7 no.12 / pp.1737-1744 / 2004
  • Among the compression methods widely used today, image compression by VQ, an image coding technique, is the most popular and shows a high data compression ratio. Almost all VQ-based methods use the LBG algorithm, which reads the entire image several times and moves the code vectors toward their optimal positions at each step. This complexity makes the algorithm take a considerable amount of time to execute. To overcome this time cost, we propose an enhanced self-organizing neural network for color images. In this study, we improved the competitive learning method by employing three methods for codebook generation. The results demonstrated that the proposed method improved the compression ratio compared with the conventional SOM neural network.
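
The three enhancements are not named in the abstract, so the Python sketch below shows only the plain winner-take-all competitive learning that such methods start from: each pixel pulls its nearest code vector a small step toward itself. The function name, learning rate, and epoch count are hypothetical.

```python
import random


def train_codebook(pixels, size, epochs=5, lr=0.1, seed=0):
    """Learn `size` color code vectors by online competitive learning."""
    rng = random.Random(seed)
    # Start from randomly chosen pixels so every code vector is a valid color.
    codebook = [list(p) for p in rng.sample(pixels, size)]
    for _ in range(epochs):
        for p in pixels:
            # Winner: the code vector closest to this pixel.
            w = min(codebook,
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, c)))
            # Move only the winner a fraction `lr` of the way toward the pixel.
            for i in range(len(w)):
                w[i] += lr * (p[i] - w[i])
    return codebook
```

Unlike a batch method such as LBG, this update touches one code vector per pixel and needs no per-iteration recomputation of cluster centroids, which is the kind of cost saving the abstract is motivated by.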

A zeroblock coding algorithm for subband image compression

  • Park, Sahng-Ho
    • The Journal of Korean Institute of Communications and Information Sciences / v.22 no.11 / pp.2375-2380 / 1997
  • The need for effective coding techniques for various multimedia services is increasing in order to meet the demand for image data. In this paper, a zeroblock coding algorithm is proposed for progressive transmission of images. The zeroblock coding algorithm is constructed as an embedded coder, so that the encoding and decoding process can be terminated at any point while still allowing reasonable image quality. Some features of the zeroblock coding algorithm are 1) coding of subband images by prediction of the insignificance of blocks across subband levels, 2) a set of state transition rules for representing the significance map of blocks, and 3) block coding by vector quantization using a multiband codebook consisting of several subcodebooks dedicated to each subband at a given threshold.
