• Title/Summary/Keyword: Vector Quantizer


Image Compression Using DCT Map FSVQ and Single - side Distribution Huffman Tree (DCT 맵 FSVQ와 단방향 분포 허프만 트리를 이용한 영상 압축)

  • Cho, Seong-Hwan
    • The Transactions of the Korea Information Processing Society / v.4 no.10 / pp.2615-2628 / 1997
  • In this paper, a new codebook design algorithm is proposed. It uses a DCT map based on the two-dimensional discrete cosine transform (2D DCT) and a finite state vector quantizer (FSVQ) when the vector quantizer is designed for image transmission. We build the map by dividing the input image according to edge quantity; using the map, the significant features of the training image are extracted with the 2D DCT. A master codebook of the FSVQ is generated by partitioning the training set with a binary tree. The state codebook is constructed from the master codebook, and the index of the input image is then searched in the state codebook rather than in the master codebook. Because index coding is an important part of high-speed digital transmission, fixed-length codes are converted to variable-length codes according to the entropy coding rule. Huffman coding assigns transmission codes to the codes of the codebook. This paper proposes a single-side growing Huffman tree to speed up Huffman code generation. Compared with the pairwise nearest neighbor (PNN) and classified VQ (CVQ) algorithms on the Einstein and Bridge images, the new algorithm shows better picture quality, by 2.04 dB and 2.48 dB over PNN and by 1.75 dB and 0.99 dB over CVQ, respectively.
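
The conversion from fixed-length to variable-length index codes described in this abstract can be illustrated with a minimal sketch. It builds a standard Huffman code, not the single-side growing tree the paper proposes, and the index stream and the function name `huffman_code_lengths` are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a standard Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap items are (weight, tie_breaker, {symbol: depth}); the unique
    # tie_breaker ensures the dictionaries are never compared.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


# Hypothetical stream of codebook indices from a 4-entry codebook.
indices = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
lengths = huffman_code_lengths(indices)
variable_bits = sum(lengths[i] for i in indices)
fixed_bits = len(indices) * 2  # 2 bits per index with fixed-length codes
print(lengths, variable_bits, fixed_bits)
```

For this skewed index distribution the variable-length code needs 28 bits where the fixed-length code needs 32, which is the kind of saving entropy coding of indices is meant to provide.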


Speaker-Adaptive Speech Synthesis based on Fuzzy Vector Quantizer Mapping and Neural Networks (퍼지 벡터 양자화기 사상화와 신경망에 의한 화자적응 음성합성)

  • Lee, Jin-Yi;Lee, Gwang-Hyeong
    • The Transactions of the Korea Information Processing Society / v.4 no.1 / pp.149-160 / 1997
  • This paper is concerned with a speaker-adaptive speech synthesis method using a mapped codebook designed by fuzzy mapping on FLVQ (Fuzzy Learning Vector Quantization). FLVQ is used to design both the input and the reference speaker's codebooks. This algorithm incorporates a fuzzy membership function into the LVQ (learning vector quantization) network. Unlike the LVQ algorithm, it minimizes the network output errors, which are the differences between the class membership targets and the actual membership values, and as a result minimizes the distances between training patterns and competing neurons. Speaker adaptation in speech synthesis is performed as follows: the input speaker's codebook is mapped to a reference speaker's codebook in fuzzy concepts. The fuzzy VQ mapping replaces a codevector while preserving its fuzzy membership function. The codevector correspondence histogram is obtained by accumulating the vector correspondence along the DTW optimal path. We use the fuzzy VQ mapping to design a mapped codebook. The mapped codebook is defined as a linear combination of the reference speaker's vectors, using each fuzzy histogram as a weighting function with membership values. In the adaptive speech synthesis stage, input speech is fuzzy vector-quantized by the mapped codebook, and then FCM arithmetic is used to synthesize speech adapted to the input speaker. The speaker adaptation experiments are carried out using speech of males in their thirties as the input speaker's speech and speech of a female in her twenties as the reference speaker's speech. The speech used in the experiments consists of the sentences /anyoung hasim nika/ and /good morning/. As a result of the experiments, we obtained synthesized speech adapted to the input speaker.
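
Two ingredients named in this abstract, fuzzy membership values and a mapped codebook formed as a weighted linear combination of reference codevectors, can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the membership formula is the standard fuzzy c-means one, the normalization of the histogram weights is assumed, and all names and numbers are hypothetical.

```python
def fcm_memberships(x, codebook, m=2.0):
    """Standard fuzzy c-means memberships of vector x to each codevector."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
    if any(d == 0.0 for d in d2):
        # x coincides with a codevector: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 1.0 / (m - 1.0)  # applied to squared distances
    inv = [(1.0 / d) ** exponent for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def mapped_codebook(histogram, reference_codebook):
    """Each mapped codevector is a histogram-weighted mean of reference codevectors.

    histogram[i][j] is the (assumed) correspondence weight between input
    codevector i and reference codevector j.
    """
    dim = len(reference_codebook[0])
    mapped = []
    for weights in histogram:
        total = sum(weights)
        mapped.append([
            sum(w * ref[k] for w, ref in zip(weights, reference_codebook)) / total
            for k in range(dim)
        ])
    return mapped


memberships = fcm_memberships([0.5, 0.0], [[0.0, 0.0], [2.0, 0.0]])
mapped = mapped_codebook([[3.0, 1.0]], [[0.0, 0.0], [4.0, 8.0]])
print(memberships, mapped)
```

The memberships sum to one and favor the closer codevector; the mapped codevector lies between the reference codevectors in proportion to the histogram weights.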


Iterative LBG Clustering for SIMO Channel Identification

  • Daneshgaran, Fred;Laddomada, Massimiliano
    • Journal of Communications and Networks / v.5 no.2 / pp.157-166 / 2003
  • This paper deals with the problem of channel identification for Single Input Multiple Output (SIMO) slow fading channels using clustering algorithms. Due to the intrinsic memory of the discrete-time model of the channel, over short observation periods the received data vectors of the SIMO model are spread in clusters because of the AWGN. Each cluster is practically centered around the ideal (noiseless) channel output labels, and the noisy received vectors are distributed according to a multivariate Gaussian distribution. Starting from the Markov SIMO channel model, simultaneous maximum likelihood estimation of the input vector and the channel coefficients reduces to finding the values of this pair that minimize the sum of the Euclidean norms between the received and the estimated output vectors. The Viterbi algorithm can be used for this purpose, provided the trellis diagram of the Markov model can be labeled with the noiseless channel outputs. The problem of identifying the ideal channel outputs, which is the focus of this paper, is then equivalent to designing a Vector Quantizer (VQ) from a training set corresponding to the observed noisy channel outputs. Linde-Buzo-Gray (LBG)-type clustering algorithms [1] could be used to obtain the noiseless channel output labels from the noisy received vectors. One problem with the use of such algorithms for blind time-varying channel identification is the codebook initialization. This paper looks at two critical issues regarding the use of VQ for channel identification. The first concerns the applicability of this technique in general; we present theoretical results for the conditions under which the technique may be applicable. The second aims at overcoming the codebook initialization problem by proposing a novel approach that attempts to make the first phase of the channel estimation faster than the classical codebook initialization methods. Sample simulation results are provided confirming the effectiveness of the proposed initialization technique.
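
The idea of recovering noiseless output labels as cluster centers can be sketched with the Lloyd iteration that lies at the core of LBG-type algorithms. This is a minimal sketch under assumptions: the initial codebook is simply given (the initialization method proposed in the paper is not reproduced), and the training vectors and function names are hypothetical.

```python
def nearest(x, codebook):
    """Index of the codeword closest to x in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))


def lloyd(training, codebook, iterations=20):
    """Alternate nearest-codeword assignment and centroid update."""
    codebook = [list(c) for c in codebook]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for x in training:
            cells[nearest(x, codebook)].append(x)
        for i, cell in enumerate(cells):
            if cell:  # keep the old codeword if its cell is empty
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


# Hypothetical noisy observations scattered around (-1, -1) and (1, 1).
received = [(-1.1, -0.9), (-0.9, -1.1), (1.1, 0.9), (0.9, 1.1)]
labels = lloyd(received, [(-0.5, 0.0), (0.5, 0.0)])
print(labels)
```

With this symmetric toy data the two codewords settle near (-1, -1) and (1, 1), the assumed noiseless outputs.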

Call Admission Control in ATM by Neural Networks and Fuzzy Pattern Estimator (신경망과 퍼지 패턴 추정기를 이용한 ATM의 호 수락 제어)

  • Lee, Jin-Lee
    • The Transactions of the Korea Information Processing Society / v.6 no.8 / pp.2188-2195 / 1999
  • This paper proposes a new call admission control scheme utilizing an inverse fuzzy vector quantizer (IFVQ) and a neural net (NN), which combines the benefits of IFVQ and the flexibility of FCM (Fuzzy C-Means) arithmetic, to decide whether a requested call that was not trained in the learning phase should be connected or not. The system generates the estimated traffic pattern for the cell stream of a new call using the feasible/infeasible patterns in the codebook, fuzzy membership values that represent the degree to which each codebook pattern matches the input pattern, and FCM arithmetic. The input to the NN is a vector consisting of traffic parameters, which are the means and variances of the number of arriving cells. The decision as to whether to accept or reject a new call depends on comparing the NN output with the decision threshold (+0.5). This method is a new technique for call admission control that uses the membership values as the traffic parameters declared to the CAC at the call setup stage, and it is valid for a very general traffic model in which the calls of a stream can belong to an unlimited number of traffic classes. Simulations show that the suggested method outperforms the conventional NN method.


Image Compression with Edge Directions based on DCT-VQ (DCT-VQ를 기반으로 한 에지의 방향성을 갖는 영상압축)

  • 김진태;김동욱;임한규
    • Journal of Korea Multimedia Society / v.1 no.2 / pp.194-203 / 1998
  • In this paper, a new DCT-VQ method is proposed which can solve the problems of VQ such as edge degradation and enormous computation. VQ is carried out in the DCT domain rather than the spatial domain in order to prevent edge degradation. The DCT decorrelates highly correlated image data and concentrates the energy in a few coefficients. In the DCT domain, the DC coefficient is quantized with an 8-bit uniform scalar quantizer, and the AC coefficients are divided into three regions and coded with a vector quantizer to account for edge components. To reduce computation and memory, the vectors for the three regions have a small dimension of $1 \times 7$ and use the same codebook. Thus, the proposed method can fully express the edge components by considering the AC coefficients in the DCT domain, and it reduces computation and memory by reducing the dimension of the vectors.


Speaker Adaptation Using Linear Transformation Network in Speech Recognition (선형 변환망을 이용한 화자적응 음성인식)

  • 이기희
    • Journal of the Korea Society of Computer and Information / v.5 no.2 / pp.90-97 / 2000
  • This paper describes a speaker-adaptive speech recognition system which makes reliable recognition of speech signals possible for new speakers. In the proposed method, the speech spectrum of a new speaker is adapted to the reference speech spectrum by using the parameters of a 1st linear transformation network placed in front of the phoneme classification neural network. The recognition system is based on a semicontinuous HMM (hidden Markov model) which uses a multilayer perceptron as a fuzzy vector quantizer. Experiments on isolated word recognition are performed to show the recognition rate of the system. With speaker adaptation, the recognition rate shows a significant improvement over the unadapted recognition system.


Hangul Recognition Using a Hierarchical Neural Network (계층구조 신경망을 이용한 한글 인식)

  • 최동혁;류성원;강현철;박규태
    • Journal of the Korean Institute of Telematics and Electronics B / v.28B no.11 / pp.852-858 / 1991
  • An adaptive hierarchical classifier (AHCL) for Korean character recognition using a neural net is designed. This classifier has two neural nets: USACL (Unsupervised Adaptive Classifier) and SACL (Supervised Adaptive Classifier). USACL has an input layer and an output layer, which are fully connected. The nodes in the output layer are generated by the unsupervised nearest neighbor learning rule during learning. SACL has an input layer, a hidden layer and an output layer. The input layer and the hidden layer are fully connected, and the hidden layer and the output layer are partially connected. The nodes in the SACL are generated by the supervised nearest neighbor learning rule during learning. USACL has a pre-attentive effect: it performs a partial search instead of a full search during SACL classification to enhance processing speed. The input of USACL and SACL is a directional edge feature with a directional receptive field. To test the performance of the AHCL, various multi-font printed Hangul characters are used in learning and testing, and its processing speed and classification rate are compared with those of the conventional LVQ (Learning Vector Quantizer), which uses the nearest neighbor learning rule.


On-line Vector Quantizer Design Using Simulated Annealing Method (Simulated Annealing 방법을 이용한 온라인 벡터 양자화기 설계)

  • Song, Geun-Bae;Lee, Haeng-Se
    • The KIPS Transactions:PartB / v.8B no.4 / pp.343-350 / 2001
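  • Vector quantizer design requires a learning algorithm that minimizes a multidimensional objective function. The generalized Lloyd algorithm (GLA) is the most widely used algorithm for vector quantizer design today. GLA generates the codebook in batch mode and is a descent algorithm that monotonically decreases the objective function. The Kohonen learning algorithm (KLA), on the other hand, is an online vector quantizer design algorithm in which the codebook is updated while the training vectors are being input. KLA was originally proposed by Kohonen for neural network training. Like GLA, KLA can be regarded as a descent algorithm. Therefore, although these two algorithms are convenient to use and operate stably, they share the problem of converging to a local minimum. We discuss the application of the simulated annealing (SA) method to this problem. SA can be said to be the only method to date that converges to the global minimum without being trapped in a local minimum while (statistically) guaranteeing convergence of the solution. We first review previous work that applies SA to GLA. Next, by applying the SA method to online vector quantizer design, we propose a new online learning algorithm based on SA. We call this algorithm the OLVQ-SA algorithm. Vector quantization experiments on Gauss-Markov sources and speech data show that the proposed method consistently generates better codebooks than KLA.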
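
The online behavior attributed to KLA in this abstract, where the codebook changes as each training vector arrives, can be sketched with a winner-take-all competitive update. This is only the baseline idea, without a neighborhood function and without the simulated annealing step of OLVQ-SA, whose details the abstract does not give. The learning-rate schedule, function name and data are assumptions.

```python
def online_vq(stream, codebook, rate0=0.5):
    """Move the winning codeword toward each incoming vector as it arrives."""
    codebook = [list(c) for c in codebook]
    for t, x in enumerate(stream):
        rate = rate0 / (1 + t)  # assumed decreasing learning-rate schedule
        win = min(range(len(codebook)),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))
        # Only the winner is updated; the other codewords stay unchanged.
        codebook[win] = [c + rate * (a - c) for a, c in zip(x, codebook[win])]
    return codebook


# Hypothetical one-dimensional stream and initial codebook.
result = online_vq([[1.0], [3.0]], [[0.0], [10.0]])
print(result)
```

Because each step only reduces the distance between the winner and the current input, the update is greedy, which is consistent with the local-minimum behavior the abstract describes for descent-type methods.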


The Variable Block-based Image Compression Technique using Wavelet Transform (웨이블릿 변환을 이용한 가변블록 기반 영상 압축)

  • 권세안;장우영;송광훈
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.7B / pp.1378-1383 / 1999
  • In this paper, an effective variable-block-based image compression technique using the wavelet transform is proposed. Since the statistical properties of each wavelet subband are different, we apply adaptive quantization to each wavelet subband. In the proposed algorithm, each subband is divided into non-overlapping variable-sized blocks based on directional properties. In addition, we remove wavelet coefficients that fall below a certain threshold value for coding efficiency. To compress the transformed data, the proposed algorithm quantizes the wavelet coefficients using a scalar quantizer in the LL subband and vector quantizers in the other subbands to increase the compression ratio. The proposed algorithm shows improvements in compression ratio as well as PSNR compared with existing block-based compression algorithms. In addition, it does not cause any blocking artifacts at very low bit rates even though it is also a block-based method. The proposed algorithm also has an advantage in computational complexity over existing wavelet-based compression algorithms since it is a block-based algorithm.


A New Fast Training Algorithm for Vector Quantizer Design (벡터양자화기의 코드북을 구하는 새로운 고속 학습 알고리듬)

  • Lee, Dae-Ryong;Baek, Seong-Joon;Sung, Koeng-Mo
    • The Journal of the Acoustical Society of Korea / v.15 no.5 / pp.107-112 / 1996
  • In this paper we propose a new fast codebook training algorithm for reducing the search time of the LBG algorithm. For each training vector, the proposed algorithm stores the indexes of the codewords that are close to that vector in the first iteration. It reduces computation time in later iterations by searching only the codewords whose indexes are stored for each training vector. Compared to FSLBG, one of the previous fast training algorithms, it obtains a better codebook with less execution time. In our experiments, the performance of the codebook generated by the proposed algorithm in terms of peak signal-to-noise ratio (PSNR) is very close to that of the LBG algorithm. However, the number of codewords searched for each training vector is only about 6% of that of the LBG algorithm for a codebook size of 256, and about 1.6% for a codebook size of 1,024.
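
The candidate-list idea in this abstract can be sketched as follows: a full search in the first iteration records the closest codeword indexes for each training vector, and later iterations search only those indexes. This is a sketch under assumptions, not the authors' implementation; the list size `k`, the function names and the data are hypothetical.

```python
def sqdist(x, c):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def build_candidates(training, codebook, k):
    """First iteration: full search, keeping the k closest codeword indexes."""
    return [sorted(range(len(codebook)), key=lambda i: sqdist(x, codebook[i]))[:k]
            for x in training]


def restricted_nearest(x, codebook, candidates):
    """Later iterations: search only the stored candidate indexes."""
    return min(candidates, key=lambda i: sqdist(x, codebook[i]))


training = [[0.0], [9.0]]
first_codebook = [[0.0], [1.0], [5.0], [10.0]]
candidates = build_candidates(training, first_codebook, k=2)

# After a (hypothetical) codebook update, only 2 of 4 codewords are examined.
updated_codebook = [[0.5], [0.2], [8.0], [12.0]]
assigned = [restricted_nearest(x, updated_codebook, c)
            for x, c in zip(training, candidates)]
print(candidates, assigned)
```

The restricted search can miss the true nearest codeword if the codebook moves far from its first-iteration state, so the choice of list size trades speed against codebook quality.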
