• Title/Summary/Keyword: learning vector quantization (학습벡터 양자화)


Fast Competitive Learning with Classified Learning Rates (분류된 학습률을 가진 고속 경쟁 학습)

  • Kim, Chang-Wook;Cho, Seong-Won;Lee, Choong-Woong
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.11 / pp.142-150 / 1994
  • This paper presents a fast competitive learning method that uses classified learning rates. The basic idea is to assign a separate learning rate to each weight vector, so that the weight vector associated with an output node is updated using its own learning rate. Each learning rate changes only when its corresponding output node wins the competition; the learning rates of the losing nodes are left unchanged. Experimental results on image vector quantization show that the proposed method learns more rapidly and yields better quality than conventional competitive learning.
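The update rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how a learning rate changes when its node wins, so the multiplicative `decay`, the `initial_rate`, and the function names are all assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_competitive(data, weights, epochs=5, initial_rate=0.5, decay=0.9):
    """Winner-take-all learning with one learning rate per output node.

    Assumption: a node's rate is multiplied by `decay` each time it wins.
    The rates of losing nodes are never touched.
    """
    weights = [list(w) for w in weights]      # copy so the caller's data is kept
    rates = [initial_rate] * len(weights)     # one learning rate per weight vector
    for _ in range(epochs):
        for x in data:
            # Competition: the node whose weight vector is closest to x wins.
            winner = min(range(len(weights)),
                         key=lambda j: squared_distance(x, weights[j]))
            r = rates[winner]
            # Move only the winner toward the input, using its own rate.
            weights[winner] = [w + r * (xi - w)
                               for w, xi in zip(weights[winner], x)]
            # Only the winner's learning rate changes.
            rates[winner] *= decay
    return weights, rates
```

With this scheme a node that wins often settles quickly, while a node that has rarely won still moves with a large step when it finally does, which is one plausible reading of why per-node rates could speed up learning.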


Vector Quantization based Speech Recognition Performance Improvement using Maximum Log Likelihood in Gaussian Distribution (가우시안 분포에서 Maximum Log Likelihood를 이용한 벡터 양자화 기반 음성 인식 성능 향상)

  • Chung, Kyungyong;Oh, SangYeob
    • Journal of Digital Convergence / v.16 no.11 / pp.335-340 / 2018
  • Commercial speech recognition systems with high recognition accuracy use a learning model built from speaker-dependent isolated data. However, their recognition performance degrades in noisy environments depending on the quantity of data. In this paper, we propose a vector quantization based method for improving speech recognition performance using maximum log likelihood in a Gaussian distribution. The proposed method configures an optimal learning model to increase recognition accuracy for similar speech by combining vector quantization and maximum log likelihood with a speech feature extraction method. Speech features are extracted based on the hidden Markov model. The method can improve the accuracy of inaccurate speech models produced by the existing system, and a system using it may constitute a robust model for speech recognition. The proposed method shows improved recognition accuracy in a speech recognition system.
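The abstract gives no formulas, so the following is only a generic sketch of what selecting a model by maximum log likelihood under a Gaussian distribution can look like. The diagonal covariance, the `models` dictionary layout, and both function names are assumptions for illustration and are not taken from the paper.

```python
import math


def gaussian_log_likelihood(x, mean, var):
    """Log density of vector x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def most_likely_model(frames, models):
    """Return the name of the model with the highest total log likelihood.

    `models` maps a name to a (mean, variance) pair; `frames` is a list of
    feature vectors treated as independent observations.
    """
    def total(name):
        mean, var = models[name]
        return sum(gaussian_log_likelihood(f, mean, var) for f in frames)

    return max(models, key=total)
```

Summing log likelihoods over frames rather than multiplying probabilities avoids numerical underflow on long utterances.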

Speaker Normalization using Gaussian Mixture Model for Speaker Independent Speech Recognition (화자독립 음성인식을 위한 GMM 기반 화자 정규화)

  • Shin, Ok-Keun
    • The KIPS Transactions:PartB / v.12B no.4 s.100 / pp.437-442 / 2005
  • For the purpose of speaker normalization in speaker-independent speech recognition systems, experiments are conducted on a method based on the Gaussian mixture model (GMM). The method, which improves on a previous study based on a vector quantizer, consists of modeling the probability distribution of canonical feature vectors by a GMM with an appropriate number of clusters, and of estimating the warp factor of a test speaker using the obtained probabilistic model. The purpose of this study is twofold: improving the existing ML based methods, and comparing the performance of the so-called 'soft decision' method with that of the previous study based on a vector quantizer. The effectiveness of the proposed method is investigated by recognition experiments on the TIMIT corpus. The experimental results showed that a small improvement could be obtained by adjusting the number of clusters in the GMM appropriately.

Efficient Speaker Identification based on Robust VQ-PCA (강인한 VQ-PCA에 기반한 효율적인 화자 식별)

  • Lee Ki-Yong
    • Journal of Internet Computing and Services / v.5 no.3 / pp.57-62 / 2004
  • In this paper, an efficient speaker identification method based on robust vector quantization-principal component analysis (VQ-PCA) is proposed to solve the problems caused by outliers and by the high dimensionality of training feature vectors in speaker identification. First, the proposed method partitions the data space into several disjoint regions by robust VQ based on M-estimation. Second, the robust PCA is obtained from the covariance matrix in each region. Finally, our method obtains the Gaussian mixture model (GMM) for each speaker from the transformed feature vectors, whose dimension is reduced by the robust PCA in each region. Compared to the conventional GMM with diagonal covariance matrix, the proposed method gives faster results with less storage at the same performance and, moreover, is robust to outliers.


The Image Compression Using the Central Vectors of Clusters (Cluster의 중심벡터를 이용하는 영상 압축)

  • Cho, Che-Hwang
    • The Journal of the Acoustical Society of Korea / v.14 no.1 / pp.5-12 / 1995
  • When the set of training vectors forms clusters, the codevectors of a codebook used in vector quantization to compress speech and images can be regarded as the central vectors of the clusters formed by the given training vectors. In this work, we consider the distribution of the Euclidean distances obtained while searching for the minimum distance between vectors, and propose a method that finds the proper number of clusters and their central vectors. The proposed method achieves an SNR about 4 dB higher than the LBG algorithm and the competitive learning algorithm.


Entropy-Constrained Temporal Decomposition (엔트로피 제한 조건을 갖는 시간축 분할)

  • Lee Ki-Seung
    • The Journal of the Acoustical Society of Korea / v.24 no.5 / pp.262-270 / 2005
  • In this paper, a new temporal decomposition method is proposed, in which not only distortion but also entropy is involved in segmentation. The interpolation functions and the target feature vectors are determined by a dynamic programming technique in which both distortion and entropy are simultaneously minimized. The interpolation functions are built using a training speech corpus. An iterative method, in which segmentation and estimation are performed alternately, finds the locally optimum points in the sense of minimizing both distortion and entropy. Simulation results show that, in terms of both distortion and entropy, the proposed temporal decomposition method produces superior results to the conventional split vector quantization method, which is widely employed in current speech coding methods. According to the results of a subjective listening test, the proposed method shows superior performance in terms of quality compared to the previous vector quantization method.

On-line Vector Quantizer Design Using Stochastic Relaxation (Stochastic Relaxation 방법을 이용한 온라인 벡터 양자화기 설계)

  • Song, Geun-Bae;Lee, Haing-Sei
    • Journal of the Institute of Electronics Engineers of Korea CI / v.38 no.5 / pp.27-36 / 2001
  • This paper proposes new design algorithms based on stochastic relaxation (SR) for on-line vector quantizer (VQ) design. The proposed SR methods solve the local entrapment problem of the conventional Kohonen learning algorithm (KLA). They are of two types, depending on whether simulated annealing (SA) is used: the one that uses SA is called OLVQ SA and the other OLVQ SR. These methods are combined with the KLA and therefore preserve its convergence properties. Experimental results for Gauss Markov sources, real speech, and images demonstrate that the proposed algorithms consistently provide better codebooks than the KLA.
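To make the idea of combining an on-line winner update with stochastic relaxation concrete, here is a rough sketch. It is not the OLVQ SA or OLVQ SR algorithm from the paper, whose details the abstract does not give. It assumes one simple form of relaxation: zero-mean Gaussian noise added to the winner's update, with a "temperature" `temp` that shrinks geometrically. All parameter names and defaults are hypothetical.

```python
import random


def online_vq_sr(samples, codebook, rate=0.05, temp0=1.0, cooling=0.99, seed=0):
    """On-line winner-take-all VQ update with a decaying random perturbation.

    Assumption: the perturbation is Gaussian with standard deviation `temp`,
    and `temp` is multiplied by `cooling` after every sample.
    """
    rng = random.Random(seed)                  # private generator, reproducible
    codebook = [list(c) for c in codebook]     # do not modify the caller's list
    temp = temp0
    for x in samples:
        # Find the codevector nearest to the current sample.
        winner = min(
            range(len(codebook)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, codebook[j])),
        )
        # Kohonen-style step toward the sample plus a random perturbation.
        codebook[winner] = [
            c + rate * (xi - c) + rng.gauss(0.0, temp)
            for c, xi in zip(codebook[winner], x)
        ]
        temp *= cooling                        # perturbation fades over time
    return codebook
```

Setting `temp0=0.0` removes the noise and leaves the plain on-line update, which makes it easy to compare the two behaviours on the same data.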


Analysis of the Effect on the Quantization of the Network's Outputs in the Neural Processor by the Implementation of Hybrid VLSI (하이브리드 VLSI 신경망 프로세서에서의 양자화에 따른 영향 분석)

  • Kwon, Oh-Jun;Kim, Seong-Woo;Lee, Jong-Min
    • The KIPS Transactions:PartB / v.9B no.4 / pp.429-436 / 2002
  • To use artificial neural networks in practical applications, they need to be implemented in hardware. Among the possible technologies, hybrid VLSI is the most promising. When a trained network is implemented on hybrid neuro-chips, its neuron outputs and weights must be quantized. Unfortunately, this process causes the network's outputs to deviate from the originally trained outputs. In this paper, we analyze in detail the statistical characteristics of this distortion. The analysis implies that, to reduce the distortion of the network's outputs, the network should be trained using normalized input patterns and should converge to a solution with small weights. We performed an experiment on a time series prediction application to investigate the effectiveness of the results of the analysis. The experiment showed that the network trained by our method has smaller distortion than the regular network.
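The kind of distortion discussed here can be illustrated with a toy example. The sketch below assumes a uniform quantizer over [-1, 1] and a single tanh neuron. Neither is stated in the abstract, and the function names are invented for the example, so it shows the effect in general rather than the paper's analysis.

```python
import math


def quantize(value, levels, lo=-1.0, hi=1.0):
    """Round value to the nearest of `levels` evenly spaced points in [lo, hi]."""
    step = (hi - lo) / (levels - 1)
    clipped = min(max(value, lo), hi)          # values outside the range saturate
    return lo + round((clipped - lo) / step) * step


def neuron_output(weights, inputs):
    """Output of one neuron with a tanh activation (assumed for illustration)."""
    return math.tanh(sum(w * x for w, x in zip(weights, inputs)))


def output_distortion(weights, inputs, levels):
    """Absolute output error after quantizing both the weights and the output."""
    exact = neuron_output(weights, inputs)
    q_weights = [quantize(w, levels) for w in weights]
    q_output = quantize(neuron_output(q_weights, inputs), levels)
    return abs(q_output - exact)
```

Calling `output_distortion` with the same weights and inputs but different values of `levels` shows how the error depends on the resolution of the quantizer.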

On the Development of a Continuous Speech Recognition System using Continuous Hidden Markov Model for Korean Language (연속분포 HMM을 이용한 한국어 연속 음성 인식 시스템 개발)

  • Kim, Do-Yeong;Park, Yong-Kyu;Kwon, Oh-Wook;Un, Chong-Kwan
    • Annual Conference on Human and Language Technology / 1993.10a / pp.101-110 / 1993
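  • This paper describes a speaker-independent continuous speech recognition system using continuous-density hidden Markov models. The continuous-density model consists of mean and variance vectors and models the speech signal directly, which eliminates quantization distortion. The feature vectors use filter bank coefficients together with their first- and second-order derivative coefficients to reflect the dynamic characteristics of the speech signal. The models were trained with the segmental K-means algorithm, and triphones, which take the preceding and following phonemes into account, were used as the recognition unit to prevent the drop in recognition rate caused by coarticulation, the most serious problem in continuous speech recognition. The time-efficient one-pass search algorithm was used for search. In speaker-independent recognition experiments for performance evaluation, the recognition rate was 83% without a grammar and 94% when a finite state network was applied.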
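The feature construction mentioned in this abstract (static coefficients plus first- and second-order derivatives) can be sketched as below. The abstract does not say how the derivatives are computed. This example assumes a simple central difference with the edge frames repeated, and the function names are hypothetical.

```python
def deltas(frames):
    """First-order time derivative of each coefficient by central difference.

    Assumption: the first and last frames are repeated at the edges.
    """
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, n - 1)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def add_dynamic_features(frames):
    """Append first- and second-order derivatives to every frame."""
    d1 = deltas(frames)       # first-order derivative coefficients
    d2 = deltas(d1)           # second-order: derivative of the derivative
    return [list(f) + a + b for f, a, b in zip(frames, d1, d2)]
```

Each output frame is three times as long as the input frame: the static coefficients, then their first-order derivatives, then their second-order derivatives.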


Isolated Korean Digits Recognition Using Stochastic Transition Models With Phoneme-based VQ Codebooks (음소단위 코드북간의 확률적 전이 모델을 이용한 한국어 숫자음 인식에 관한 연구)

  • Choi, Hwan-Jin;Oh, Yung-Hwan
    • Annual Conference on Human and Language Technology / 1993.10a / pp.149-157 / 1993
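  • Various methods have been proposed for speech recognition. In this study, recognition experiments were conducted on Korean digits using an HMM that learns the indices of the vector-quantized codebooks of the individual phoneme units. The experiments showed a higher recognition rate than a conventional word-unit HMM and a recognizer with a finite state machine (FSM) structure composed of phoneme units.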
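The step that turns feature vectors into codebook indices, which a discrete HMM can then model, can be sketched as follows. This is the generic nearest-codeword rule of vector quantization, assuming squared Euclidean distance. The function name and the toy codebook in the usage note are illustrative and not taken from the paper.

```python
def encode(frames, codebook):
    """Map each feature vector to the index of its nearest codeword."""
    def nearest(x):
        return min(
            range(len(codebook)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, codebook[j])),
        )

    return [nearest(f) for f in frames]
```

For example, with the codebook `[[0, 0], [1, 1], [5, 5]]`, the frames `[[0.1, 0.2], [4, 6], [0.9, 1.2]]` are encoded as the index sequence `[0, 2, 1]`.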
