Search | Korea Science

HMM-based Speech Recognition using FSVQ, Fuzzy Concept and Doubly Spectral Feature (FSVQ, 퍼지 개념 및 이중 스펙트럼 특징을 이용한 HMM에 기초를 둔 음성 인식)

정의봉
- Journal of the Korea Computer Industry Society / v.5 no.4 / pp.491-502 / 2004
In this paper, we propose an HMM using FSVQ (First Section VQ), fuzzy theory, and a doubly spectral feature, as a study on a speaker-independent isolated word recognition system. In the proposed method, LPC cepstrum coefficients and regression coefficients of the LPC cepstrum are used as the doubly spectral feature. The training data are divided into several sections, a VQ codebook is generated from the first section, and multi-observation sequences are then obtained from that codebook in descending order of the probability values assigned by a fuzzy rule. These first-section observation sequences are then used for training, and in recognition the word with the highest probability is selected by the same concept. Besides the speech recognition experiments with the proposed method, we run experiments with other methods under equivalent data and conditions. Across all experiments, the proposed method is shown to be superior to the others in recognition rate.
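As an illustration of the "doubly spectral feature" idea, the sketch below appends regression (delta) coefficients to each cepstral frame. It assumes the widely used linear-regression delta formula with edge frames clamped; the abstract does not state which formula or window the paper uses, and the function names and the `window` parameter are hypothetical.

```python
def regression_coefficients(frames, window=2):
    """Delta features for a list of cepstral frames (each a list of floats).

    Assumed formula (not confirmed by the abstract):
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    deltas = []
    for t in range(len(frames)):
        delta = [0.0] * len(frames[t])
        for n in range(1, window + 1):
            ahead = frames[min(t + n, last)]  # clamp at the final frame
            behind = frames[max(t - n, 0)]    # clamp at the first frame
            for k in range(len(delta)):
                delta[k] += n * (ahead[k] - behind[k])
        deltas.append([d / denom for d in delta])
    return deltas


def doubly_spectral_feature(frames, window=2):
    """Concatenate each static frame with its regression coefficients."""
    deltas = regression_coefficients(frames, window)
    return [static + delta for static, delta in zip(frames, deltas)]
```

For a sequence whose coefficients rise linearly by one unit per frame, the interior delta values come out as 1, which is a quick sanity check on the formula.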

HMM-based Speech Recognition using DMS Model and Fuzzy Concept (DMS 모델과 퍼지 개념을 이용한 HMM에 기초를 둔 음성 인식)

Ann, Tae-Ock
- Journal of the Korea Academia-Industrial cooperation Society / v.9 no.4 / pp.964-969 / 2008
This paper proposes an HMM-based recognition method using a DMSVQ (Dynamic Multi-Section Vector Quantization) codebook built from the DMS (Dynamic Multi-Section) model together with a fuzzy concept, as a study on speaker-independent speech recognition. In the proposed method, the training data are divided into several dynamic sections, and for each section multi-observation sequences are obtained, with probabilities assigned by a fuzzy rule in order of increasing distance from that section's DMSVQ codebook. An HMM is then generated from these multi-observation sequences, and in recognition the word with the highest probability is selected as the recognized word. For comparison, experiments with various conventional recognition methods are carried out on the same data under equivalent conditions. The experimental results show that the proposed method is superior to the conventional recognition methods.
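The multi-observation step can be sketched as follows: for one feature vector, take the few nearest codewords and give each a weight that shrinks with distance. The weighting used here is the fuzzy c-means membership formula, which is an assumption; the abstract does not specify the paper's fuzzy rule, and `fuzzy_observations`, `top_k`, and `fuzziness` are hypothetical names.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_observations(vector, codebook, top_k=3, fuzziness=2.0):
    """Return [(codeword_index, weight), ...], nearest codeword first.

    Weights follow the fuzzy c-means membership rule (an assumed stand-in
    for the paper's fuzzy rule) and are normalized to sum to 1.
    `fuzziness` must be greater than 1.
    """
    dists = [squared_distance(vector, code) for code in codebook]
    ranked = sorted(range(len(codebook)), key=lambda i: dists[i])[:top_k]
    if dists[ranked[0]] == 0.0:
        # Exact match: all of the weight goes to that codeword.
        return [(ranked[0], 1.0)]
    exponent = 1.0 / (fuzziness - 1.0)
    raw = [(1.0 / dists[i]) ** exponent for i in ranked]
    total = sum(raw)
    return [(i, w / total) for i, w in zip(ranked, raw)]
```

Each frame then contributes several weighted symbols rather than a single hard VQ label, which is what allows one utterance to yield multiple observation sequences for HMM training.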
https://doi.org/10.5762/KAIS.2008.9.4.964

Iterative LBG Clustering for SIMO Channel Identification

Daneshgaran, Fred;Laddomada, Massimiliano
- Journal of Communications and Networks / v.5 no.2 / pp.157-166 / 2003
This paper deals with the problem of channel identification for Single Input Multiple Output (SIMO) slow fading channels using clustering algorithms. Due to the intrinsic memory of the discrete-time model of the channel, over short observation periods the received data vectors of the SIMO model are spread in clusters because of additive white Gaussian noise (AWGN). Each cluster is practically centered on an ideal, noiseless channel output label, and the noisy received vectors are distributed according to a multivariate Gaussian distribution. Starting from the Markov SIMO channel model, simultaneous maximum likelihood estimation of the input vector and the channel coefficients reduces to finding the values of this pair that minimize the sum of the Euclidean norms between the received and the estimated output vectors. The Viterbi algorithm can be used for this purpose, provided the trellis diagram of the Markov model can be labeled with the noiseless channel outputs. The problem of identifying the ideal channel outputs, which is the focus of this paper, is then equivalent to designing a Vector Quantizer (VQ) from a training set corresponding to the observed noisy channel outputs. Linde-Buzo-Gray (LBG)-type clustering algorithms [1] could be used to obtain the noiseless channel output labels from the noisy received vectors. One problem with the use of such algorithms for blind time-varying channel identification is the codebook initialization. This paper looks at two critical issues with regard to the use of VQ for channel identification. The first concerns the applicability of this technique in general; we present theoretical results for the conditions under which the technique may be applicable. The second aims at overcoming the codebook initialization problem by proposing a novel approach that attempts to make the first phase of the channel estimation faster than the classical codebook initialization methods. Sample simulation results are provided confirming the effectiveness of the proposed initialization technique.
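For orientation, here is a minimal sketch of LBG-type VQ design from a training set, using the classical splitting initialization. This is the conventional baseline, not the initialization technique the paper proposes; the function names, the additive perturbation `epsilon`, and the stopping tolerance are assumptions chosen for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def nearest(vector, codebook):
    """Index of the codeword closest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def lloyd(training, codebook, tol=1e-6, max_iter=100):
    """Alternate nearest-neighbor partitioning and centroid updates."""
    previous = float("inf")
    for _ in range(max_iter):
        cells = [[] for _ in codebook]
        distortion = 0.0
        for v in training:
            i = nearest(v, codebook)
            cells[i].append(v)
            distortion += squared_distance(v, codebook[i])
        distortion /= len(training)
        # An empty cell keeps its old codeword.
        codebook = [centroid(cell) if cell else code
                    for cell, code in zip(cells, codebook)]
        if previous - distortion <= tol * max(distortion, 1e-12):
            break
        previous = distortion
    return codebook


def lbg(training, size, epsilon=0.01):
    """Grow a codebook by splitting; `size` is assumed to be a power of two."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        split = []
        for code in codebook:
            split.append([x + epsilon for x in code])
            split.append([x - epsilon for x in code])
        codebook = lloyd(training, split)
    return codebook
```

Because each Lloyd pass only lowers the distortion from wherever it starts, the result is a local optimum that depends on the starting codebook, which is why initialization matters for this family of algorithms.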

Improvement of performance for the LBG algorithm by the decision of initial codevectors (초기 코드백터 결정에 의한 LBG 알고리즘의 성능 개선)

Hong, Chi-Hwun;Cho, Che-Hwang
- The Journal of the Acoustical Society of Korea / v.14 no.2 / pp.16-29 / 1995
The choice of initial codevectors determines codebook performance in the LBG algorithm, because the algorithm guarantees only a locally optimal codebook. In this paper, we propose a method that selects the initial codevectors using a decision radius, taking as feature vectors the DC, low-frequency, medium-frequency, and high-frequency terms generated by a DCT. As the decision radius used to select the initial codevectors increases, both the number of membership vectors and the standard deviation of the distances among the initial codevectors increase. For improved codebook performance, the decision radius for the DC term should correspond to a membership rate above 0.9, and those for the low-, medium-, and high-frequency terms to a membership rate below 0.6.

A Design of a Robust Vector Quantizer for Wavelet Transformed Images (웨이브렛벤환 영상 부호화용 범용 벡터양자화기의 설계)

Do, Jae-Su;Cho, Young-Suk
- Convergence Security Journal / v.6 no.4 / pp.83-90 / 2006
In this paper, we propose a new design method for a robust vector quantizer that is independent of the statistical characteristics of input images in wavelet-transformed image coding. Conventional vector quantizers fail to deliver good coding quality when the statistical properties of the image to be quantized differ from those of the training sequence used for the codebook. To solve this problem, we use a pseudo image as the training sequence for generating the codebook; the pseudo image is created by adding correlation coefficient and edge components to uniformly distributed random numbers. Through computer simulation, we compare the conventional methods, which use real images as the training sequence, with the proposed method in order to make the problem of the conventional vector quantizers clear. We also show that the proposed vector quantizer yields better coding results.

HMM-based Speech Recognition using FSVQ and Fuzzy Concept (FSVQ와 퍼지 개념을 이용한 HMM에 기초를 둔 음성 인식)

안태옥
- Journal of the Institute of Electronics Engineers of Korea SP / v.40 no.6 / pp.90-97 / 2003
This paper proposes a speech recognition method based on an HMM (Hidden Markov Model) using FSVQ (First Section Vector Quantization) and a fuzzy concept. In the proposed method, we generate a codebook for the first section and then obtain multi-observation sequences from that codebook in descending order of the probability values assigned by a fuzzy rule. These first-section observation sequences are then used for training, and in recognition the word with the highest first-section probability is selected as the recognized word by the same concept. Train station names are used as the target recognition vocabulary, and LPC cepstrum coefficients are used as the feature parameters. Besides the speech recognition experiments with the proposed method, we run experiments with other methods under the same conditions and data. The experimental results show that the proposed HMM-based method using FSVQ and the fuzzy concept is superior to the others in recognition rate.

VQ Codebook Design and Feature Extraction of Image Information for Multimedia Information Searching (멀티미디어 정보검색에 적합한 영상정보의 벡터 양자화 코드북 설계 및 특징추출)

Seo, Seok-Bae;Kim, Dae-Jin;Kang, Dae-Seong
- Journal of the Korean Institute of Telematics and Electronics S / v.36S no.8 / pp.101-112 / 1999
In this paper, a VQ (vector quantization) codebook design method is proposed for extracting image feature data for multimedia information searching. Conventional VQ codebook design methods are unsuitable for extracting image feature data because they require too much computation time and memory for vector decoding and produce blocking effects like those of the DCT (discrete cosine transform). The proposed design method consists of feature extraction by WT (wavelet transform) and data group division by PCA (principal component analysis). WT is introduced to remove the blocking effect in images with a high compression ratio. Computer simulations show that the proposed method has better processing speed than the VQ design method using SOM (self-organizing map).

Determination and Performance Evaluation of Codevectors Utilizing Phase Difference Distribution Characteristics of Circular Antenna Arrays (원형 안테나 배열의 위상 차이 분포 특성을 활용한 코드벡터 결정 방식 및 성능 평가)

Kim, Huiwon;Suh, Junyeub;Sung, Wonjin
- Journal of the Institute of Electronics and Information Engineers / v.53 no.10 / pp.3-9 / 2016
Current mobile communication systems utilize the multiple-input multiple-output (MIMO) transmission technique as an important means to enhance the bandwidth efficiency. Accurate beamforming via channel estimation contributes to the signal-to-interference-plus-noise ratio (SINR) increase and the system performance improvement when MIMO transmission techniques are employed. Therefore, the determination of beamforming vectors, as well as the design of appropriate codebooks defining these codevectors, plays an important role in system operation. In this paper, we statistically analyze the phase difference between the channels corresponding to adjacent antenna elements in order to design an efficient codebook for uniform circular arrays (UCAs). We introduce new parameters which compensate for the additional phase difference observed in its probability density functions (PDFs). The performance of the proposed codebook is tested using the spatial channel model (SCM) to demonstrate its gain over the standard codebooks adopted in the long term evolution (LTE) Releases 8 and 10.
https://doi.org/10.5573/ieie.2016.53.10.03

Design of a Variable Bit Rate Speech Coder Based on One-dimensional SPIHT (1차원 SPIHT를 이용한 가변 비트율 음성 부호기의 설계)

Na, Hoon;Jeong, Dae-Gwon
- The Journal of the Acoustical Society of Korea / v.22 no.6 / pp.443-451 / 2003
Since a codebook-based CELP coder models its excitation signal according to one of several bit rates pre-assigned to codebooks and synthesizes the speech signal using codebooks, it cannot support encoding of a speech signal at an arbitrary bit rate in one encoder. The proposed variable bit rate speech coder encodes the excitation signal based on the bit rate assigned to the current frame of speech using one-dimensional SPIHT and the wavelet transform. It also does not need to model the excitation signal (or codebook) as one of a fixed set of types, as a CELP coder does, and can encode the excitation signal at various bit rates, according to user requirements, without exact pitch information. As a result, since the coder has no codebook structure, it has relatively low complexity and provides speech quality equal to or better than that of the G.729 and G.723.1 coders.

Speaker-adaptive Word Recognition Using Mapped Membership Function (사상멤버쉽함수에 의한 화자적응 단어인식)

Lee, Ki-Yeong;Choi, Kap-Seok
- The Journal of the Acoustical Society of Korea / v.11 no.3 / pp.40-52 / 1992
In this paper, we propose a speaker-adaptive word recognition method using a mapped membership function, in order to absorb the fluctuation caused by differences between individual speakers, which is a problem in speaker-independent speech recognition. In the training procedure, the mapped membership function between an unknown speaker's spectrum pattern and a standard speaker's is constructed by introducing fuzzy theory into a mapped codebook. In the recognition procedure, an input pattern of an unknown speaker is reconstructed by the mapped membership function into a pattern adapted to that of the standard speaker. To show the validity of this method, word recognition experiments are carried out using 28 DDD area names. The recognition rate of the conventional speaker-adaptive method using a mapped codebook by VQ is 64.9%, and that of the method using fuzzy VQ is 76.2%. With the mapped membership function, we achieve a 95.4% recognition rate, which shows that the proposed method has better recognition performance. Moreover, this method does not need an iterative training procedure to make the mapped membership function, and its memory capacity and computation requirements are reduced to 1/30 and 1/500, respectively, of those of the conventional method using a mapped codebook.

Search Result 346, Processing Time 0.027 seconds
