• Title/Abstract/Keywords: Codebook methods

55 results (processing time: 0.021 s)

Quantization Based Speaker Normalization for DHMM Speech Recognition System

  • 신옥근
    • 한국음향학회지 (The Journal of the Acoustical Society of Korea) / Vol. 22, No. 4 / pp. 299-307 / 2003
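  • Much research has addressed speaker normalization, which improves the performance of speaker-independent speech recognizers by minimizing the effect of vocal tract length differences between speakers. Motivated by the fact that speaker verification is possible with a vector quantizer, this study proposes a relatively simple linear-warping speaker normalization method based on vector quantization. The proposed method first generates an optimal codebook for normalization and then uses this codebook to extract a linear warping factor for each speaker; the extracted factor is used to warp the mel-scale filter bank applied during mel-cepstrum extraction. To verify the proposed warping factor extraction and application method, recognition experiments were carried out with a discrete HMM recognizer for 13 monosyllabic Korean digits. The experiments showed a reduction of about 29% in the recognition error rate, confirming that the proposed speaker normalization method is both simpler than other line-search warping factor extraction methods and practically useful.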
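
The last step described above, warping a mel-scale filter bank with a linear factor, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the factor is derived from the codebook, so `alpha` is simply an input here, and the function names, the 2595/700 mel formula, and the direction of the warp (center frequencies multiplied by `alpha`) are all choices made for this sketch rather than details from the paper.

```python
import math

def hz_to_mel(f):
    # A common mel-scale convention (assumed; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def warped_mel_centers(n_filters, sample_rate, alpha):
    """Center frequencies (Hz) of a mel filter bank after linear warping.

    alpha is a hypothetical per-speaker warping factor; alpha = 1.0 leaves
    the filter bank unchanged.
    """
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    centers = []
    for k in range(1, n_filters + 1):
        # Unwarped centers are equally spaced on the mel axis.
        f = mel_to_hz(mel_max * k / (n_filters + 1))
        # Linear warp, clipped so that no center exceeds the Nyquist frequency.
        centers.append(min(alpha * f, nyquist))
    return centers
```

Calling `warped_mel_centers(20, 16000, 1.1)` would shift every center frequency up by 10% relative to the unwarped bank.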

Hole-filling Algorithm Based on Extrapolating Spatial-Temporal Background Information for View Synthesis in Free Viewpoint Television

  • 김범수;응웬 띠엔 닷;홍민철
    • 전기전자학회논문지 (Journal of IKEEE) / Vol. 20, No. 1 / pp. 31-44 / 2016
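  • This paper proposes a hole-filling method based on spatio-temporal background information estimation for obtaining synthesized views in free viewpoint television. To obtain reliable temporal background information, a new background codebook is constructed and updated. In addition, to estimate local spatial background information, the background and foreground regions around each hole are distinguished and updated. The estimated spatio-temporal background information is combined to fill the holes, and the remaining holes are filled by exemplar-based inpainting with a priority function that uses depth background information. Experimental results show that the proposed method improves performance by 0.3 to 0.6 dB on average over existing methods and that it synthesizes new viewpoint images effectively regardless of video characteristics and hole shape.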

An Improvement of LVQ3 Learning Using SVM

  • 김상운
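    • 대한전자공학회: 학술대회논문집 (Institute of Electronics Engineers of Korea: Conference Proceedings) / Proceedings of the 2001 Summer Conference of the Institute of Electronics Engineers of Korea (3) / pp. 9-12 / 2001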
  • Learning vector quantization (LVQ) is a supervised learning technique that uses class information to move the codebook vectors of a vector quantizer slightly, so as to improve the quality of the classifier decision regions. In this paper we propose a method that uses support vector machines (SVM) to select the initial codebook vectors for learning vector quantization (LVQ3). The method is evaluated on artificial and real design data sets and compared with the conventional condensed nearest neighbor (CNN) method and its modifications (mCNN). The experiments show that the proposed method outperforms the conventional ones and can therefore be used efficiently for designing nonparametric classifiers.
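
For readers unfamiliar with LVQ3, a minimal sketch of one training step in its standard (Kohonen) form is given below. It shows only the generic update that refines codebook vectors, not the SVM-based initialization the paper proposes. The function names and the default values of `alpha`, `epsilon`, and `window` are assumptions for illustration, and this sketch applies the window test to both update cases.

```python
import math

def _move(m, x, rate):
    """Return m shifted toward x (rate > 0) or away from x (rate < 0)."""
    return [mv + rate * (xv - mv) for mv, xv in zip(m, x)]

def lvq3_step(codebook, labels, x, y, alpha=0.05, epsilon=0.2, window=0.3):
    """Apply one LVQ3 update in place; return True if any vector moved.

    codebook: list of vectors, labels: class label of each vector,
    x: training sample, y: its class label.
    """
    dists = [math.dist(m, x) for m in codebook]
    # Indices of the two codebook vectors nearest to x (i is the closest).
    i, j = sorted(range(len(codebook)), key=dists.__getitem__)[:2]
    di, dj = dists[i], dists[j]
    # Window test: x must lie close to the midplane between the two vectors.
    # Because di <= dj, min(di/dj, dj/di) reduces to di/dj.
    if dj == 0.0 or di / dj <= (1.0 - window) / (1.0 + window):
        return False
    ci, cj = labels[i] == y, labels[j] == y
    if ci and cj:
        # Both neighbors have the correct class: pull both gently toward x.
        for k in (i, j):
            codebook[k] = _move(codebook[k], x, epsilon * alpha)
    elif ci != cj:
        # Exactly one is correct: attract it and repel the wrong one.
        good, bad = (i, j) if ci else (j, i)
        codebook[good] = _move(codebook[good], x, alpha)
        codebook[bad] = _move(codebook[bad], x, -alpha)
    else:
        return False
    return True
```

Because each step only nudges the two nearest vectors, the starting positions of the codebook vectors strongly affect the final decision regions, which is why the choice of initial vectors matters.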

A Study on the Voice Conversion Algorithm with High Quality

  • 박형빈;배명진
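    • 대한전자공학회: 학술대회논문집 (Institute of Electronics Engineers of Korea: Conference Proceedings) / Proceedings of the 13th Joint Conference on Signal Processing, Institute of Electronics Engineers of Korea, 2000 / pp. 157-160 / 2000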
  • Voice conversion has generally used vector quantization (VQ) to partition the spectral features and has been performed by adding an appropriate offset vector to the source speaker's spectral vector. However, because the transformed parameters are discrete, this approach does not represent the varied characteristics of the target speaker. In this paper, these problems are solved by using linear multivariate regression (LMR) instead of the mapping codebook, which is determined by the relationship between the vocal tract characteristics of the source and target speakers. We also propose a method for resolving the discontinuity that arises when parameters time-aligned by dynamic time warping are applied to time- or pitch-scale-modified speech. To overcome these transitional discontinuities, the proposed algorithm first leaves the time and pitch scales unchanged and uses the LMR to change the speaker's vocal tract characteristics in speech whose time and pitch are unmodified. Compared with existing methods based on VQ and LMR, the proposed algorithm yields much better voice quality.

Speech Recognition using MSHMM based on Fuzzy Concept

  • Ann, Tae-Ock
    • The Journal of the Acoustical Society of Korea / Vol. 16, No. 2E / pp. 55-61 / 1997
  • This paper proposes a multi-section hidden Markov model (MSHMM) recognition method based on a fuzzy concept for speaker-independent speech recognition. In this method, the training data are divided into several sections, and for each section multi-observation sequences are obtained by assigning probabilities with a fuzzy rule to the entries of the MSVQ codebook in order of increasing distance. An HMM is then generated for each section from these multi-observation sequences, and during recognition the word with the highest probability is selected as the recognized word. For comparison, experiments with several conventional recognition methods (DP, MSVQ, DMS, and general HMM) are carried out on the same data. The overall results show that the proposed fuzzy-concept MSHMM is superior to the DP method, the MSVQ method, the DMS model, and the general HMM in recognition rate and computation time, and that its recognition rate of 92.91% does not decrease as the number of speakers increases.

A Study on the Text-Independent Speaker Recognition from the Vowel Extraction

  • 김에녹;복혁규;김형래
    • 전자공학회논문지B (Journal of the Korean Institute of Telematics and Electronics B) / Vol. 31B, No. 10 / pp. 82-91 / 1994
  • In this thesis, we carry out speaker recognition experiments by identifying vowels in each speaker's pronunciation. We first extract the vowels from each speaker's pronunciation and then measure the frequency energy of 29 channels from them. After converting these values into fuzzy values, we apply fuzzy inference to recognize the speaker with text-dependent and text-independent methods. For this experiment, a vowel extraction algorithm is developed, and the newly introduced parameter is the frequency energy of the 29 channels computed from the extracted vowels. It represents the features of each speaker better than existing parameters. The advantage of this parameter is that it uses only the reference pattern, without the help of any codebook. As a result, the text-dependent method achieved a recognition rate of about 95.5%, and the text-independent method achieved a recognition rate of about 94.2%.

A Study on the Speech Recognition by HMM Based on Multi-Observation Sequence

  • 정의봉
    • 전자공학회논문지S (Journal of the Korean Institute of Telematics and Electronics S) / Vol. 34S, No. 4 / pp. 57-65 / 1997
  • The purpose of this paper is to propose a hidden Markov model (HMM) based on multi-observation sequences for isolated word recognition. The proposed model generates the MSVQ codebook by dividing each word, and then the training data, into several sections. The multi-observation sequence for each section is then obtained by weighting the vectors in order of increasing distance. Thereafter, during recognition, the sequence with the highest probability value is selected. A vocabulary of 146 DDD area names is used as the recognition target, and 10 LPC cepstrum coefficients are used as the feature parameters. In addition to the speech recognition experiments with the proposed model, experiments with DP, MSVQ, and general HMM are carried out for comparison on the same data under the same conditions. The experimental results show that the proposed HMM based on multi-observation sequences is superior to the methods using DP, MSVQ, and general HMM in recognition rate and time.

Encoding of Speech Spectral Parameters Using Adaptive Vector-Scalar Quantization Methods for Mobile Communication Systems

  • Lee, In-Sung;Kim, Jong-Hark
    • The Journal of the Acoustical Society of Korea / Vol. 17, No. 4E / pp. 35-40 / 1998
  • This paper proposes an efficient method for quantizing line spectrum pairs (LSP) with a cascaded structure of a vector quantizer and a scalar quantizer. First, the input LSP parameters are vector-quantized using a codebook with a moderate number of entries. In the second stage of quantization, the components of the residual vector are individually quantized by the scalar quantizer. Exploiting the ordering property of the LSP parameters and including interframe prediction improve the quantizer performance and remove the need for a stability check routine after quantization. The new vector-scalar hybrid quantizer, using 26 bits/frame, achieves transparent speech quality: the average spectral distortion is 1 dB, and fewer than 2% of frames have a spectral distortion above 2 dB. The performance of the proposed quantization method is also evaluated under transmission errors.
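
The two-stage structure described above (vector quantization followed by scalar quantization of the residual) can be illustrated with a minimal sketch. It is not the paper's quantizer: the uniform scalar quantizer with a single `step` size is an assumption, all function names are hypothetical, and the interframe prediction, LSP ordering property, and 26-bit allocation mentioned in the abstract are left out.

```python
def nearest_codeword(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], x)),
    )

def vsq_encode(codebook, x, step):
    """Stage 1: vector-quantize x. Stage 2: scalar-quantize the residual."""
    idx = nearest_codeword(codebook, x)
    # Each residual component is quantized on its own with a uniform step.
    levels = [round((v - c) / step) for v, c in zip(x, codebook[idx])]
    return idx, levels

def vsq_decode(codebook, idx, levels, step):
    """Rebuild the vector from the codeword index and the residual levels."""
    return [c + q * step for c, q in zip(codebook[idx], levels)]
```

With this scheme the reconstruction error of each component is at most `step / 2`, provided the scalar quantizer has enough levels to cover the residual range; a real coder would bound the number of levels to meet its bit budget.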

Vector Quantization for Optimum Quadrature Amplitude Modulated Signals in Rayleigh Fading Channel

  • 배진수;한종기;박애경
    • 한국통신학회논문지 (The Journal of Korean Institute of Communications and Information Sciences) / Vol. 27, No. 6B / pp. 610-615 / 2002
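  • This paper addresses a communication system for Rayleigh fading channels that uses a vector quantizer to assign suitable codes to signal vectors in the modulated signal space. The vector quantization coding system is optimized to minimize the distortion of the modulated signal waveform by partitioning the modulated signal space efficiently. Simulations show that the optimized quadrature amplitude modulator improves the performance of the overall communication system.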

Limited Feedback Designs for Two-Way Relaying Systems with Physical Network Coding

  • Kim, Young-Tae;Lee, Kwangwon;Jeon, Youngil;Lee, Inkyu
    • Journal of Communications and Networks / Vol. 17, No. 5 / pp. 463-472 / 2015
  • This paper considers a limited feedback system for two-way wireless relaying channels with physical network coding (PNC). For full feedback systems, the optimal structure with PNC, which employs a modulo operation, has already been studied. In that case, the phase and power of the two end-node channels are adjusted to maximize the minimum distance. Based on this result, we design new quantization methods for the phase and the power in the limited feedback system. By investigating the minimum distance of the received constellation, we present a codebook design that maximizes the worst-case minimum distance. In particular, for 16-QAM, a new power quantization scheme is proposed to maximize the performance. In addition, using the characteristics of the minimum distance observed in our codebook design, we present a power allocation method that does not require any feedback information. Simulation results confirm that the proposed scheme outperforms conventional systems with reduced complexity.