Search | Korea Science

Realization a Text Independent Speaker Identification System with Frame Level Likelihood Normalization (프레임레벨유사도정규화를 적용한 문맥독립화자식별시스템의 구현)

김민정;석수영;김광수;정현열
- Journal of the Institute of Convergence Signal Processing
- /
- v.3 no.1
- /
- pp.8-14
- /
- 2002
In this paper, we realized a real-time text-independent speaker recognition system using gaussian mixture model, and applied frame level likelihood normalization method which shows its effects in verification system. The system has three parts as front-end, training, recognition. In front-end part, cepstral mean normalization and silence removal method were applied to consider speaker's speaking variations. In training, gaussian mixture model was used for speaker's acoustic feature modeling, and maximum likelihood estimation was used for GMM parameter optimization. In recognition, likelihood score was calculated with speaker models and test data at frame level. As test sentences, we used text-independent sentences. ETRI 445 and KLE 452 database were used for training and test, and cepstrum coefficient and regressive coefficient were used as feature parameters. The experiment results show that the frame-level likelihood method's recognition result is higher than conventional method's, independently the number of registered speakers.
PDF

Vocal Tract Normalization Using The Power Spectrum Warping (파워 스펙트럼 warping을 이용한 성도 정규화)

Yu, Il-Su;Kim, Dong-Ju;No, Yong-Wan;Hong, Gwang-Seok
- Proceedings of the KIEE Conference
- /
- 2003.11b
- /
- pp.215-218
- /
- 2003
The method of vocal tract normalization has been known as a successful method for improving the accuracy of speech recognition. A frequency warping procedure based low complexity and maximum likelihood has been generally applied for vocal tract normalization. In this paper, we propose a new power spectrum warping procedure that can be improve on vocal tract normalization performance than a frequency warping procedure. A mechanism for implementing this method can be simply achieved by modifying the power spectrum of filter bank in Mel-frequency cepstrum feature(MFCC) analysis. Experimental study compared our Proposal method with the well-known frequency warping method. The results have shown that the power spectrum warping is better 50% about the recognition performance than the frequency warping.
PDF

A New Power Spectrum Warping Approach to Speaker Warping (화자 정규화를 위한 새로운 파워 스펙트럼 Warping 방법)

유일수;김동주;노용완;홍광석
- Journal of the Institute of Electronics Engineers of Korea SP
- /
- v.41 no.4
- /
- pp.103-111
- /
- 2004
The method of speaker normalization has been known as the successful method for improving the accuracy of speech recognition at speaker independent speech recognition system. A frequency warping approach is widely used method based on maximum likelihood for speaker normalization. This paper propose a new power spectrum warping approach to making improvement of speaker normalization better than a frequency warping. Th power spectrum warping uses Mel-frequency cepstrum analysis(MFCC) and is a simple mechanism to performing speaker normalization by modifying the power spectrum of Mel filter bank in MFCC. Also, this paper propose the hybrid VTN combined the Power spectrum warping and a frequency warping. Experiment of this paper did a comparative analysis about the recognition performance of the SKKU PBW DB applied each speaker normalization approach on baseline system. The experiment results have shown that a frequency warping is 2.06%, the power spectrum is 3.06%, and hybrid VTN is 4.07% word error rate reduction as of word recognition performance of baseline system.
PDF KSCI

Improving A Text Independent Speaker Identification System By Frame Level Likelihood Normalization (프레임단위유사도정규화를 이용한 문맥독립화자식별시스템의 성능 향상)

김민정;석수영;정현열;정호열
- Proceedings of the IEEK Conference
- /
- 2001.09a
- /
- pp.487-490
- /
- 2001
본 논문에서는 기존의 Caussian Mixture Model을 이용한 실시간문맥독립화자인식시스템의 성능을 향상시키기 위하여 화자검증시스템에서 좋은 결과를 나타내는 유사도정규화 ( Likelihood Normalization )방법을 화자식별시스템에 적용하여 시스템을 구현하였으며, 인식실험한 결과에 대해 보고한다. 시스템은 화자모델생성단과 화자식별단으로 구성하였으며, 화자모델생성단에서는, 화자발성의 음향학적 특징을 잘 표현할 수 있는 GMM(Gaussian Mixture Model)을 이용하여 화자모델을 작성하였으며. GMM의 파라미터를 최적화하기 위하여 MLE(Maximum Likelihood Estimation)방법을 사용하였다. 화자식별단에서는 학습된 데이터와 테스트용 데이터로부터 ML(Maximum Likelihood)을 이용하여 프레임단위로 유사도를 계산하였다. 계산된 유사도는 유사도 정규화 과정을 거쳐 스코어( SC)로 표현하였으며, 가장 높은 스코어를 가지는 화자를 인식화자로 결정한다. 화자인식에서 발성의 종류로는 문맥독립 문장을 사용하였다. 인식실험을 위해서는 ETRI445 DB와 KLE452 DB를 사용하였으며. 특징파라미터로서는 켑스트럼계수 및 회귀계수값만을 사용하였다. 인식실험에서는 등록화자의 수를 달리하여 일반적인 화자식별방법과 프레임단위유사도정규화방법으로 각각 인식실험을 하였다. 인식실험결과, 프레임단위유사도정규화방법이 인식화자수가 많아지는 경우에 일반적인 방법보다 향상된 인식률을 얻을수 있었다.
PDF

On using the LPC parameter for Speaker Identification (LPC에 의한 화자 식별)

조병모
- Proceedings of the Acoustical Society of Korea Conference
- /
- 1987.11a
- /
- pp.82-85
- /
- 1987
Preliminary results of using the LPC parameter for text-independent speaker identification problem are presented. The idetification process includes log likelihood ratio for distance measure and dynamic programming for time normalization. To generate the data base for experiments, ten times. Experimental results show 99.4% of identification accuracy, incorrect identification were made when the speaker uses a dialect.
PDF

An Improved V-BLAST Receiver based on Relibability Normalization (신뢰성 정규화를 기반으로 한 개선된 V-BLAST 수신기 구조에 관한 연구)

Kim, Hyoun-Kuk;Park, Hyun-Cheol
- Proceedings of the IEEK Conference
- /
- 2004.06a
- /
- pp.71-74
- /
- 2004
We present an improved V-BLAST receiver that cancels co-channel interference (CCI), based on reliability normalization over frequency-selective channels, in log-likelihood ratio (LLR) sense. The performance has been evaluated in the exponential decay channel model with various normalized ms delay spread and different filter taps. It is also compared with the ordered successive interference cancellation-decision feedback equalizer (OSIC-DFE). Simulation results show that the performance of the proposed receiver with (2,1) is close to OSIC-DFE with (6,3) at normalized ms delay spread 0.5 symbol periods.
PDF

New Postprocessing Methods for Rejectin Out-of-Vocabulary Words

Song, Myung-Gyu
- The Journal of the Acoustical Society of Korea
- /
- v.16 no.3E
- /
- pp.19-23
- /
- 1997
The goal of postprocessing in automatic speech recognition is to improve recognition performance by utterance verification at the output of recognition stage. It is focused on the effective rejection of out-of vocabulary words based on the confidence score of hypothesized candidate word. We present two methods for computing confidence scores. Both methods are based on the distance between each observation vector and the representative code vector, which is defined by the most likely code vector at each state. While the first method employs simple time normalization, the second one uses a normalization technique based on the concept of on-line garbage mode[1]. According to the speaker independent isolated words recognition experiment with discrete density HMM, the second method outperforms both the first one and conventional likelihood ratio scoring method[2].
PDF

Scene Change Detection with 3-Step Process (3단계 과정의 장면 전환검출)

Yoon, Shin-Seong;Won, Rhee-Yang
- Journal of the Korea Society of Computer and Information
- /
- v.13 no.6
- /
- pp.147-154
- /
- 2008
First, this paper compute difference value between frames using the composed method of $X^2$ histogram and color histogram and the normalization. Next, cluster representative frame was decided by using the clustering for distance and the k-mean grouping. Finally, representative frame of group was decided by using the likelihood ratio. Proposed method can be known by experiment as outstanding of detection rather than other methods, due to computing of difference value, clustering and grouping, and detecting of representative frame.
PDF

A note on Box-Cox transformation and application in microarray data

Rahman, Mezbahur;Lee, Nam-Yong
- Journal of the Korean Data and Information Science Society
- /
- v.22 no.5
- /
- pp.967-976
- /
- 2011
The Box-Cox transformation is a well known family of power transformations that brings a set of data into agreement with the normality assumption of the residuals and hence the response variable of a postulated model in regression analysis. Normalization (studentization) of the regressors is a common practice in analyzing microarray data. Here, we implement Box-Cox transformation in normalizing regressors in microarray data. Pridictabilty of the model can be improved using data transformation compared to studentization.
PDF KSCI

A Missing Data Imputation by Combining K Nearest Neighbor with Maximum Likelihood Estimation for Numerical Software Project Data (K-NN과 최대 우도 추정법을 결합한 소프트웨어 프로젝트 수치 데이터용 결측값 대치법)

Lee, Dong-Ho;Yoon, Kyung-A;Bae, Doo-Hwan
- Journal of KIISE:Software and Applications
- /
- v.36 no.4
- /
- pp.273-282
- /
- 2009
Missing data is one of the common problems in building analysis or prediction models using software project data. Missing imputation methods are known to be more effective missing data handling method than deleting methods in small software project data. While K nearest neighbor imputation is a proper missing imputation method in the software project data, it cannot use non-missing information of incomplete project instances. In this paper, we propose an approach to missing data imputation for numerical software project data by combining K nearest neighbor and maximum likelihood estimation; we also extend the average absolute error measure by normalization for accurate evaluation. Our approach overcomes the limitation of K nearest neighbor imputation and outperforms on our real data sets.
PDF KSCI

Search Result 16, Processing Time 0.023 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)