• Title/Summary/Keyword: Mixture of Gaussian

Predicting Unknown Composition of a Mixture Using Independent Component Analysis (독립성분분석을 이용한 혼합물의 미지성분비율 예측)

  • Lee Hye-Seon;Song Jae-Kee;Park Hae-Sang;Jun Chi-Hyuck
    • The Korean Journal of Applied Statistics / v.19 no.1 / pp.135-148 / 2006
  • Independent component analysis (ICA) is a statistical method for transforming observed high-dimensional multivariate data into statistically independent components. ICA is increasingly applied in spectral analysis because it can extract the unknown components of a mixture from spectra. We focus on applying ICA to separate independent sources and to predict each composition from the extracted components. The theory of ICA is introduced and an application to metal surface spectra data is described, in which a subsequent non-negative least squares analysis predicts the composition ratio of each sample. Furthermore, simulation experiments are performed to demonstrate the performance of the proposed approach.

A Study on the Performance Variations of the Mobile Phone Speaker Verification System According to the Various Background Speaker Properties (휴대폰음성을 이용한 화자인증시스템에서 배경화자에 따른 성능변화에 관한 연구)

  • Choi, Hong-Sub
    • Speech Sciences / v.12 no.3 / pp.105-114 / 2005
  • A speaker verification system has been shown to improve its equal error rate (EER) by regularizing the log-likelihood ratio with background speaker models. Wireless mobile phones have recently become more dominant communication terminals than wired phones, so the need for speaker verification on mobile phones is growing rapidly. In this paper, we report experiments that examine the performance of speaker verification on mobile phone speech. In particular, we focus on how the EER varies with several properties of the background speakers: the selection method (MSC, MIX), the number of background speakers, and the aging of the speech database. For this purpose, we constructed a speaker verification system based on the Gaussian mixture model (GMM) and found that the MIX method is generally superior to the other method by about 1.0% EER. The EER decreases as the number of background speakers grows: with 6, 10, and 16 background speakers, the EERs are 13.0%, 12.2%, and 11.6%, respectively. The aging of the speech database had an unexpectedly large effect on performance: the EERs were 4%, 12%, and 19% for the seasonally recorded databases of sessions 1 to 3, respectively, with a gap of 3 months between sessions. Although the seasonal speech database has only 10 speakers with 10 sentences each, which limits the statistical confidence of the results, we confirmed that the enrolled speaker models in a speaker verification system should be updated regularly using the claimant's ongoing utterances.
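
The background-speaker idea in this abstract can be sketched as follows. This is only an illustration, not the authors' system: the 1-D GMMs, their parameters, the feature values, and the function names (`gmm_loglik`, `normalized_score`) are all hypothetical, and averaging the background likelihoods is one plausible way to form the ratio's denominator.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood under a 1-D GMM.

    gmm is a list of (weight, mean, variance) tuples.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]
        mx = max(terms)  # log-sum-exp for numerical stability
        total += mx + math.log(sum(math.exp(t - mx) for t in terms))
    return total / len(frames)

def normalized_score(frames, claimant, background):
    """Claimant log-likelihood minus the log of the mean background likelihood."""
    target = gmm_loglik(frames, claimant)
    bg = [gmm_loglik(frames, g) for g in background]
    mx = max(bg)
    bg_avg = mx + math.log(sum(math.exp(b - mx) for b in bg) / len(bg))
    return target - bg_avg

# Hypothetical models and utterances (made-up numbers, not from the paper).
claimant_gmm = [(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)]
background_gmms = [[(1.0, 5.0, 1.0)], [(1.0, -4.0, 1.5)]]

genuine_score = normalized_score([0.1, 1.8, 0.4, 2.2], claimant_gmm, background_gmms)
impostor_score = normalized_score([4.9, 5.3, 4.6], claimant_gmm, background_gmms)
print(genuine_score, impostor_score)
```

A claim would be accepted when the score exceeds a threshold; the EER is the operating point where the false acceptance and false rejection rates are equal.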

Speaker Recognition Performance Improvement by Voiced/Unvoiced Classification and Heterogeneous Feature Combination (유/무성음 구분 및 이종적 특징 파라미터 결합을 이용한 화자인식 성능 개선)

  • Kang, Jihoon;Jeong, Sangbae
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.6 / pp.1294-1301 / 2014
  • In this paper, separate probabilistic distribution models for voiced and unvoiced speech are estimated and used to improve speaker recognition performance. In addition to the conventional mel-frequency cepstral coefficients, skewness, kurtosis, and harmonic-to-noise ratio are extracted and used for voiced speech intervals. The two kinds of scores, for voiced and unvoiced speech, are linearly fused with the optimal weight found by exhaustive search. The performance of the proposed speaker recognizer is compared with that of a conventional recognizer that uses mel-frequency cepstral coefficients and a unified probabilistic distribution function based on the Gaussian mixture model. Experimental results show that the smaller the number of Gaussian mixture components, the greater the performance improvement achieved by the proposed algorithm.
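
Linear score fusion with an exhaustively searched weight, as mentioned in this abstract, might look like the sketch below. The trial data, the grid resolution, and the names `fuse` and `best_weight` are assumptions made for illustration; the paper's actual scores and search grid are not given here.

```python
def fuse(voiced, unvoiced, w):
    """Linear fusion: w weights the voiced score, (1 - w) the unvoiced score."""
    return w * voiced + (1.0 - w) * unvoiced

def best_weight(trials, steps=100):
    """Exhaustive grid search over w in [0, 1] for the highest identification accuracy.

    trials: list of (voiced_scores, unvoiced_scores, true_index), where each
    score list holds one score per candidate speaker.
    """
    best_w, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        correct = 0
        for voiced, unvoiced, truth in trials:
            fused = [fuse(v, u, w) for v, u in zip(voiced, unvoiced)]
            if fused.index(max(fused)) == truth:
                correct += 1
        acc = correct / len(trials)
        if acc > best_acc:  # keep the first weight that reaches the best accuracy
            best_w, best_acc = w, acc
    return best_w, best_acc

# Hypothetical trials with two candidate speakers each (made-up scores).
trials = [
    ([2.0, 1.0], [0.0, 0.5], 0),
    ([1.0, 1.2], [2.0, 0.0], 0),
    ([0.5, 1.5], [1.0, 0.8], 1),
]
w_opt, acc_opt = best_weight(trials)
print(w_opt, acc_opt)
```

In this toy set neither score stream alone classifies every trial correctly, but an intermediate weight does, which is the motivation for fusing the two streams.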

Implementation of HMM-Based Speech Recognizer Using TMS320C6711 DSP

  • Bae Hyojoon;Jung Sungyun;Son Jongmok;Kwon Hongseok;Kim Siho;Bae Keunsung
    • Proceedings of the IEEK Conference / summer / pp.391-394 / 2004
  • This paper focuses on the DSP implementation of an HMM-based speech recognizer that handles a vocabulary of several hundred words and is speaker independent. First, we develop an HMM-based speech recognition system on the PC that operates frame by frame, with feature extraction and Viterbi decoding processed in parallel to keep the processing delay as small as possible. Techniques such as linear discriminant analysis, state-based Gaussian selection, and the phonetic tied mixture model are employed to reduce the computational burden and memory size. The system is then optimized and compiled on the TMS320C6711 DSP for real-time operation. The implemented system uses 486 kbytes of memory for data and acoustic models, and 24.5 kbytes for program code. A maximum processing time of 29.2 ms per 32 ms frame of speech validates the real-time operation of the implemented system.
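
The real-time claim reduces to simple arithmetic on the two figures quoted in this abstract. The sketch below assumes that one new frame must be processed every 32 ms (that is, frames do not overlap); the abstract does not state the frame shift, so this is an assumption.

```python
# Figures quoted in the abstract.
frame_ms = 32.0        # duration of one speech frame
worst_case_ms = 29.2   # maximum processing time per frame

# Real-time factor: processing time divided by the audio time it covers.
# A value below 1.0 means processing keeps up with the incoming speech.
real_time_factor = worst_case_ms / frame_ms
headroom_ms = frame_ms - worst_case_ms

print(f"real-time factor: {real_time_factor:.4f}")
print(f"worst-case headroom per frame: {headroom_ms:.1f} ms")
```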

Emotion Recognition using Pitch Parameters of Speech (음성의 피치 파라메터를 사용한 감정 인식)

  • Lee, Guehyun;Kim, Weon-Goo
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.3 / pp.272-278 / 2015
  • This paper studies various parameter extraction methods that use the pitch information of speech for the development of an emotion recognition system. For this purpose, pitch parameters were extracted from a Korean speech database containing various emotions, using statistical information and numerical analysis techniques. A GMM-based emotion recognition system was used to compare the performance of the pitch parameters, and a sequential feature selection method was used to select the parameters giving the best emotion recognition performance. In experiments on recognizing four emotions, a combination of 15 of the 56 pitch parameters achieved a recognition rate of 63.5%. In experiments on detecting the presence of emotion, a combination of 14 parameters achieved a recognition rate of 80.3%.
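
Sequential feature selection can be sketched in its forward form as below. The abstract does not say which variant was used, so forward selection is an assumption, and the scoring function here is a made-up stand-in for what would be the recognition rate of a GMM classifier trained on the candidate subset.

```python
def sequential_forward_selection(n_features, score, k):
    """Greedily add the feature that most improves score(subset) until k are chosen."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical scoring function: each feature has an individual gain, and
# features 1 and 3 are redundant, so using both incurs a penalty.
GAINS = [0.1, 0.5, 0.3, 0.45]

def toy_score(subset):
    total = sum(GAINS[f] for f in subset)
    if 1 in subset and 3 in subset:
        total -= 0.4
    return total

chosen = sequential_forward_selection(len(GAINS), toy_score, k=2)
print(chosen)
```

The greedy search evaluates far fewer subsets than an exhaustive search over all combinations, at the cost of possibly missing the globally best subset.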

Optical and Near-Infrared Color Distributions of the NGC 4874 Globular Cluster System

  • Cho, Hye-Jeon;Blakeslee, John P.;Lee, Young-Wook
    • The Bulletin of The Korean Astronomical Society / v.37 no.1 / pp.61.1-61.1 / 2012
  • We examine both optical and optical/near-infrared (NIR) color distributions of the globular cluster (GC) system in the core of the Coma cluster of galaxies (Abell 1656), centered on the giant elliptical galaxy NGC 4874, to study how non-linearities in the color-metallicity relations of GC systems in large elliptical galaxies are linked to bimodal optical color distributions. Since optical-NIR color distributions of extragalactic GC systems reflect the underlying features of the metallicity distributions, we also present the color-color relation for this GC system. To do this, we combine F160W ($H_{160}$) NIR imaging data acquired with the Wide Field Camera 3 IR Channel (WFC3/IR), newly installed on the Hubble Space Telescope (HST), with F475W ($g_{475}$) and F814W ($I_{814}$) optical imaging data from the HST Advanced Camera for Surveys (ACS). To quantitatively characterize the features of the color distributions, we use the Gaussian Mixture Modeling (GMM) code. Finally, we show the radial distribution of the GCs in the field of NGC 4874.
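
Fitting a two-component Gaussian mixture to a color distribution can be sketched with a plain EM loop. This is not the GMM code used by the authors; the synthetic colors, the median-split initialization, and the separation statistic `d` are illustrative assumptions.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def fit_two_gaussians(data, iters=200):
    """EM for a 1-D two-component Gaussian mixture; returns (weights, means, variances)."""
    data = sorted(data)
    n = len(data)
    lo, hi = data[: n // 2], data[n // 2:]
    means = [sum(lo) / len(lo), sum(hi) / len(hi)]  # initialize from a median split
    grand_mean = sum(data) / n
    var0 = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            s = p[0] + p[1]
            resp.append((p[0] / s, p[1] / s))
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk, 1e-6
            )
    return weights, means, variances

# Synthetic colors (made-up values): a blue and a red subpopulation.
random.seed(0)
colors = [random.gauss(0.9, 0.05) for _ in range(150)]
colors += [random.gauss(1.2, 0.06) for _ in range(250)]

weights, means, variances = fit_two_gaussians(colors)
# Peak separation in units of the combined width (illustrative statistic).
d = abs(means[0] - means[1]) / math.sqrt((variances[0] + variances[1]) / 2)
print(weights, means, d)
```

A full bimodality analysis would also compare the two-component fit against a single Gaussian, for example with a likelihood-ratio or bootstrap test, which this sketch omits.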

Optical and Near-IR Photometry of the NGC 4874 Globular Cluster System with the Hubble Space Telescope

  • Cho, Hyejeon;Blakeslee, John P.;Peng, Eric W.;Lee, Young-Wook
    • The Bulletin of The Korean Astronomical Society / v.38 no.2 / pp.37.1-37.1 / 2013
  • We present a study of the photometric properties of the globular cluster (GC) system that resides in the extended halo of NGC 4874, the central bright galaxy of the Coma cluster. The core of the Coma cluster of galaxies (Abell 1656) was observed with both the HST Advanced Camera for Surveys (ACS) in the F475W (g475) and F814W (I814) filters and the Wide Field Camera 3 IR Channel (WFC3/IR) in the F160W (H160) filter. The data analysis procedure and GC candidate selection criteria are briefly described. We investigate the interesting "tilt" features in the color-magnitude diagrams of this GC system and their link to the nonlinear color-metallicity relation for GCs. The NGC 4874 GC system exhibits a bimodal distribution in the optical g475-I814 color, and well over half of the GCs fall on the red side at g475-I814 ~ 1.1. This bimodality is weakened in the optical-IR I814-H160 color; a quantitative analysis of the features of both color distributions using the Gaussian Mixture Modeling code shows that the bimodalities are different. Thus, the two colors cannot both linearly reflect the bimodality of an underlying metallicity, which supports the suggestion that the observed bimodalities in extragalactic GC colors are a metallicity-to-color projection effect.

Improving the Performance of the Continuous Speech Recognition by Estimating Likelihoods of the Phonetic Rules (음소변동규칙의 적합도 조정을 통한 연속음성인식 성능향상)

  • Na, Min-Soo;Chung, Min-Hwa
    • Proceedings of the KSPS conference / 2006.11a / pp.80-83 / 2006
  • The purpose of this paper is to build a pronunciation lexicon in which the likelihoods of the phonetic rules are estimated from actual phonetic realizations, and thereby to improve the performance of continuous speech recognition (CSR) using that dictionary. In the baseline system, the phonetic rules and their application probabilities are defined from knowledge of Korean phonology and experimental tuning. The advantage of this approach is that the phonetic rules are easy to implement and give stable results on general domains. A possible drawback, however, is that it is hard to reflect the characteristics of the phonetic realizations in a specific domain. To make the system reflect phonetic realizations, the likelihoods of the phonetic rules are reestimated from the statistics of the realized phonemes using a forced-alignment method. In our experiment, we generate new lexica that include the pronunciation variants created by the reestimated phonetic rules, and we test their performance with 12 Gaussian mixture HMMs and back-off bigrams. The proposed method reduced the WER by 0.42%.

Moving Object Detection Robust to Sudden Illumination Change using Modified Texture Information (개선된 텍스쳐 정보를 이용한 갑작스러운 조명 변화에 강인한 이동 물체 탐지)

  • O, Yoe-Han;Chang, Hyung-Jin;Kim, Soo-Wan;Choi, Jin-Young
    • Proceedings of the KIEE Conference / 2008.10b / pp.268-269 / 2008
  • Moving object detection is a fundamental technique in visual surveillance. Robust techniques are required to maintain detection performance under the adverse conditions found in real outdoor environments. When the illumination changes suddenly outdoors, many objects are classified as moving even though only their illumination has changed, which makes the detection result untrustworthy. In this paper, we propose a moving object detection method that is robust to sudden illumination change, using a Gaussian mixture background model together with new texture information computed from a background formed by the weighted sum of recent images.

Implementation of HMM Based Speech Recognizer with Medium Vocabulary Size Using TMS320C6201 DSP (TMS320C6201 DSP를 이용한 HMM 기반의 음성인식기 구현)

  • Jung, Sung-Yun;Son, Jong-Mok;Bae, Keun-Sung
    • The Journal of the Acoustical Society of Korea / v.25 no.1E / pp.20-24 / 2006
  • In this paper, we focus on the real-time implementation of a medium-vocabulary speech recognition system, considering its application to a mobile phone. First, we developed a PC-based variable-vocabulary word recognizer whose program memory and acoustic models are as small as possible. To reduce the memory size of the acoustic models, linear discriminant analysis and phonetic tied mixtures were applied in feature selection and HMM training, respectively. In addition, a state-based Gaussian selection method with real-time cepstral normalization was used to reduce the computational load and make recognition robust. We then verified the real-time operation of the implemented recognition system on the TMS320C6201 EVM board. The implemented system uses about 610 kbytes of memory, including both program memory and data memory. The recognition rate was 95.86% for ETRI 445DB, and 96.4%, 97.92%, and 87.04% for three kinds of name databases collected through mobile phones.