• Title/Summary/Keyword: mixture of Gaussian model method

Lip-Synch System Optimization Using Class Dependent SCHMM (클래스 종속 반연속 HMM을 이용한 립싱크 시스템 최적화)

  • Lee, Sung-Hee;Park, Jun-Ho;Ko, Han-Seok
    • The Journal of the Acoustical Society of Korea / v.25 no.7 / pp.312-318 / 2006
  • The conventional lip-synch system has a two-step process: speech segmentation and recognition. However, the difficulty of the speech segmentation procedure and the inaccuracy of the training data set caused by the segmentation lead to significant performance degradation in the system. To cope with this, a connected vowel recognition method using the Head-Body-Tail (HBT) model is proposed. The HBT model, which is appropriate for relatively small vocabulary tasks, reflects the co-articulation effect efficiently. Moreover, the 7 vowels are merged into 3 classes with similar lip shapes, and the system is optimized by employing a class-dependent SCHMM structure. Additionally, at both ends of each word, where variation is large, an 8-component Gaussian mixture model is used directly to improve representational ability. Although the proposed method shows performance similar to that of the CHMM based on the HBT structure, the number of parameters is reduced by 33.92%. This reduction makes it a computationally efficient method that enables real-time operation.

Bayesian Image Denoising with Mixed Prior Using Hypothesis-Testing Problem (가설-검증 문제를 이용한 혼합 프라이어를 가지는 베이지안 영상 잡음 제거)

  • Eom Il-Kyu;Kim Yoo-Shin
    • Journal of the Institute of Electronics Engineers of Korea SP / v.43 no.3 s.309 / pp.34-42 / 2006
  • In general, most of the information is stored in only a few wavelet coefficients. This sparsity of the wavelet coefficients can be modeled by a mixture of a Gaussian probability density function and a point mass at zero, and denoising under this prior model is performed using Bayesian estimation. In this paper, we propose a parameter estimation method for denoising based on a hypothesis-testing problem. The hypothesis test is applied to the variance of the wavelet coefficients, and the $\chi^2$-test is used. Simulation results show that our method achieves about 0.3 dB higher PSNR (peak signal-to-noise ratio) than state-of-the-art denoising methods when orthogonal wavelets are used.
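
The prior described in this abstract (a Gaussian mixed with a point mass at zero) leads to a closed-form posterior-mean shrinkage rule when the noise is additive, zero-mean, and Gaussian. The following is a minimal sketch under those assumptions, not the authors' estimator: the function name `shrink` and the parameters `p_signal`, `var_signal`, and `var_noise` are hypothetical, and they are taken as known here, whereas the paper estimates its parameters with a hypothesis test.

```python
import math


def shrink(y, p_signal, var_signal, var_noise):
    """Posterior-mean estimate of a clean coefficient x from y = x + noise.

    Assumed prior: x ~ p_signal * N(0, var_signal) + (1 - p_signal) * delta(0).
    Assumed noise: N(0, var_noise), independent of x. Requires 0 < p_signal < 1.
    """
    var_total = var_signal + var_noise
    # Log marginal likelihood of y when the coefficient carries signal ...
    log_signal = (math.log(p_signal) - 0.5 * y * y / var_total
                  - 0.5 * math.log(2.0 * math.pi * var_total))
    # ... and when the coefficient is exactly zero (y is pure noise).
    log_zero = (math.log(1.0 - p_signal) - 0.5 * y * y / var_noise
                - 0.5 * math.log(2.0 * math.pi * var_noise))
    diff = log_zero - log_signal
    # Posterior probability that signal is present; guard exp() overflow.
    posterior = 0.0 if diff > 700.0 else 1.0 / (1.0 + math.exp(diff))
    # Wiener-style gain for the Gaussian component, weighted by the posterior.
    return posterior * (var_signal / var_total) * y
```

Small coefficients are pulled strongly toward zero because the point mass explains them well, while large coefficients are scaled by roughly `var_signal / (var_signal + var_noise)`.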

Multimodal Emotion Recognition using Face Image and Speech (얼굴영상과 음성을 이용한 멀티모달 감정인식)

  • Lee, Hyeon Gu;Kim, Dong Ju
    • Journal of Korea Society of Digital Industry and Information Management / v.8 no.1 / pp.29-40 / 2012
  • A challenging research issue of growing importance to those working in human-computer interaction is to endow a machine with emotional intelligence. Emotion recognition technology therefore plays an important role in human-computer interaction research, allowing more natural and more human-like communication between human and computer. In this paper, we propose a multimodal emotion recognition system that uses face and speech to improve recognition performance. For face-based emotion recognition, the distance measure is calculated by 2D-PCA of the MCS-LBP image and a nearest neighbor classifier; for speech-based emotion recognition, the likelihood measure is obtained by a Gaussian mixture model based on pitch and mel-frequency cepstral coefficient features. The individual matching scores obtained from face and speech are combined using a weighted summation, and the fused score is used to classify the human emotion. In the experiments, the proposed method improves recognition accuracy by about 11.25% to 19.75% compared with the uni-modal approaches. These results confirm that the proposed approach achieves a significant performance improvement.
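
The weighted-summation fusion mentioned in this abstract can be illustrated with a short sketch. The abstract does not say how the two scores are put on a common scale, so min-max normalization is an assumption here, as are the names `min_max`, `fuse`, and `face_weight`. Face scores are treated as distances (lower is better) and speech scores as log-likelihoods (higher is better).

```python
def min_max(values):
    """Scale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fuse(face_distances, speech_loglikes, face_weight=0.5):
    """Return (index of the winning class, list of fused per-class scores)."""
    # A small distance means a good match, so flip it into a similarity.
    face_sim = [1.0 - d for d in min_max(face_distances)]
    speech_sim = min_max(speech_loglikes)
    # Weighted sum of the two normalized per-class scores.
    fused = [face_weight * f + (1.0 - face_weight) * s
             for f, s in zip(face_sim, speech_sim)]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused
```

With three hypothetical classes, `fuse([0.9, 0.2, 0.5], [-120.0, -100.0, -90.0])` picks class 1 at equal weights, while lowering `face_weight` to 0.2 lets the speech score dominate and picks class 2.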

Automatic Extraction of UV patterns for Paper Money Inspection (지폐검사를 위한 UV 패턴의 자동추출)

  • Lee, Geon-Ho;Park, Tae-Hyoung
    • Journal of the Korean Institute of Intelligent Systems / v.21 no.3 / pp.365-371 / 2011
  • Most recently issued paper money includes security patterns that can be identified only under ultraviolet (UV) illumination. We propose a method for automatically extracting UV patterns for paper money inspection systems. The image acquired by a camera under UV illumination is transformed into input data through preprocessing. Then the Gaussian mixture model (GMM) and the split-and-merge expectation maximization (SMEM) algorithm are applied to segment the image represented by the input data. To extract the UV pattern from the segmented image, we develop a criterion based on the area of the covariance vector and the weight value. Experimental results on various kinds of paper money are presented to verify the usefulness of the proposed method.

Robust Speaker Recognition Against Utterance Variations (발성변화에 강인한 화자 인식에 관한 연구)

  • Lee Ki-Yong
    • Journal of Internet Computing and Services / v.5 no.2 / pp.69-73 / 2004
  • A speaker model in a speaker recognition system has to be trained from a large data set gathered in multiple sessions. A large data set requires a large amount of memory and computation, and moreover it is practically hard to make users utter the data in several sessions. Recently, incremental adaptation methods have been proposed to address these problems. However, a data set gathered from multiple sessions is vulnerable to outliers caused by irregular utterance variations and the presence of noise, which result in an inaccurate speaker model. In this paper, we propose an incremental robust adaptation method to minimize the influence of outliers on a Gaussian Mixture Model based speaker model. The robust adaptation is obtained from an incremental version of M-estimation. The speaker model is initially trained from a small amount of data and is then adapted recursively with the data available in each session. Experimental results on a data set gathered over seven months show that the proposed method is robust against outliers.

On-Road Car Detection System Using VD-GMM 2.0 (차량검출 GMM 2.0을 적용한 도로 위의 차량 검출 시스템 구축)

  • Lee, Okmin;Won, Insu;Lee, Sangmin;Kwon, Jangwoo
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.11 / pp.2291-2297 / 2015
  • This paper presents a vehicle detection system that takes video of moving vehicles as input. The input is constrained: it must be captured from a fixed viewpoint looking obliquely down on the road from above. Road detection is required so that only the road area of the input image is used. In the introduction, we present experimental results and the limitations of motion history image extraction, the SIFT (Scale-Invariant Feature Transform) algorithm, and histogram analysis for vehicle detection. To solve these problems, we propose an adapted Gaussian Mixture Model (GMM), the Vehicle Detection GMM (VDGMM). In addition, we optimize VDGMM to detect more vehicles and name the result VDGMM 2.0. In the experiments, precision, recall, and F1 are 9%, 53%, and 15%, respectively, for GMM without road detection, and 85%, 77%, and 80% for VDGMM 2.0 with road detection.
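
For readers comparing the reported figures, F1 is conventionally the harmonic mean of precision and recall. The sketch below uses that standard definition; the function name `f1_score` is chosen here for illustration, and the abstract does not state how the authors rounded their values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # Conventional value when both inputs are zero.
    return 2.0 * precision * recall / (precision + recall)
```

Applied to the rounded percentages in the abstract, `f1_score(0.09, 0.53)` is about 0.154 and `f1_score(0.85, 0.77)` is about 0.808, which agrees with the reported 15% and 80% to within rounding of the inputs.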

Graph Cut-based Automatic Color Image Segmentation using Mean Shift Analysis (Mean Shift 분석을 이용한 그래프 컷 기반의 자동 칼라 영상 분할)

  • Park, An-Jin;Kim, Jung-Whan;Jung, Kee-Chul
    • Journal of KIISE:Software and Applications / v.36 no.11 / pp.936-946 / 2009
  • Graph cuts have recently attracted considerable attention for image segmentation because they can globally minimize energy functions composed of a data term, which reflects how well each pixel fits the prior information for each class, and a smoothness term, which penalizes discontinuities between neighboring pixels. Previous approaches to graph cuts-based automatic image segmentation generally use Gaussian mixture models (GMMs), with the means and covariance matrices calculated by the EM algorithm serving as prior information for each cluster. However, this is practicable only for clusters with a hyper-spherical or hyper-ellipsoidal shape, because each cluster is represented by a covariance matrix centered on its mean. For arbitrarily shaped clusters, this paper proposes graph cuts-based image segmentation using mean shift analysis. As prior information for estimating the data term, we use the set of mean trajectories toward each mode from initial means randomly selected in $L^*u^*v^*$ color space. Since the mean shift procedure is computationally expensive, we transform features in the continuous feature space into a 3D discrete grid and use a 3D kernel based on the first moment in the grid to move the means to the modes. In the experiments, we investigate the problems of the recently popular mean shift-based and normalized cuts-based image segmentation methods, and the proposed method shows better performance than these two methods and than graph cuts-based automatic image segmentation using GMM on the Berkeley segmentation dataset.

A Study on User Authentication with Smartphone Accelerometer Sensor (스마트폰 가속도 센서를 이용한 사용자 인증 방법 연구)

  • Seo, Jun-seok;Moon, Jong-sub
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.6 / pp.1477-1484 / 2015
  • With the growth of the smartphone-based financial industry, interest in user authentication on smartphones has been rising. There are various types of biometric user authentication techniques, but gait recognition using the smartphone accelerometer does not seem to have developed remarkably. This paper proposes a user authentication method using the accelerometer embedded in a smartphone. Specifically, the method calibrates the sensor data with a 3D transformation, extracts features from the transformed data, applies principal component analysis, and trains a Gaussian mixture model (GMM). It then authenticates user data using a confidence interval of the GMM. As a result, the proposed method can authenticate users from smartphone accelerometer data with high accuracy (about 96%) even when environmental control and restrictions are kept to a minimum.

Text Independent Speaker Verification Using Dominant State Information of HMM-UBM (HMM-UBM의 주 상태 정보를 이용한 음성 기반 문맥 독립 화자 검증)

  • Shon, Suwon;Rho, Jinsang;Kim, Sung Soo;Lee, Jae-Won;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.34 no.2 / pp.171-176 / 2015
  • We present a speaker verification method that extracts i-vectors based on dominant state information of a Hidden Markov Model (HMM) - Universal Background Model (UBM). An ergodic HMM is used to estimate the UBM so that the various characteristics of individual speakers can be effectively classified. Unlike a Gaussian Mixture Model (GMM)-UBM based speaker verification system, the proposed system obtains an i-vector for each HMM state. Among them, the i-vector used as the feature is extracted from the specific state that contains the dominant state information. Experiments validating the performance of the proposed system are conducted on the National Institute of Standards and Technology (NIST) 2008 Speaker Recognition Evaluation (SRE) database. As a result, a 12% improvement is attained in terms of equal error rate.
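
The equal error rate (EER) cited in this abstract is the operating point where the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping every observed score as a threshold; the function name `equal_error_rate`, the accept-if-score-is-at-least-threshold convention, and the sample scores in the usage note are assumptions for illustration and are unrelated to the NIST evaluation tooling.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from two lists of verification scores.

    A trial is accepted when its score is >= the threshold, so higher
    scores are assumed to mean "more likely the claimed speaker".
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False acceptance: impostor trials that pass the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # False rejection: genuine trials that fail the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```

For example, with genuine scores `[0.9, 0.8, 0.7, 0.4]` and impostor scores `[0.6, 0.3, 0.2, 0.1]`, the two error rates meet at a threshold of 0.6, where one trial in four is misclassified on each side.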

Emotion Recognition using Pitch Parameters of Speech (음성의 피치 파라메터를 사용한 감정 인식)

  • Lee, Guehyun;Kim, Weon-Goo
    • Journal of the Korean Institute of Intelligent Systems / v.25 no.3 / pp.272-278 / 2015
  • This paper studies various parameter extraction methods that use the pitch information of speech for developing an emotion recognition system. For this purpose, pitch parameters were extracted from a Korean speech database containing various emotions using statistical information and numerical analysis techniques. A GMM-based emotion recognition system was used to compare the performance of the pitch parameters. A sequential feature selection method was used to select the parameters showing the best emotion recognition performance. In experiments on recognizing four emotions, a recognition rate of 63.5% was obtained using a combination of 15 parameters out of 56 pitch parameters. In experiments on detecting the presence of emotion, a recognition rate of 80.3% was obtained using a combination of 14 parameters.
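
Sequential feature selection, as named in this abstract, can be sketched as a greedy loop. The abstract does not say whether the forward or backward variant was used, so the forward variant is assumed here, and `sequential_forward_selection` and `score` are hypothetical names. In the paper's setting, `score` would be the recognition rate of a GMM classifier trained on the candidate subset; the sketch leaves it as a caller-supplied function.

```python
def sequential_forward_selection(n_features, score, k):
    """Greedily choose k feature indices out of n_features.

    `score(subset)` takes a list of feature indices and returns a number
    where higher is better (for example, a validation recognition rate).
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Try adding each unused feature and keep the one that helps most.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each step only looks one feature ahead, the result is not guaranteed to be the best subset of size `k`, but it needs far fewer evaluations than an exhaustive search over all combinations.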