• Title/Summary/Keyword: Gaussian mixture method

Estimation of Mixture Numbers of GMM for Speaker Identification (화자 식별을 위한 GMM의 혼합 성분의 개수 추정)

  • Lee, Youn-Jeong;Lee, Ki-Yong
    • Speech Sciences / v.11 no.2 / pp.237-245 / 2004
  • In general, a Gaussian mixture model (GMM) is used to estimate the speaker model for speaker identification. The parameter estimates of the GMM are obtained by using the expectation-maximization (EM) algorithm for maximum likelihood (ML) estimation. However, if the number of mixtures is not well defined in the GMM, those parameters are estimated inappropriately. Finding the number of components is therefore important for estimating the optimal parameters of a mixture model. In this paper, to estimate the optimal number of mixtures, we propose a method that starts from a sufficiently large number of mixtures and then reduces that number by investigating the mutual information between mixtures of the GMM. As a result, we can estimate the optimal number of mixtures. The effectiveness of the proposed method is shown by an experiment using artificial data. We also performed speaker identification with the proposed method and compared it with other approaches.

Estimation of Spatial Distribution Using the Gaussian Mixture Model with Multivariate Geoscience Data (다변량 지구과학 데이터와 가우시안 혼합 모델을 이용한 공간 분포 추정)

  • Kim, Ho-Rim;Yu, Soonyoung;Yun, Seong-Taek;Kim, Kyoung-Ho;Lee, Goon-Taek;Lee, Jeong-Ho;Heo, Chul-Ho;Ryu, Dong-Woo
    • Economic and Environmental Geology / v.55 no.4 / pp.353-366 / 2022
  • Spatial estimation of geoscience data (geo-data) is challenging due to spatial heterogeneity, data scarcity, and high dimensionality. A novel spatial estimation method is needed that accounts for these characteristics of geo-data. In this study, we propose applying the Gaussian Mixture Model (GMM), a machine learning algorithm, to multivariate data for robust spatial prediction. The performance of the proposed approach was tested on soil chemical concentration data from a former smelting area. The concentrations of As and Pb determined by ex-situ ICP-AES were the primary variables to be interpolated, while the other metal concentrations by ICP-AES and all data determined by in-situ portable X-ray fluorescence (PXRF) were used as auxiliary variables in GMM and ordinary cokriging (OCK). Among the multidimensional auxiliary variables, important variables were selected using a variable selection method based on the random forest. The results of GMM with important multivariate auxiliary data decreased the root mean-squared error (RMSE) down to 0.11 for As and 0.33 for Pb and increased the correlations (r) up to 0.31 for As and 0.46 for Pb compared to those from ordinary kriging and OCK using univariate or bivariate data. The use of GMM improved the performance of spatial interpretation of anthropogenic metals in soil. The multivariate spatial approach can be applied to understand complex and heterogeneous geological and geochemical features.

A Variable Parameter Model based on SSMS for an On-line Speech and Character Combined Recognition System (음성 문자 공용인식기를 위한 SSMS 기반 가변 파라미터 모델)

  • 석수영;정호열;정현열
    • The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.528-538 / 2003
  • An SCCRS (Speech and Character Combined Recognition System) is developed to run on mobile devices such as PDAs (Personal Digital Assistants). In the SCCRS, feature extraction is carried out separately for speech and for handwritten characters, but recognition is performed in a common engine. The recognition engine is essentially based on a CHMM (Continuous Hidden Markov Model) with a variable parameter topology, in order to minimize the number of model parameters and to reduce recognition time. To generate a context-independent variable parameter model, we propose SSMS (Successive State and Mixture Splitting), which yields appropriate numbers of mixtures and states through splitting in the mixture domain and in the time domain. The recognition results show that the proposed SSMS method can reduce the total number of GOPDDs (Gaussian Output Probability Density Distributions) by up to 40.0% compared to the conventional method with a fixed parameter model, at the same recognition performance in a speech recognition system.

Video Based Pedestrian Height Estimation Using Wiener Optimization (위너 최적화 기법을 이용한 영상기반 보행자 키 추정)

  • Jeon, Sang Hee;Song, Jong Kwan;Park, Jang Sik;Yoon, Byung Woo
    • Journal of Korea Multimedia Society / v.19 no.2 / pp.264-270 / 2016
  • In this paper, we propose a method that detects pedestrians in CCTV video and estimates the height of the detected objects. We separate the foreground using a Gaussian mixture model, and pedestrians are detected using conditions such as the width-height ratio and the size of the candidate objects. To obtain the optimal model for estimating pedestrian height, we collect a large amount of training data from pedestrians whose heights are known. Using these training data, we designed an optimal Wiener height estimator and used it to estimate the height of pedestrians. The height of pedestrians at various distances is estimated and the accuracy is evaluated. The experimental results show that the proposed method can effectively estimate pedestrian height at various positions.

Applying feature normalization based on pole filtering to short-utterance speech recognition using deep neural network (심층신경망을 이용한 짧은 발화 음성인식에서 극점 필터링 기반의 특징 정규화 적용)

  • Han, Jaemin;Kim, Min Sik;Kim, Hyung Soon
    • The Journal of the Acoustical Society of Korea / v.39 no.1 / pp.64-68 / 2020
  • In a conventional speech recognition system using Gaussian Mixture Model-Hidden Markov Model (GMM-HMM), the cepstral feature normalization method based on pole filtering was effective in improving the performance of recognition of short utterances in noisy environments. In this paper, the usefulness of this method for the state-of-the-art speech recognition system using Deep Neural Network (DNN) is examined. Experimental results on AURORA 2 DB show that the cepstral mean and variance normalization based on pole filtering improves the recognition performance of very short utterances compared to that without pole filtering, especially when there is a large mismatch between the training and test conditions.
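
The cepstral mean and variance normalization mentioned in this abstract can be sketched as below. This is a minimal per-utterance version only: the pole-filtering step studied in the paper is not reproduced, and the function name `cmvn`, the `eps` guard, and the toy feature values are assumptions made for illustration.

```python
import math

def cmvn(frames, eps=1e-8):
    """Normalize each cepstral coefficient to zero mean and unit variance
    over one utterance.

    frames: list of equal-length lists, one feature vector per frame.
    eps: small constant guarding against division by zero (an assumption).
    """
    n_frames = len(frames)
    n_coef = len(frames[0])
    # Per-coefficient mean over all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coef)]
    # Per-coefficient (population) standard deviation.
    stds = [
        math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n_frames)
        for k in range(n_coef)
    ]
    return [
        [(f[k] - means[k]) / (stds[k] + eps) for k in range(n_coef)]
        for f in frames
    ]

# Toy 3-frame, 2-coefficient utterance (illustrative values only).
utterance = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = cmvn(utterance)
```

For very short utterances the mean and variance are estimated from few frames, which is the situation the abstract is concerned with.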

Batch Time Interval and Initial State Estimation using GMM-TS for Target Motion Analysis (GMM-TS를 이용한 표적기동분석용 배치구간 및 초기상태 추정 기법)

  • Kim, Woo-Chan;Song, Taek-Lyul
    • Journal of Institute of Control, Robotics and Systems / v.18 no.3 / pp.285-294 / 2012
  • When only bearing measurements are available, the target motion state cannot be obtained directly, so TMA (Target Motion Analysis) is needed. TMA is a nonlinear estimation technique used in passive SONAR systems and is one of the important techniques for underwater combat management systems. TMA can be divided into two parts: batch estimation and sequential estimation. Although sequential estimation is preferable for reducing the computational load and for adapting to target maneuvers, batch estimation is still required to obtain the target initial state vector for convergence of the sequential estimation. Selection of the batch time interval, which depends on observability, is critical to TMA performance. Batch estimation in general uses a predetermined batch time interval. In this paper, we propose a new method called BTIS (Batch Time Interval and Initial State Estimation). The proposed BTIS estimates the target initial state and determines the batch time interval sequentially by using a bank of GMM-TS (Gaussian Mixture Measurement-Track Splitting) filters. The performance of the proposed method is verified by a Monte Carlo simulation study.

Data Detection Algorithm Based on GMM in the Acoustic Data Transmission System (음향 데이터 전송 시스템의 강인한 데이터 검출 성능을 위한 Gaussian Mixture Model 기반 연구)

  • Song, Ji-Hyun;Chang, Joon-Hyuk;Kim, Moon-Kee;Kim, Dong-Keon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.4 / pp.136-141 / 2011
  • In this paper, we propose an approach to improve the data detection performance of the acoustic data transmission system based on the modulated complex lapped transform (MCLT). We first present an analysis of the features and the data detection method in the acoustic data transmission system. Feature vectors to be applied to the Gaussian mixture model (GMM) are then selected from relevant parameters of the previous system for efficient data detection. To evaluate the performance of the proposed algorithm, the bit error rate (BER) of the received data was measured in different environments (music genres (rock, pop, classic, jazz) and distances of 1 m to 5 m from the loudspeaker to the microphone in an office room); the proposed algorithm yields better results than the conventional MCLT-based acoustic data transmission scheme.

Dynamic Control of Learning Rate in the Improved Adaptive Gaussian Mixture Model for Background Subtraction (배경분리를 위한 개선된 적응적 가우시안 혼합모델에서의 동적 학습률 제어)

  • Kim, Young-Ju
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / v.9 no.2 / pp.366-369 / 2005
  • Background subtraction is mainly used for the real-time extraction and tracking of moving objects from image sequences. In outdoor environments, many changing factors, such as gradually changing illumination, swaying trees, and suddenly moving objects, must be considered for adaptive processing. Normally, a GMM (Gaussian Mixture Model) is used to subtract the background adaptively in response to the various changes in a scene, and adaptive GMMs that improve real-time performance have been developed. For on-line background subtraction, this paper applies the improved adaptive GMM, which uses a small constant for the learning rate $\alpha$ and therefore cannot adapt quickly to the sudden movement of objects. This paper therefore proposes and evaluates a method for dynamically controlling $\alpha$ using the adaptive selection of the number of component distributions and the global variances of pixel values.
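
The role of the learning rate $\alpha$ in an adaptive per-pixel GMM can be sketched as follows. This is a simplified grayscale update in the style of the widely used online mixture-of-Gaussians background model, not the paper's improved adaptive GMM or its dynamic control rule. The function name `update_pixel_gmm`, the matching threshold `match_sigmas`, the initial variance `init_var`, and the use of $\alpha$ for the mean and variance updates as well as the weights are all assumptions made for illustration.

```python
def update_pixel_gmm(components, x, alpha, match_sigmas=2.5, init_var=900.0):
    """One online update of a per-pixel grayscale Gaussian mixture.

    components: list of dicts with keys "w" (weight), "mu" (mean), "var" (variance).
    x: new pixel intensity; alpha: learning rate in (0, 1).
    Returns the index of the matched (or replaced) component.
    """
    # Find the first component whose mean is within match_sigmas std devs of x.
    matched = None
    for i, c in enumerate(components):
        if (x - c["mu"]) ** 2 <= (match_sigmas ** 2) * c["var"]:
            matched = i
            break

    if matched is None:
        # No match: replace the weakest component with a wide Gaussian at x.
        matched = min(range(len(components)), key=lambda i: components[i]["w"])
        components[matched] = {"w": alpha, "mu": x, "var": init_var}
    else:
        # Weights decay by (1 - alpha); only the matched one gains alpha.
        for i, c in enumerate(components):
            hit = 1.0 if i == matched else 0.0
            c["w"] = (1.0 - alpha) * c["w"] + alpha * hit
        # Move the matched mean toward x, then update its variance.
        c = components[matched]
        c["mu"] += alpha * (x - c["mu"])
        diff = x - c["mu"]
        c["var"] = (1.0 - alpha) * c["var"] + alpha * diff * diff

    # Renormalize so the weights sum to one.
    total = sum(c["w"] for c in components)
    for c in components:
        c["w"] /= total
    return matched

# Two-component toy model for one pixel (illustrative values only).
model = [
    {"w": 0.7, "mu": 100.0, "var": 25.0},
    {"w": 0.3, "mu": 200.0, "var": 25.0},
]
idx = update_pixel_gmm(model, 104.0, alpha=0.05)
```

With a small constant $\alpha$, each frame moves the weights and means only slightly, which illustrates why a fixed small learning rate adapts slowly to sudden changes.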

Implementation of An Unmanned Visual Surveillance System with Embedded Control (임베디드 제어에 의한 무인 영상 감시시스템 구현)

  • Kim, Dong-Jin;Jung, Yong-Bae;Park, Young-Seak;Kim, Tae-Hyo
    • Journal of the Institute of Convergence Signal Processing / v.12 no.1 / pp.13-19 / 2011
  • In this paper, a visual surveillance system was implemented using an SOPC-based NIOS II embedded processor and the C2H compiler. In this system, IPs for camera image output, image processing, serial communication, and network communication are constructed with the C2H compiler, and each IP is controlled effectively by the SOPC and the NIOS II embedded processor. In addition, an algorithm that updates the background image for high-speed, robust detection of moving objects is proposed using the Adaptive Gaussian Mixture Model (AGMM). As a result, the system can detect moving objects (pedestrians and vehicles) in both daytime and nighttime. Our experiments confirm that the proposed AGMM algorithm performs better than the Adaptive Threshold Method (ATM) and the Gaussian Mixture Model (GMM).

Feature extraction method using graph Laplacian for LCD panel defect classification (LCD 패널 상의 불량 검출을 위한 스펙트럴 그래프 이론에 기반한 특성 추출 방법)

  • Kim, Gyu-Dong;Yoo, Suk-I.
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.522-524 / 2012
  • For exact classification of defects, good feature selection and a good classifier are necessary. In this paper, various features such as brightness, shape, and statistical features are described, and a Bayes classifier using a Gaussian mixture model is used as the classifier. A feature extraction method based on spectral graph theory is also presented. Experimental results show that the feature extraction method using the graph Laplacian performs better than the one using PCA.
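
A Bayes classifier with Gaussian mixture class-conditional densities, as named in this abstract, can be sketched as below for a single scalar feature. The class labels, mixture parameters, and priors are hypothetical values chosen for illustration; the paper's actual features and trained models are not given in the abstract.

```python
import math

def gmm_pdf(x, mixture):
    """Density of a 1-D Gaussian mixture.

    mixture: list of (weight, mean, variance) tuples whose weights sum to one.
    """
    return sum(
        w * math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
        for w, mu, var in mixture
    )

def bayes_classify(x, class_models, priors):
    """Return the label maximizing prior * class-conditional GMM likelihood."""
    return max(
        class_models,
        key=lambda label: priors[label] * gmm_pdf(x, class_models[label]),
    )

# Hypothetical class models over one scalar feature (illustrative values only).
class_models = {
    "scratch": [(0.6, 0.0, 1.0), (0.4, 2.0, 0.5)],
    "stain": [(1.0, 6.0, 1.0)],
}
priors = {"scratch": 0.5, "stain": 0.5}

label_low = bayes_classify(0.5, class_models, priors)
label_high = bayes_classify(6.0, class_models, priors)
```

In practice each class mixture would be fitted to training features, for example with the EM algorithm, before classification.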