• Title/Summary/Keyword: GMM method

Search Result 300, Processing Time 0.023 seconds

Improved Minimum Statistics Based on Environment-Awareness for Noise Power Estimation (환경인식 기반의 향상된 Minimum Statistics 잡음전력 추정기법)

  • Son, Young-Ho;Choi, Jae-Hun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.30 no.3
    • /
    • pp.123-128
    • /
    • 2011
  • In this paper, we propose the improved noise power estimation in speech enhancement under various noise environments. The previous MS algorithm tracking the minimum value of finite search window uses the optimal power spectrum of signal for smoothing and adopts minimum probability. From the investigation of the previous MS-based methods it can be seen that a fixed size of the minimum search window is assumed regardless of the various environment. To achieve the different search window size, we use the noise classification algorithm based on the Gaussian mixture model (GMM). Performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various noise environments. Based on this, we show that the proposed algorithm yields better result compared to the conventional MS method.

Emotion Recognition Algorithm Based on Minimum Classification Error incorporating Multi-modal System (최소 분류 오차 기법과 멀티 모달 시스템을 이용한 감정 인식 알고리즘)

  • Lee, Kye-Hwan;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.46 no.4
    • /
    • pp.76-81
    • /
    • 2009
  • We propose an effective emotion recognition algorithm based on the minimum classification error (MCE) incorporating multi-modal system The emotion recognition is performed based on a Gaussian mixture model (GMM) based on MCE method employing on log-likelihood. In particular, the reposed technique is based on the fusion of feature vectors based on voice signal and galvanic skin response (GSR) from the body sensor. The experimental results indicate that performance of the proposal approach based on MCE incorporating the multi-modal system outperforms the conventional approach.

A Neuro-Fuzzy Modeling using the Hierarchical Clustering and Gaussian Mixture Model (계층적 클러스터링과 Gaussian Mixture Model을 이용한 뉴로-퍼지 모델링)

  • Kim, Sung-Suk;Kwak, Keun-Chang;Ryu, Jeong-Woong;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.5
    • /
    • pp.512-519
    • /
    • 2003
  • In this paper, we propose a neuro-fuzzy modeling to improve the performance using the hierarchical clustering and Gaussian Mixture Model(GMM). The hierarchical clustering algorithm has a property of producing unique parameters for the given data because it does not use the object function to perform the clustering. After optimizing the obtained parameters using the GMM, we apply them as initial parameters for Adaptive Network-based Fuzzy Inference System. Here, the number of fuzzy rules becomes to the cluster numbers. From this, we can improve the performance index and reduce the number of rules simultaneously. The proposed method is verified by applying to a neuro-fuzzy modeling for Box-Jenkins s gas furnace data and Sugeno's nonlinear system, which yields better results than previous oiles.

LSTM RNN-based Korean Speech Recognition System Using CTC (CTC를 이용한 LSTM RNN 기반 한국어 음성인식 시스템)

  • Lee, Donghyun;Lim, Minkyu;Park, Hosung;Kim, Ji-Hwan
    • Journal of Digital Contents Society
    • /
    • v.18 no.1
    • /
    • pp.93-99
    • /
    • 2017
  • A hybrid approach using Long Short Term Memory (LSTM) Recurrent Neural Network (RNN) has showed great improvement in speech recognition accuracy. For training acoustic model based on hybrid approach, it requires forced alignment of HMM state sequence from Gaussian Mixture Model (GMM)-Hidden Markov Model (HMM). However, high computation time for training GMM-HMM is required. This paper proposes an end-to-end approach for LSTM RNN-based Korean speech recognition to improve learning speed. A Connectionist Temporal Classification (CTC) algorithm is proposed to implement this approach. The proposed method showed almost equal performance in recognition rate, while the learning speed is 1.27 times faster.

Classification of Seoul Metro Stations Based on Boarding/ Alighting Patterns Using Machine Learning Clustering (기계학습 클러스터링을 이용한 승하차 패턴에 따른 서울시 지하철역 분류)

  • Min, Meekyung
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.18 no.4
    • /
    • pp.13-18
    • /
    • 2018
  • In this study, we classify Seoul metro stations according to boarding and alighting patterns using machine earning technique. The target data is the number of boarding and alighting passengers per hour every day at 233 subway stations from 2008 to 2017 provided by the public data portal. Gaussian mixture model (GMM) and K-means clustering are used as machine learning techniques in order to classify subway stations. The distribution of the boarding time and the alighting time of the passengers can be modeled by the Gaussian mixture model. K-means clustering algorithm is used for unsupervised learning based on the data obtained by GMM modeling. As a result of the research, Seoul metro stations are classified into four groups according to boarding and alighting patterns. The results of this study can be utilized as a basic knowledge for analyzing the characteristics of Seoul subway stations and analyzing it economically, socially and culturally. The method of this research can be applied to public data and big data in areas requiring clustering.

Dynamic Control of Learning Rate in the Improved Adaptive Gaussian Mixture Model for Background Subtraction (배경분리를 위한 개선된 적응적 가우시안 혼합모델에서의 동적 학습률 제어)

  • Kim, Young-Ju
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • v.9 no.2
    • /
    • pp.366-369
    • /
    • 2005
  • Background subtraction is mainly used for the real-time extraction and tracking of moving objects from image sequences. In the outdoor environment, there are many changeable factor such as gradually changing illumination, swaying trees and suddenly moving objects, which are to be considered for the adaptive processing. Normally, GMM(Gaussian Mixture Model) is used to subtract the background adaptively considering the various changes in the scenes, and the adaptive GMMs improving the real-time performance were worked. This paper, for on-line background subtraction, applied the improved adaptive GMM, which uses the small constant for learning rate ${\alpha}$ and is not able to speedily adapt the suddenly movement of objects, So, this paper proposed and evaluated the dynamic control method of ${\alpha}$ using the adaptive selection of the number of component distributions and the global variances of pixel values.

  • PDF

Performance Enhancement for Speaker Verification Using Incremental Robust Adaptation in GMM (가무시안 혼합모델에서 점진적 강인적응을 통한 화자확인 성능개선)

  • Kim, Eun-Young;Seo, Chang-Woo;Lim, Yong-Hwan;Jeon, Seong-Chae
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.3
    • /
    • pp.268-272
    • /
    • 2009
  • In this paper, we propose a Gaussian Mixture Model (GMM) based incremental robust adaptation with a forgetting factor for the speaker verification. Speaker recognition system uses a speaker model adaptation method with small amounts of data in order to obtain a good performance. However, a conventional adaptation method has vulnerable to the outlier from the irregular utterance variations and the presence noise, which results in inaccurate speaker model. As time goes by, a rate in which new data are adapted to a model is reduced. The proposed algorithm uses an incremental robust adaptation in order to reduce effect of outlier and use forgetting factor in order to maintain adaptive rate of new data on GMM based speaker model. The incremental robust adaptation uses a method which registers small amount of data in a speaker recognition model and adapts a model to new data to be tested. Experimental results from the data set gathered over seven months show that the proposed algorithm is robust against outliers and maintains adaptive rate of new data.

Estimation of the optimal heated inlet air temperature for the beta-ray absorption method: analysis of the PM10 concentration difference by different methods in coastal areas

  • Shin, So Eun;Jung, Chang Hoon;Kim, Yong Pyo
    • Advances in environmental research
    • /
    • v.1 no.1
    • /
    • pp.69-82
    • /
    • 2012
  • Based on the measurement data of the particulate matter with an aerodynamic diameter of less than or equal to a nominal 10 ${\mu}m$ (PM10) by the ${\beta}$-ray absorption method (BAM) equipped with an inlet heater and the gravimetric method (GMM) at two coastal sites in Korea, the optimal inlet heater temperature was estimated. By using a gas/particle equilibrium model, Simulating Composition of Atmospheric Particles at Equilibrium 2 (SCAPE2), water content in aerosols was estimated with varying temperature to find the optimal temperature increase to make the PM10 concentration by BAM comparable to that by GMM. It was estimated that the heated air temperature inside the BAM should be increased up to $35{\sim}45^{\circ}C$ at both sites. At this temperature range, evaporation of volatile aerosol components was minor. Similar ($30{\sim}50^{\circ}C$) temperature range was also obtained from the calculation based on the absolute humidity which changed with ambient absolute humidity and chemical composition of hygroscopic species.

Realization a Text Independent Speaker Identification System with Frame Level Likelihood Normalization (프레임레벨유사도정규화를 적용한 문맥독립화자식별시스템의 구현)

  • 김민정;석수영;김광수;정현열
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.3 no.1
    • /
    • pp.8-14
    • /
    • 2002
  • In this paper, we realized a real-time text-independent speaker recognition system using gaussian mixture model, and applied frame level likelihood normalization method which shows its effects in verification system. The system has three parts as front-end, training, recognition. In front-end part, cepstral mean normalization and silence removal method were applied to consider speaker's speaking variations. In training, gaussian mixture model was used for speaker's acoustic feature modeling, and maximum likelihood estimation was used for GMM parameter optimization. In recognition, likelihood score was calculated with speaker models and test data at frame level. As test sentences, we used text-independent sentences. ETRI 445 and KLE 452 database were used for training and test, and cepstrum coefficient and regressive coefficient were used as feature parameters. The experiment results show that the frame-level likelihood method's recognition result is higher than conventional method's, independently the number of registered speakers.

  • PDF

Automatic Equalizer Control Method Using Music Genre Classification in Automobile Audio System (음악 장르 분류를 이용한 자동차 오디오 시스템에서의 이퀄라이저 자동 조절 방식)

  • Kim, Hyoung-Gook;Nam, Sang-Soon
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.8 no.4
    • /
    • pp.33-38
    • /
    • 2009
  • This paper proposes an automatic equalizer control method in automobile audio system. The proposed method discriminates the music segment from the consecutive real-time audio stream of the radio and the equalizer is controlled automatically according to the classified genre of the music segment. For enhancing the accuracy of the music genre classification in real-time, timbre feature and rhythm feature extracted from the consecutive audio stream is applied to GMM(Gaussian mixture model) classifier. The proposed method evaluates the performance of the music genre classification, which classified various audio segments segmented from the audio signal of the radio broadcast in automobile audio system into one of five music genres.

  • PDF