Search | Korea Science

Fast Speaker Identification Using a Universal Background Model Clustering Method (Universal Background Model 클러스터링 방법을 이용한 고속 화자식별)

Park, Jumin;Suh, Youngjoo;Kim, Hoirin
- The Journal of the Acoustical Society of Korea, v.33 no.3, pp.216-224, 2014
In this paper, we propose a new method to drastically reduce computational complexity in Gaussian Mixture Model (GMM)-based Speaker Identification (SI). In general, GMM-based SI systems have very high computational complexity, proportional to the length of the test utterance, the number of enrolled speakers, and the GMM size. This cost makes SI systems difficult to use in many real applications despite their broad applicability. The trade-off between computational complexity and identification accuracy is therefore a primary issue for practical applications. To reduce computational complexity sharply with little loss of accuracy, we introduce a method based on Universal Background Model (UBM) clustering and show that it can be used successfully in real-time applications. In experiments with the proposed algorithm, we obtained a speed-up factor of 6 with a negligible loss of accuracy.
https://doi.org/10.7776/ASK.2014.33.3.216
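
To make the cost described in the abstract concrete, the following is a minimal sketch of baseline GMM speaker identification with diagonal covariances. It is not the paper's UBM clustering method, which the abstract does not detail. The function names, the `(weight, mean, var)` tuple layout, and the toy data are all assumptions made for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def identify(frames, speaker_gmms):
    """Return (best-scoring speaker, number of Gaussian evaluations performed)."""
    scores = {
        name: sum(frame_loglik(x, gmm) for x in frames)
        for name, gmm in speaker_gmms.items()
    }
    # Cost grows with frames x speakers x mixtures, as the abstract states.
    evaluations = len(frames) * sum(len(gmm) for gmm in speaker_gmms.values())
    return max(scores, key=scores.get), evaluations

# Toy data (hypothetical): two speakers, two mixtures each, three 2-D frames.
speaker_gmms = {
    "A": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "B": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
frames = [[0.1, 0.2], [0.9, 1.1], [0.5, 0.4]]
best, cost = identify(frames, speaker_gmms)
print(best, cost)
```

Every frame is scored against every mixture of every enrolled speaker, which is the product that a faster method has to cut down.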

Background Subtraction based on GMM for Night-time Video Surveillance (야간 영상 감시를 위한 GMM기반의 배경 차분)

Yeo, Jung Yeon;Lee, Guee Sang
- Smart Media Journal, v.4 no.3, pp.50-55, 2015
In this paper, we present a background modeling method based on the Gaussian mixture model (GMM) for background subtraction in night-time video surveillance. In night-time video, it is hard to distinguish objects from the background because background pixels are similar to object pixels. To address this, a preprocessing step applies scaled histogram stretching to the pixels of each input frame, mapping them to values better suited to building the Gaussian mixture model. Using the scaled pixel values, we then apply the GMM to estimate the background pixel by pixel. When a pixel of the next frame does not match any Gaussian, the matching test in the conventional GMM method discards stored background information by eliminating the Gaussian distribution with low weight. We instead retain the accumulated data by applying the difference between the old mean and the new pixel intensity to the new mean, rather than removing the low-weight Gaussian. Experiments demonstrate the effectiveness of the proposed background modeling method.
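
The preprocessing step mentioned in the abstract can be illustrated with plain min-max histogram stretching. This is only a sketch: the abstract does not specify how its "scaled" variant differs, so the function name `stretch`, its parameters, and the output range are assumptions.

```python
def stretch(pixels, out_min=0, out_max=255):
    """Linearly map pixel intensities so they span [out_min, out_max].

    Plain min-max contrast stretching, shown for illustration only;
    the paper's scaled variant may differ.
    """
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat frame has no contrast to stretch.
        return [out_min for _ in pixels]
    scale = (out_max - out_min) / (hi - lo)
    return [round(out_min + (p - lo) * scale) for p in pixels]

# Hypothetical dark frame whose intensities occupy only a narrow band.
dark_row = [10, 20, 50]
print(stretch(dark_row))
```

Spreading a narrow band of dark intensities over the full range increases the separation between object and background pixels before the mixture model is fitted.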

Context Recognition Using Environmental Sound for Client Monitoring System (피보호자 모니터링 시스템을 위한 환경음 기반 상황 인식)

Ji, Seung-Eun;Jo, Jun-Yeong;Lee, Chung-Keun;Oh, Siwon;Kim, Wooil
- Journal of the Korea Institute of Information and Communication Engineering, v.19 no.2, pp.343-350, 2015
This paper presents a context recognition method using environmental sound signals, which is applied to a mobile-based client monitoring system. Seven acoustic contexts are defined and the corresponding environmental sound signals are obtained for the experiments. To evaluate the performance of the context recognition, MFCC and LPCC are employed for feature extraction, and statistical pattern recognition is performed using GMM and HMM as acoustic models. The experimental results show that LPCC and HMM yield higher context recognition accuracy than MFCC and GMM, respectively. The recognition system using LPCC and HMM achieves a recognition accuracy of 96.03%. These results indicate that LPCC is effective at representing environmental sounds, which contain more varied frequency components than human speech. They also indicate that HMM models time-varying environmental sounds more effectively than GMM.
https://doi.org/10.6109/jkiice.2015.19.2.343

Efficient Speaker Identification based on Robust VQ-PCA (강인한 VQ-PCA에 기반한 효율적인 화자 식별)

Lee, Ki-Yong
- Journal of Internet Computing and Services, v.5 no.3, pp.57-62, 2004
In this paper, an efficient speaker identification method based on robust vector quantization-principal component analysis (VQ-PCA) is proposed to address the problems caused by outliers and the high dimensionality of training feature vectors in speaker identification. First, the proposed method partitions the data space into several disjoint regions by robust VQ based on M-estimation. Second, the robust PCA is obtained from the covariance matrix in each region. Finally, our method obtains the Gaussian Mixture Model (GMM) for each speaker from the feature vectors transformed to reduced dimension by the robust PCA in each region. Compared to the conventional GMM with diagonal covariance matrix, at the same performance, the proposed method gives faster results with less storage and, moreover, is robust to outliers.

Generalized methods of moments in marginal models for longitudinal data with time-dependent covariates

Cho, Gyo-Young;Dashnyam, Oyunchimeg
- Journal of the Korean Data and Information Science Society, v.24 no.4, pp.877-883, 2013
The quadratic inference functions (QIF) method proposed by Qu et al. (2000) and the generalized method of moments (GMM) for marginal regression analysis of longitudinal data with time-dependent covariates proposed by Lai and Small (2007) are both based on the generalized method of moments (GMM) introduced by Hansen (1982), and both use generalized estimating equations (GEE). Lai and Small (2007) divided time-dependent covariates into three types: Type I, Type II and Type III. In this paper, we compare these methods for Type II and Type III covariates, for which the full covariates conditional mean (FCCM) assumption is violated, and examine whether they can improve on the results of GEE with independence working correlation. We show that in the marginal regression model with Type II time-dependent covariates, the GMM Type II method of Lai and Small (2007) provides more efficient results than QIF, and that for Type III time-dependent covariates, QIF with independence working correlation and the GMM Type III method provide the same results. Our simulation study showed the same results.
https://doi.org/10.7465/jkdi.2013.24.4.877

Speaker Identification Using PCA Fuzzy Mixture Model (PCA 퍼지 혼합 모델을 이용한 화자 식별)

Lee, Ki-Yong
- Speech Sciences, v.10 no.4, pp.149-157, 2003
In this paper, we propose the principal component analysis (PCA) fuzzy mixture model for speaker identification. A PCA fuzzy mixture model is derived by combining PCA with the fuzzy version of the mixture model with diagonal covariance matrices. In this method, the feature vectors are first transformed by each speaker's PCA transformation matrix to reduce the correlation among the elements. Then, the fuzzy mixture model for each speaker is obtained from these transformed feature vectors with reduced dimensions. The orthogonal Gaussian Mixture Model (GMM) can be derived as a special case of the PCA fuzzy mixture model. In our experiments, with the same number of mixtures, the proposed method requires less training time and less storage, and shows a better speaker identification rate than the conventional GMM. The proposed method also shows identification performance equal to or better than that of the orthogonal GMM.

Hybrid Method using Frame Selection and Weighting Model Rank to improve Performance of Real-time Text-Independent Speaker Recognition System based on GMM (GMM 기반 실시간 문맥독립화자식별시스템의 성능향상을 위한 프레임선택 및 가중치를 이용한 Hybrid 방법)

김민정;석수영;김광수;정호열;정현열
- Journal of Korea Multimedia Society, v.5 no.5, pp.512-522, 2002
In this paper, we propose a hybrid method that combines frame selection with a weighting model rank method, based on the Gaussian mixture model (GMM), for a real-time text-independent speaker recognition system. In the system, maximum likelihood estimation is used for GMM parameter optimization, and maximum likelihood is used for recognition. The proposed hybrid method has two steps. First, likelihood scores are calculated at the frame level from the speaker models and the test data, and the difference between the largest and the second-largest likelihood values is computed; the frame is selected if this difference exceeds a threshold. Second, a weighting value is used in place of the calculated likelihood when computing the total score over the selected frames. Cepstral coefficients and regression coefficients were used as feature parameters. The database for training and testing consists of data collected at different times, and the data for the experiments were selected randomly. In the experiments, we applied each method to the baseline system and tested it. In speaker recognition experiments, the proposed hybrid method achieved on average 4% higher recognition accuracy than the frame selection method and 1% higher than the W method, indicating its effectiveness.
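
The two steps in this abstract can be sketched as follows. The abstract does not give the weighting values, so the `1 / rank` weight used here is a hypothetical choice, as are the function name `hybrid_identify` and the toy scores.

```python
def hybrid_identify(frame_logliks, threshold):
    """Sketch of frame selection followed by rank-based weighting.

    frame_logliks: one dict per frame mapping speaker -> log-likelihood.
    Returns (best speaker or None, number of selected frames).
    """
    totals = {}
    selected = 0
    for scores in frame_logliks:
        ranked = sorted(scores, key=scores.get, reverse=True)
        if len(ranked) < 2:
            continue
        # Step 1: keep the frame only if the best model clearly beats the runner-up.
        margin = scores[ranked[0]] - scores[ranked[1]]
        if margin <= threshold:
            continue
        selected += 1
        # Step 2: accumulate a rank-based weight instead of the raw likelihood.
        # The 1/rank weight is an assumption, not taken from the paper.
        for rank, speaker in enumerate(ranked, start=1):
            totals[speaker] = totals.get(speaker, 0.0) + 1.0 / rank
    if not totals:
        return None, selected
    return max(totals, key=totals.get), selected

# Hypothetical per-frame scores for three enrolled speakers.
frames = [
    {"A": -1.0, "B": -5.0, "C": -6.0},
    {"A": -3.0, "B": -2.9, "C": -7.0},  # ambiguous frame, expected to be dropped
    {"A": -2.0, "B": -4.5, "C": -4.0},
]
print(hybrid_identify(frames, threshold=1.0))
```

Dropping ambiguous frames reduces the work per utterance, and rank weights stop a single extreme likelihood from dominating the total score.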

Identification of Superior Single Nucleotide Polymorphisms (SNP) Combinations Related to Economic Traits by Genotype Matrix Mapping (GMM) in Hanwoo (Korean Cattle)

Lee, Yoon-Seok;Oh, Dong-Yep;Lee, Yong-Won;Yeo, Jung-Sou;Lee, Jea-Young
- Asian-Australasian Journal of Animal Sciences, v.24 no.11, pp.1504-1513, 2011
It is important to identify genetic interactions related to human diseases or animal traits. Many linear statistical models have been reported, but they do not consider genetic interactions. Genotype matrix mapping (GMM) was developed to identify genetic interactions. This study uses the GMM method to detect superior SNP combinations of the CCDC158 gene that influence average daily gain, marbling score, cold carcass weight and longissimus dorsi muscle area in Hanwoo. We evaluated the statistical significance of the selected major SNP combinations with a permutation test of the F-measure. The g.34425+102 A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) SNP combinations produced higher performance in average daily gain, marbling score, cold carcass weight and longissimus dorsi muscle area than a single SNP. GMM is a fast and reliable method for multiple SNP analysis with potential application in marker-assisted selection. GMM may prospectively be used for genetic assessment of quantitative traits after further development.
https://doi.org/10.5713/ajas.2011.11112
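
The permutation test mentioned in this abstract follows a general pattern that can be sketched as below. This sketch uses the absolute difference in group means as the test statistic, not the F-measure of genotype matrix mapping, which the abstract does not define. The function name, parameters, and trait values are hypothetical.

```python
import random

def permutation_p_value(values, carries, n_perm=2000, seed=0):
    """Estimate a p-value by shuffling which animals carry the SNP combination.

    values:  trait measurements, one per animal.
    carries: booleans, True if the animal carries the combination.
    The statistic is the absolute difference in group means, an
    illustrative stand-in for the paper's F-measure.
    """
    rng = random.Random(seed)

    def statistic(labels):
        with_combo = [v for v, c in zip(values, labels) if c]
        without = [v for v, c in zip(values, labels) if not c]
        return abs(sum(with_combo) / len(with_combo) - sum(without) / len(without))

    observed = statistic(carries)
    labels = list(carries)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # shuffling keeps both group sizes unchanged
        if statistic(labels) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical average-daily-gain values for ten animals.
gain = [1.0, 1.1, 0.9, 1.2, 1.0, 2.0, 2.1, 1.9, 2.2, 2.0]
has_combo = [False] * 5 + [True] * 5
print(permutation_p_value(gain, has_combo))
```

A small p-value means that random relabelling rarely reproduces a difference as large as the observed one.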

Effective Recognition of Velopharyngeal Insufficiency (VPI) Patient's Speech Using DNN-HMM-based System (DNN-HMM 기반 시스템을 이용한 효과적인 구개인두부전증 환자 음성 인식)

Yoon, Ki-mu;Kim, Wooil
- Journal of the Korea Institute of Information and Communication Engineering, v.23 no.1, pp.33-38, 2019
This paper proposes an effective method for recognizing the speech of VPI patients using a DNN-HMM-based speech recognition system, and evaluates its recognition performance against a GMM-HMM-based system. The proposed method employs a speaker adaptation technique to improve VPI speech recognition. To make effective use of the small amount of VPI speech available for model adaptation, this paper proposes using simulated VPI speech to generate a prior model for speaker adaptation, together with selective learning of the DNN weight matrices. We also apply a Linear Input Network (LIN) based model adaptation technique to the DNN model. The proposed speaker adaptation method brings a 2.35% improvement in average accuracy compared to the GMM-HMM-based ASR system. The experimental results demonstrate that the proposed DNN-HMM-based speech recognition system is effective for VPI speech with small-sized speech data, compared to the conventional GMM-HMM system.
https://doi.org/10.6109/jkiice.2019.23.1.33

Analysis and Implementation of Speech/Music Classification for 3GPP2 SMV Based on GMM (3GPP2 SMV의 실시간 음성/음악 분류 성능 향상을 위한 Gaussian Mixture Model의 적용)

Song, Ji-Hyun;Lee, Kye-Hwan;Chang, Joon-Hyuk
- The Journal of the Acoustical Society of Korea, v.26 no.8, pp.390-396, 2007
In this letter, we propose a novel approach to improve the performance of speech/music classification for the selectable mode vocoder (SMV) of 3GPP2 using the Gaussian mixture model (GMM), which is based on the expectation-maximization (EM) algorithm. We first present an analysis of the features and the classification method adopted in the conventional SMV. Feature vectors applied to the GMM are then selected from relevant parameters of the SMV for efficient speech/music classification. The performance of the proposed algorithm is evaluated under various conditions and yields better results than the conventional SMV scheme.
https://doi.org/10.7776/ASK.2007.26.8.390

