• Title/Summary/Keyword: System GMM Estimation


Hybrid Method using Frame Selection and Weighting Model Rank to improve Performance of Real-time Text-Independent Speaker Recognition System based on GMM (GMM 기반 실시간 문맥독립화자식별시스템의 성능향상을 위한 프레임선택 및 가중치를 이용한 Hybrid 방법)

  • 김민정;석수영;김광수;정호열;정현열
    • Journal of Korea Multimedia Society / v.5 no.5 / pp.512-522 / 2002
  • In this paper, we propose a hybrid method that combines frame selection with the weighting model rank method, based on the Gaussian mixture model (GMM), for a real-time text-independent speaker recognition system. In the system, maximum likelihood estimation was used for GMM parameter optimization, and maximum likelihood was used as the basic recognition criterion. The proposed hybrid method has two steps. First, the likelihood score is calculated from the speaker models and the test data at the frame level, and the difference between the largest and second-largest likelihood values is computed. The frame is selected if this difference exceeds a threshold. Second, at each selected frame, a weighting value is used instead of the calculated likelihood to compute the total score. Cepstrum coefficients and regression coefficients were used as feature parameters. The database for training and testing consists of data collected at different times, and the data for the experiments were selected randomly. In the experiments, we applied each method to the baseline system and tested it. In the speaker recognition experiments, the proposed hybrid method achieved on average 4% higher recognition accuracy than the frame selection method and 1% higher than the W method, demonstrating its effectiveness.
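
The two steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `hybrid_scores`, the input layout, and in particular the linear rank-to-weight mapping are assumptions, since the abstract does not state how the weighting values are defined.

```python
from typing import Dict, List


def hybrid_scores(frame_loglik: Dict[str, List[float]], threshold: float) -> Dict[str, float]:
    """Accumulate rank-based weights over the frames that pass frame selection.

    frame_loglik maps each speaker to its per-frame log-likelihoods
    (all lists have the same length; at least two speakers are required).
    """
    speakers = list(frame_loglik)
    if len(speakers) < 2:
        raise ValueError("need at least two speaker models")
    n_speakers = len(speakers)
    n_frames = len(frame_loglik[speakers[0]])
    totals = {s: 0.0 for s in speakers}

    for t in range(n_frames):
        # Rank speakers by their likelihood on this frame, best first.
        ranked = sorted(speakers, key=lambda s: frame_loglik[s][t], reverse=True)
        best, second = ranked[0], ranked[1]

        # Step 1 (frame selection): keep the frame only if the gap between
        # the best and second-best likelihood exceeds the threshold.
        if frame_loglik[best][t] - frame_loglik[second][t] <= threshold:
            continue

        # Step 2 (weighting): add a weight that depends on rank only.
        # ASSUMPTION: a linear weight (n - rank) / n; the paper's actual
        # weighting scheme is not given in the abstract.
        for rank, s in enumerate(ranked):
            totals[s] += (n_speakers - rank) / n_speakers

    return totals
```

The recognized speaker would then be the key with the largest total, for example `max(totals, key=totals.get)`. If no frame passes the threshold, all totals stay at zero and the decision is arbitrary, so a real system would need a fallback.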


Performance Comparison of GMM and HMM Approaches for Bandwidth Extension of Speech Signals (음성신호의 대역폭 확장을 위한 GMM 방법 및 HMM 방법의 성능평가)

  • Song, Geun-Bae;Kim, Austin
    • The Journal of the Acoustical Society of Korea / v.27 no.3 / pp.119-128 / 2008
  • This paper analyzes the relationship between two representative statistical methods for bandwidth extension (BWE), the Gaussian mixture model (GMM) method and the hidden Markov model (HMM) method, and compares their performance. The HMM method is a memory-based system that was developed to take advantage of the inter-frame dependency of speech signals. It could therefore be expected to estimate the frame-to-frame transitional information of the original spectra better. To verify this, a dynamic measure, which approximates the first-order derivative of the spectral function over time, was introduced in addition to a static measure. The comparison shows that the two methods are similar in the static measure, while in the dynamic measure the HMM method clearly outperforms the GMM method. Moreover, this difference increases in proportion to the number of states of the HMM. This indicates that the HMM method would be more appropriate, at least for the 'blind BWE' problem. Nevertheless, the GMM method could be a preferable alternative to the HMM method in applications where static performance and algorithm complexity are critical.
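
To make the distinction between the static and dynamic measures concrete, here is a small sketch. It assumes a root-mean-square distance and a simple first-order finite difference between consecutive frames; the abstract only says the dynamic measure approximates the first-order time derivative, so the exact definitions used in the paper may differ. All function names are illustrative.

```python
import math
from typing import List

Frames = List[List[float]]  # one inner list of spectral values per frame


def rms_distance(a: Frames, b: Frames) -> float:
    """Root-mean-square difference over all frames and spectral bins."""
    squared = [(x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb)]
    return math.sqrt(sum(squared) / len(squared))


def delta(frames: Frames) -> Frames:
    """First-order finite difference between consecutive frames."""
    return [[cur - prev for prev, cur in zip(f0, f1)]
            for f0, f1 in zip(frames, frames[1:])]


def static_measure(original: Frames, estimated: Frames) -> float:
    """Distance between the spectra themselves."""
    return rms_distance(original, estimated)


def dynamic_measure(original: Frames, estimated: Frames) -> float:
    """Distance between the frame-to-frame changes of the spectra."""
    return rms_distance(delta(original), delta(estimated))
```

An estimate that is offset from the original by a constant has a non-zero static measure but a dynamic measure of zero, which is why the two measures can rank methods differently.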

The Effect of Trade Agreements on Korea's Bilateral Trade Volume: Mitigating the Impact of Economic Uncertainty in Trading Countries

  • Heedae Park;Jiyoung An
    • Journal of Korea Trade / v.27 no.5 / pp.153-166 / 2023
  • Purpose - This research empirically analyzes the influence of economic policy uncertainty and free trade agreements (FTAs) on bilateral trade volumes between Korea and its trading partners. The study investigates whether fluctuations in the Economic Policy Uncertainty Index (EPUI) for both Korea and its trading partners significantly impact trade volumes and whether the implementation of FTAs mitigates these effects. Design/methodology - The study employs dynamic panel data analysis using the system generalized method of moments (system GMM) estimation method to achieve its research objectives. It utilizes country-month-level panel data, including the EPUI, trade volume between Korea and its trading partner countries, and other pertinent variables. The use of system GMM allows for the control of potential endogeneity issues and the incorporation of country-specific and time-specific effects. Findings - The analysis yields significant results regarding the impact of economic policy uncertainty on Korea's exports and imports, particularly before the implementation of FTAs. An increase in the EPUI of trading partners leads to a notable increase in Korea's exports to them. Conversely, an increase in Korea's EPUI negatively affects its imports from trading partners. However, post-FTA implementation, the influence of each country's EPUI on trade volume is neutralized, with no significant difference observed. Originality/value - This research contributes to the existing literature by providing empirical evidence on the interaction effects between economic policy uncertainty and FTAs on bilateral trade volumes. The study's uniqueness lies in its examination of how FTAs mitigate the impact of economic uncertainty on trade relations between countries. The findings underscore the importance of trade agreements as mechanisms to address economic risks and promote international trade relations. In a world where global market uncertainties persist, these insights can aid policymakers in Korea and other countries in enhancing their trade cooperation strategies and navigating challenges posed by evolving economic landscapes.
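
For orientation, the generic dynamic panel model that system GMM is usually applied to can be written as below. This is the standard textbook form (Arellano-Bover / Blundell-Bond), not the authors' exact specification; the symbols are assumptions introduced here: $y_{it}$ is the outcome (for example log trade volume) for country $i$ in month $t$, $x_{it}$ collects regressors such as the EPUI, $\eta_i$ and $\lambda_t$ are the country-specific and time-specific effects, and $\varepsilon_{it}$ is the idiosyncratic error.

```latex
\begin{aligned}
y_{it} &= \alpha\, y_{i,t-1} + x_{it}'\beta + \eta_i + \lambda_t + \varepsilon_{it}, \\
\text{difference equation:}\quad
  & \mathbb{E}\!\left[\, y_{i,t-s}\,\Delta\varepsilon_{it} \,\right] = 0, \qquad s \ge 2, \\
\text{level equation:}\quad
  & \mathbb{E}\!\left[\, \Delta y_{i,t-1}\,(\eta_i + \varepsilon_{it}) \,\right] = 0 .
\end{aligned}
```

The "system" stacks both sets of moment conditions: lagged levels instrument the first-differenced equation, and lagged differences instrument the level equation, which is how the lagged dependent variable's endogeneity is handled.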

Factors Affecting Liquidity Risks of Joint Stock Commercial Banks in Vietnam

  • NGUYEN, Hoang Chung
    • The Journal of Asian Finance, Economics and Business / v.9 no.4 / pp.197-212 / 2022
  • The study uses the audited financial statements of 26 Vietnamese commercial banks listed on the Ho Chi Minh City Stock Exchange (HOSE) and the Hanoi Stock Exchange (HNX) during the 2008-2018 period to estimate a system GMM model, which provides empirical evidence on the effect of the customer deposit to total assets (DEPO) ratio, the loan to assets (LTA) ratio, the liquidity of commercial banks (LIQ), the credit development (CRD) ratio, the external funding (EFD) ratio, and the credit loss provision (LLP) ratio on liquidity risk. The study confirms that commercial banks' internal factors play the most important role, and it finds no empirical evidence that macro variables affect liquidity risk. Finally, in accordance with the theoretical framework, the study uses an estimation method with the R language and the bootstrap methodology to provide empirical evidence of the nonlinear, U-shaped relationship between commercial bank size and liquidity risk. The results demonstrate the importance of commercial bank size in absorbing and moderating the effects of liquidity shocks; however, excessive growth in bank size would increase liquidity risk in commercial bank operations.

Realization a Text Independent Speaker Identification System with Frame Level Likelihood Normalization (프레임레벨유사도정규화를 적용한 문맥독립화자식별시스템의 구현)

  • 김민정;석수영;김광수;정현열
    • Journal of the Institute of Convergence Signal Processing / v.3 no.1 / pp.8-14 / 2002
  • In this paper, we implemented a real-time text-independent speaker recognition system using the Gaussian mixture model and applied the frame-level likelihood normalization method, which has proven effective in verification systems. The system has three parts: front-end, training, and recognition. In the front-end, cepstral mean normalization and silence removal were applied to account for variations in a speaker's speech. In training, the Gaussian mixture model was used to model each speaker's acoustic features, and maximum likelihood estimation was used for GMM parameter optimization. In recognition, the likelihood score was calculated from the speaker models and the test data at the frame level. Text-independent sentences were used as test sentences. The ETRI 445 and KLE 452 databases were used for training and testing, and cepstrum coefficients and regression coefficients were used as feature parameters. The experimental results show that the frame-level likelihood normalization method achieves a higher recognition rate than the conventional method, regardless of the number of registered speakers.
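
A frame-level normalization step of this kind might look like the sketch below. The abstract does not give the normalization formula, so this assumes one simple variant: each frame's likelihood is divided by the sum of that frame's likelihoods over all registered speakers (done in the log domain). The function names are illustrative.

```python
import math
from typing import Dict, List


def log_sum_exp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def normalized_scores(frame_loglik: Dict[str, List[float]]) -> Dict[str, float]:
    """Sum per-frame log-likelihoods after normalizing each frame.

    ASSUMPTION: the normalizer is the total likelihood of the frame over
    all registered speakers; the paper may use a different normalizer.
    """
    speakers = list(frame_loglik)
    n_frames = len(frame_loglik[speakers[0]])
    scores = {s: 0.0 for s in speakers}
    for t in range(n_frames):
        norm = log_sum_exp([frame_loglik[s][t] for s in speakers])
        for s in speakers:
            scores[s] += frame_loglik[s][t] - norm
    return scores
```

With this normalizer, a frame on which all speakers score equally contributes the same amount to everyone, so only frames that discriminate between speakers change the ranking.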


Improving A Text Independent Speaker Identification System By Frame Level Likelihood Normalization (프레임단위유사도정규화를 이용한 문맥독립화자식별시스템의 성능 향상)

  • 김민정;석수영;정현열;정호열
    • Proceedings of the IEEK Conference / 2001.09a / pp.487-490 / 2001
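  • In this paper, to improve the performance of an existing real-time text-independent speaker recognition system based on the Gaussian mixture model, we apply likelihood normalization, a method that gives good results in speaker verification systems, to a speaker identification system, and we report the results of recognition experiments with the implemented system. The system consists of a speaker model generation stage and a speaker identification stage. In the model generation stage, speaker models are built with a GMM (Gaussian Mixture Model), which represents the acoustic characteristics of a speaker's utterances well, and MLE (Maximum Likelihood Estimation) is used to optimize the GMM parameters. In the identification stage, likelihoods are computed frame by frame from the trained data and the test data using ML (Maximum Likelihood). The computed likelihoods pass through likelihood normalization and are expressed as a score (SC), and the speaker with the highest score is chosen as the recognized speaker. Text-independent sentences were used as utterances for speaker recognition. The ETRI445 DB and KLE452 DB were used for the recognition experiments, and only cepstrum coefficients and regression coefficients were used as feature parameters. The experiments were run with varying numbers of registered speakers, using both the conventional speaker identification method and the frame-level likelihood normalization method. The results show that frame-level likelihood normalization achieves a higher recognition rate than the conventional method as the number of speakers increases.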


A Neuro-Fuzzy System Modeling using Gaussian Mixture Model and Clustering Method (GMM과 클러스터링 기법에 의한 뉴로-퍼지 시스템 모델링)

  • Kim, Sung-Suk;Kwak, Keun-Chang;Ryu, Jeong-Woong;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.12 no.6 / pp.571-576 / 2002
  • There have been many studies on improving the performance of neuro-fuzzy systems. Studies on neuro-fuzzy modeling have largely been devoted to two approaches: the first improves the performance index of the system, and the other reduces the structure size. Despite their satisfactory results, these approaches are difficult to extend to high-dimensional input or to a larger number of membership functions. We propose a novel neuro-fuzzy system based on an efficient clustering method for initializing the parameters of the premise part. The method keeps the number of rules small while improving performance, and it combines several algorithms to do so. The Expectation-Maximization algorithm for the Gaussian mixture model is an efficient method for estimating the unknown parameters of a mixture model. The obtained parameters are used for the fuzzy clustering method. The proposed method satisfies these two requirements using the Gaussian mixture model and neuro-fuzzy modeling. Experimental results indicate that the proposed method gives reliable performance.
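
As a reminder of what the Expectation-Maximization step estimates, here is a minimal one-dimensional sketch. It is not the paper's neuro-fuzzy model: it only fits mixture weights, means, and variances, which the abstract says are then used to initialize the fuzzy clustering. The function name `em_gmm_1d`, the unit initial variances, and the variance floor are assumptions made for this illustration.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(
    data: Sequence[float],
    init_means: Sequence[float],
    n_iter: int = 50,
    var_floor: float = 1e-6,
) -> Tuple[List[float], List[float], List[float]]:
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k          # assumed starting variances
    weights = [1.0 / k] * k        # start from equal mixture weights

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])

        # M-step: re-estimate parameters from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)  # keep variances positive

    return weights, means, variances
```

In a setup like the one described, the fitted means and variances would be natural starting values for Gaussian membership functions in the premise part, although the abstract does not spell out that mapping. The sketch has no guard against points so far from every component that all densities underflow to zero.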

Height Estimation of pedestrian based on image (영상기반 보행자 키 추정 방법)

  • Kim, Sung-Min;Song, Jong-Kwan;Yoon, Byung-Woo;Park, Jang-Sik
    • The Journal of the Korea institute of electronic communication sciences / v.9 no.9 / pp.1035-1042 / 2014
  • Object recognition is one of the key technologies of monitoring systems for preventing various sophisticated crimes. Height is a piece of physical information about a person and may be important for identifying that person. In this paper, we propose a method that detects pedestrians in CCTV images and estimates the height of the detected objects. The method uses a GMM (Gaussian Mixture Model) to separate moving objects from the background and detects pedestrians using conditions such as the width-height ratio and the size of the candidate objects. The proposed method was applied to CCTV video, the height of the same person was estimated at far, middle, and near distances, and the accuracy was evaluated. Experimental results showed that the proposed method can estimate pedestrian height with an accuracy of 97% at short range, 98% at medium range, and more than 97% at far range. The image size of the same pedestrian differs with the pedestrian's position in the image, and the results show that the proposed algorithm can estimate pedestrian height effectively at various positions.
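
The candidate-filtering condition mentioned in the abstract could be expressed as a small predicate like the one below. It assumes the foreground regions have already been extracted (for example by GMM background subtraction) and are available as bounding boxes. The `Box` type and all threshold values are hypothetical; the abstract does not report the ratios or sizes actually used.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box of a foreground region, in pixels."""
    x: int
    y: int
    width: int
    height: int


def is_pedestrian_candidate(
    box: Box,
    min_area: int = 400,       # hypothetical minimum size in pixels
    min_ratio: float = 0.25,   # hypothetical lower bound on width / height
    max_ratio: float = 0.60,   # hypothetical upper bound on width / height
) -> bool:
    """Keep regions that are large enough and taller than they are wide."""
    if box.width <= 0 or box.height <= 0:
        return False
    ratio = box.width / box.height
    area = box.width * box.height
    return area >= min_area and min_ratio <= ratio <= max_ratio
```

A standing person typically produces a tall, narrow region, so a wide box (a car, for instance) or a very small blob (noise) would be rejected before any height estimate is attempted.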

Safety Robust Speaker Recognition Against Utterance Variations (발성변화에 강인한 화자 인식에 관한 연구)

  • Lee Ki-Yong
    • Journal of Internet Computing and Services / v.5 no.2 / pp.69-73 / 2004
  • A speaker model in a speaker recognition system should be trained from a large data set gathered in multiple sessions. A large data set requires a large amount of memory and computation, and it is also practically hard to make users utter the data in several sessions. Incremental adaptation methods have recently been proposed to address these problems. However, a data set gathered from multiple sessions is vulnerable to outliers caused by irregular utterance variations and the presence of noise, which result in an inaccurate speaker model. In this paper, we propose an incremental robust adaptation method that minimizes the influence of outliers on a Gaussian mixture model based speaker model. The robust adaptation is obtained from an incremental version of M-estimation. The speaker model is initially trained from a small amount of data and is then adapted recursively with the data available in each session. Experimental results on a data set gathered over seven months show that the proposed method is robust against outliers.
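
The idea of an incremental M-estimate can be illustrated on the simplest possible case, the recursive update of a single mean. This sketch uses Huber's weight function with a fixed scale, which is an assumption: the abstract does not say which M-estimator is used, and the paper adapts full GMM speaker models rather than one scalar. The class name `RobustRunningMean` is hypothetical.

```python
def huber_weight(residual: float, c: float = 1.345) -> float:
    """Huber weight: 1 inside the threshold c, c / |residual| outside it."""
    a = abs(residual)
    return 1.0 if a <= c else c / a


class RobustRunningMean:
    """Recursively updated mean in which outliers receive reduced weight."""

    def __init__(self, mean: float, scale: float,
                 weight_sum: float = 1.0, c: float = 1.345) -> None:
        self.mean = mean              # initial estimate, e.g. from a small first session
        self.scale = scale            # assumed fixed spread used to standardize residuals
        self.weight_sum = weight_sum  # accumulated weight of the data seen so far
        self.c = c

    def update(self, x: float) -> float:
        """Adapt the mean with one new observation; returns the weight used."""
        w = huber_weight((x - self.mean) / self.scale, self.c)
        self.weight_sum += w
        self.mean += (w / self.weight_sum) * (x - self.mean)
        return w
```

Because each observation only touches the running mean and the accumulated weight, earlier sessions do not have to be stored, and an observation far from the current estimate moves the mean much less than it would in an ordinary running average.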


Noise Elimination Using Improved MFCC and Gaussian Noise Deviation Estimation

  • Sang-Yeob, Oh
    • Journal of the Korea Society of Computer and Information / v.28 no.1 / pp.87-92 / 2023
  • With the continuous development of speech recognition systems, the recognition rate for speech has improved rapidly, but such systems still cannot recognize a voice accurately when various voices are mixed with noise in the use environment. To increase the vocabulary recognition rate when processing speech with environmental noise, the noise must be removed. Even with existing HMM, CHMM, GMM, and DNN-based AI models, unexpected noise occurs, and quantization noise is inherently added to the digital signal. When this happens, the source signal is altered or corrupted, which lowers the recognition rate. To solve this problem, the MFCC was improved and applied so that the features of the speech signal can be extracted efficiently for each voice frame. To remove the noise from the speech signal, the noise removal method based on a Gaussian model with noise deviation estimation was improved and applied. The performance of the proposed model was evaluated using a cross-correlation coefficient to assess the accuracy of the speech. The evaluation of the recognition rate of the proposed method confirmed that the average value of the correlation coefficient improved by 0.53 dB.
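
An evaluation of this kind compares a reference signal with the processed signal. The sketch below computes the zero-lag normalized correlation coefficient (Pearson's r) between two equal-length sample sequences; whether the paper uses exactly this form, and how it converts the result to dB, is not stated in the abstract, so treat it as an assumption.

```python
import math
from typing import Sequence


def correlation_coefficient(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-lag normalized cross-correlation (Pearson's r) of two signals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("signals must have the same length (at least 2 samples)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_x * var_y)
```

A value close to 1 means the denoised signal follows the clean reference closely, independent of any overall gain difference between the two.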