• Title/Summary/Keyword: System GMM Model

Search results: 143

Semi-supervised domain adaptation using unlabeled data for end-to-end speech recognition (라벨이 없는 데이터를 사용한 종단간 음성인식기의 준교사 방식 도메인 적응)

  • Jeong, Hyeonjae;Goo, Jahyun;Kim, Hoirin
    • Phonetics and Speech Sciences / v.12 no.2 / pp.29-37 / 2020
  • Recently, neural network-based deep learning algorithms have dramatically improved performance over the classical Gaussian mixture model-based hidden Markov model (GMM-HMM) automatic speech recognition (ASR) system. In addition, research on end-to-end (E2E) speech recognition systems that integrate language modeling and decoding has been actively conducted to better exploit the advantages of deep learning. In general, E2E ASR systems consist of a multi-layer encoder-decoder structure with attention. Therefore, they require a large amount of speech-text paired data to achieve good performance. Obtaining such paired data requires a great deal of human labor and time, and is a high barrier to building an E2E ASR system. Previous studies have improved the performance of E2E ASR systems using a relatively small amount of speech-text paired data, but most used only speech-only data or only text-only data. In this study, we propose a semi-supervised training method that enables an E2E ASR system to perform well on corpora from different domains by using both speech-only and text-only data. The proposed method adapts effectively to different domains, showing good performance in the target domain without degrading much in the source domain.

Livestock Theft Detection System Using Skeleton Feature and Color Similarity (골격 특징 및 색상 유사도를 이용한 가축 도난 감지 시스템)

  • Kim, Jun Hyoung;Joo, Yung Hoon
    • The Transactions of The Korean Institute of Electrical Engineers / v.67 no.4 / pp.586-594 / 2018
  • In this paper, we propose a livestock theft detection system based on moving object classification and tracking. First, we extract moving objects using a GMM (Gaussian Mixture Model) and RGB background modeling. Second, we apply morphological operations to remove shadows and noise, and recognize moving objects through labeling. Third, the recognized moving objects are classified as human or livestock using skeletal features and color similarity. Fourth, the classified moving objects are tracked with CAMShift (Continuously Adaptive Mean Shift) and a Kalman filter, overlap between them is evaluated, and the risk is assessed to generate a notification. Finally, several experiments demonstrate the feasibility and applicability of the proposed method.

Classification of Pornographic Videos Based on the Audio Information (오디오 신호에 기반한 음란 동영상 판별)

  • Kim, Bong-Wan;Choi, Dae-Lim;Lee, Yong-Ju
    • MALSORI / no.63 / pp.139-151 / 2007
  • As the Internet becomes prevalent in our lives, harmful content such as pornographic videos has been increasing online, which has become a very serious problem. To counter this, many filtering systems exist, mainly based on keyword- or image-based methods. The main purpose of this paper is to devise a system that classifies pornographic videos based on audio information. As the feature vector, we use the mel-frequency cepstral coefficients (MFCC) together with the mel-cepstrum modulation energy (MCME), a modulation energy calculated on the time trajectory of the MFCC. For the classifier, we use the well-known Gaussian mixture model (GMM). The experimental results showed that the proposed system correctly classified 98.3% of pornographic data and 99.8% of non-pornographic data. We expect that the proposed method can be applied to a more accurate classification system that uses both video and audio information.
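The GMM classification step in this abstract can be illustrated with a minimal sketch. It assumes scalar features and hand-picked mixture parameters for readability; the paper uses MFCC and MCME feature vectors and trained models. The names `gmm_log_likelihood`, `classify`, and `MODELS`, and all numbers in it, are hypothetical.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture."""
    terms = []
    for w, mu, var in zip(weights, means, variances):
        log_pdf = -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        terms.append(math.log(w) + log_pdf)
    # Log-sum-exp keeps the sum stable when the terms are very negative.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify(frames, class_models):
    """Return the label whose GMM gives the highest total log-likelihood."""
    scores = {
        label: sum(gmm_log_likelihood(x, *params) for x in frames)
        for label, params in class_models.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy models: (weights, means, variances) per class.
MODELS = {
    "class_a": ([0.5, 0.5], [-2.0, 0.0], [1.0, 1.0]),
    "class_b": ([0.5, 0.5], [3.0, 5.0], [1.0, 1.0]),
}
```

Each class gets its own mixture, and a clip is assigned to the class whose mixture explains its frames best. Summing per-frame log-likelihoods treats frames as independent, which is the usual simplification for this kind of classifier.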

Performance assessments of feature vectors and classification algorithms for amphibian sound classification (양서류 울음 소리 식별을 위한 특징 벡터 및 인식 알고리즘 성능 분석)

  • Park, Sangwook;Ko, Kyungdeuk;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.36 no.6 / pp.401-406 / 2017
  • This paper presents a performance assessment of several key algorithms for amphibian species sound classification. First, 9 target species, including endangered species, are defined and a database of their sounds is built. For the assessment, three feature vectors, MFCC (Mel Frequency Cepstral Coefficient), RCGCC (Robust Compressive Gammachirp filterbank Cepstral Coefficient), and SPCC (Subspace Projection Cepstral Coefficient), and three classifiers, GMM (Gaussian Mixture Model), SVM (Support Vector Machine), and DBN-DNN (Deep Belief Network - Deep Neural Network), are considered. In addition, an i-vector based classification system, which is widely used for speaker recognition, is evaluated on this task. Experimental results indicate that SPCC-SVM achieved the best performance at 98.81%, while the other methods also performed well, above 90%.

Analysis of Decoupling Phenomenon Between Economic Growth and GHG Emissions: Dynamic Panel Analysis of 63 Countries (1980~2014) (경제성장과 탄소배출량의 탈동조화 현상 분석: 63개국 동태패널분석(1980~2014년))

  • Lim, Hyungwoo;Jo, Ha-hyun
    • Environmental and Resource Economics Review / v.28 no.4 / pp.497-526 / 2019
  • The importance of "decoupling", maintaining economic growth while reducing greenhouse gases, is growing as the world has been required to reduce greenhouse gas emissions since the 2015 Paris Agreement. This study covers 63 countries from 1980 to 2014 and analyzes the main characteristics and causes of the decoupling phenomenon between economic growth and carbon emissions. The degree of decoupling was measured every five years. The analysis found that the decoupling rate of OECD countries and high-income countries was high, and that decoupling has accelerated worldwide since the 2000s. However, the degree of decoupling differed depending on national characteristics. According to the results of the dynamic panel model, the growth rate of manufacturing and the proportion of exports hampered decoupling, while the proportion of human capital and renewable energy had a positive effect on decoupling. In addition, income had an inverted U-shaped nonlinear effect on decoupling.

Detection and Recognition of Illegally Parked Vehicles Based on an Adaptive Gaussian Mixture Model and a Seed Fill Algorithm

  • Sarker, Md. Mostafa Kamal;Weihua, Cai;Song, Moon Kyou
    • Journal of information and communication convergence engineering / v.13 no.3 / pp.197-204 / 2015
  • In this paper, we present an algorithm for detecting illegally parked vehicles that combines several image processing algorithms. A digital camera is fixed in the illegal parking region to capture the video frames. An adaptive Gaussian mixture model (GMM) is used for background subtraction in a complex environment to identify the regions of moving objects in our test video. Stationary objects are detected by using pixel-level features in time sequences. A stationary vehicle is detected by using the local features of the object, and thus information about illegally parked vehicles is successfully obtained. An automatic alarm system can be configured according to the regulations of each illegal parking region. Results obtained on a test video sequence of a real-time traffic scene show that the proposed method is effective.
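The idea behind adaptive background subtraction can be sketched for a single grayscale pixel. This is a deliberate simplification and not the paper's method: it keeps one running Gaussian per pixel instead of an adaptive mixture, and the class name `PixelBackgroundModel` and all parameter values are assumptions chosen for illustration.

```python
class PixelBackgroundModel:
    """Running single-Gaussian background model for one grayscale pixel.

    Simplification: an adaptive GMM keeps several weighted Gaussians per
    pixel; this sketch keeps one to show the update-and-threshold idea.
    """

    def __init__(self, initial_value, learning_rate=0.05, threshold=2.5,
                 initial_variance=25.0, min_variance=4.0):
        self.mean = float(initial_value)
        self.variance = initial_variance
        self.learning_rate = learning_rate
        self.threshold = threshold        # match distance in standard deviations
        self.min_variance = min_variance  # floor so the model never collapses

    def update(self, value):
        """Return True if value looks like foreground; otherwise adapt."""
        diff = value - self.mean
        is_foreground = diff * diff > self.threshold ** 2 * self.variance
        if not is_foreground:
            rate = self.learning_rate
            self.mean += rate * diff
            self.variance = max(
                self.min_variance,
                (1 - rate) * self.variance + rate * diff * diff,
            )
        return is_foreground
```

A pixel whose value stays within a few standard deviations of the running mean is treated as background and slowly pulls the model toward itself, which lets the model follow gradual lighting changes. A full mixture model additionally handles backgrounds with several recurring appearances, such as swaying branches.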

Compromised feature normalization method for deep neural network based speech recognition (심층신경망 기반의 음성인식을 위한 절충된 특징 정규화 방식)

  • Kim, Min Sik;Kim, Hyung Soon
    • Phonetics and Speech Sciences / v.12 no.3 / pp.65-71 / 2020
  • Feature normalization is a method to reduce the effect of environmental mismatch between the training and test conditions through the normalization of statistical characteristics of acoustic feature parameters. It demonstrates excellent performance improvement in the traditional Gaussian mixture model-hidden Markov model (GMM-HMM)-based speech recognition system. However, in a deep neural network (DNN)-based speech recognition system, minimizing the effects of environmental mismatch does not necessarily lead to the best performance improvement. In this paper, we attribute the cause of this phenomenon to information loss due to excessive feature normalization. We investigate whether there is a feature normalization method that maximizes the speech recognition performance by properly reducing the impact of environmental mismatch, while preserving useful information for training acoustic models. To this end, we introduce the mean and exponentiated variance normalization (MEVN), which is a compromise between the mean normalization (MN) and the mean and variance normalization (MVN), and compare the performance of DNN-based speech recognition system in noisy and reverberant environments according to the degree of variance normalization. Experimental results reveal that a slight performance improvement is obtained with the MEVN over the MN and the MVN, depending on the degree of variance normalization.
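One plausible reading of the compromise described here is sketched below. The abstract does not give the MEVN formula, so the form `(x - mean) / std**alpha` is an assumption, as are the function name `mevn` and the parameter name `alpha`. Under this reading, `alpha = 0` reduces to MN and `alpha = 1` to MVN.

```python
import math


def mevn(features, alpha):
    """Normalize one feature track as (x - mean) / std**alpha.

    Assumed form, not taken from the paper:
    alpha = 0 gives mean normalization (MN),
    alpha = 1 gives mean and variance normalization (MVN),
    values in between normalize the variance only partially.
    """
    n = len(features)
    mean = sum(features) / n
    variance = sum((x - mean) ** 2 for x in features) / n
    std = math.sqrt(variance)
    # A constant track has zero spread; leave its scale untouched.
    scale = std ** alpha if std > 0 else 1.0
    return [(x - mean) / scale for x in features]
```

For example, `mevn([2.0, 4.0, 6.0, 8.0], 0)` only removes the mean, while `alpha = 1` also rescales the track to unit variance. Intermediate values keep part of the original variance, which is the information the abstract suggests full MVN may discard.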

Corrosion Image Monitoring of steel plate by using k-means clustering (k-means 클러스터링을 이용한 강판의 부식 이미지 모니터링)

  • Kim, Beomsoo;Kwon, Jaesung;Choi, Sungwoong;Noh, Jungpil;Lee, Kyunghwang;Yang, Jeonghyeon
    • Journal of the Korean institute of surface engineering / v.54 no.5 / pp.278-284 / 2021
  • Corrosion of steel plate is a common phenomenon that results in gradual destruction caused by a wide variety of environments. Corrosion monitoring tracks the progress of degradation over a long period of time. Corrosion on steel plate appears as discoloration and irregularities on the surface. In this study, we developed a method for quantitatively evaluating the rust formed on steel plate by applying k-means clustering to the corroded area in a given image. The k-means clustering for automated corrosion detection was based on GrabCut segmentation and a Gaussian mixture model (GMM). The image color of the corroded surface at the cut-edge area was analyzed quantitatively in the HSV (Hue, Saturation, Value) color space.
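The clustering step can be illustrated with a minimal sketch of Lloyd's k-means algorithm on scalar values, for instance the hue of each pixel. It omits the GrabCut segmentation and GMM stages mentioned in the abstract. The function names, the starting centroids, and the choice of which cluster counts as rust are all hypothetical.

```python
def kmeans_1d(values, centroids, iterations=20):
    """Lloyd's algorithm on scalars, starting from the given centroids."""
    centroids = list(centroids)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[k]
            )
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


def cluster_fraction(labels, target_label):
    """Share of samples assigned to one cluster, e.g. the rust-colored one."""
    return sum(1 for lab in labels if lab == target_label) / len(labels)
```

With the hypothetical hue samples `[10, 12, 14, 100, 104, 108, 11, 13]` and starting centroids `[0, 120]`, the values split into a low-hue and a high-hue group, and `cluster_fraction` gives the share of pixels in whichever group is taken to be rust.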

How Does Foreign Direct Investment Affect Unbundled Institution? (외국인 직접투자는 제도에 어떻게 영향을 미치는가?)

  • Suh, Hanseok
    • International Area Studies Review / v.15 no.3 / pp.535-558 / 2011
  • Based on Rodrik's four-way partition of institutions (market-creating, market-regulating, market-stabilizing, and market-legitimizing institutions), we analyze how FDI and the interaction between FDI and democracy affect the four kinds of institutions. Using fixed effects and system GMM estimation, we estimate the direct and indirect effects of FDI on institutions in a large panel data set of 186 developing and developed countries for the period 1985-2009. We show that FDI inflows do not have a positive and significant impact on most kinds of institutions, while the interaction between democracy and FDI inflows has a significant and positive effect on market-creating, market-legitimizing, and market-stabilizing institutions. The implication is that FDI inflows do not directly change the quality of institutions but can indirectly improve it as the host country's democracy matures. To our knowledge, this is the first article to empirically test the linkages between FDI and the four-way unbundled institutions.
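For readers unfamiliar with the estimator, the textbook form of system GMM for a dynamic panel is sketched below. The abstract does not state the paper's specification, so the symbols are generic assumptions: outcome $y_{it}$, regressors $x_{it}$, unit effect $\mu_i$, and idiosyncratic error $\varepsilon_{it}$.

```latex
% Dynamic panel model with an unobserved unit effect \mu_i
y_{it} = \alpha\, y_{i,t-1} + x_{it}'\beta + \mu_i + \varepsilon_{it}

% Difference equation: lagged levels instrument the first differences
\mathbb{E}\left[\, y_{i,t-s}\, \Delta\varepsilon_{it} \,\right] = 0,
\qquad s \ge 2

% Level equation: lagged differences instrument the levels
\mathbb{E}\left[\, \Delta y_{i,t-1}\, (\mu_i + \varepsilon_{it}) \,\right] = 0
```

The "system" stacks the differenced and level equations and uses both sets of moment conditions together, which typically helps when the outcome is persistent and lagged levels alone are weak instruments.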

Bank Restructuring and Financial Performance: A Case Study of Commercial Banks in Vietnam

  • DUONG, Tam Thanh Nguyen;NGUYEN, Hoa Quynh
    • The Journal of Asian Finance, Economics and Business / v.8 no.10 / pp.327-339 / 2021
  • This study examines the impact of bank restructuring on the financial performance of commercial banks in Vietnam. The data were obtained from the audited financial statements of 30 Vietnamese commercial banks from 2007 to 2019. Multiple regression analysis was used for the investigation. Financial performance, as evaluated by ROAA, ROEA, and NIM, is the dependent variable. Financial restructuring, ownership restructuring, and operational restructuring are the independent variables. Pooled least squares (Pooled OLS), the fixed effects model (FEM), the random effects model (REM), and the system generalized method of moments (System GMM) are the estimation methods used to increase the accuracy of the regression coefficients. The results show that the following have a positive impact on financial performance: financial restructuring variables such as government intervention and the ratio of equity to total assets; ownership restructuring variables such as the capital adequacy ratio, privatization of state-owned commercial banks, and mergers and acquisitions; operational restructuring variables such as employees, branches, and the cost to total assets; GDP variables; and the second restructuring period. Variables such as the debt-to-capital ratio, bad debt ratio, state ownership ratio, expense-income ratio, and inflation have a negative effect on financial performance.