• Title/Abstract/Keywords: GMM System

An Implementation of Automatic Genre Classification System for Korean Traditional Music (한국 전통음악 (국악)에 대한 자동 장르 분류 시스템 구현)

  • Lee Kang-Kyu;Yoon Won-Jung;Park Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.29-37 / 2005
  • This paper proposes an automatic genre classification system for Korean traditional music. The proposed system accepts queried input music and classifies it, based on its content, into one of six musical genres: Royal Shrine Music, Classical Chamber Music, Folk Song, Folk Music, Buddhist Music, and Shamanist Music. In general, content-based music genre classification consists of two stages: music feature vector extraction and pattern classification. For feature extraction, the system extracts 58-dimensional feature vectors that include STFT-based features (spectral centroid, spectral rolloff, and spectral flux) and coefficient-domain features such as LPC and MFCC; these features are then further optimized using the SFS method. For pattern (genre) classification, k-NN, Gaussian, GMM, and SVM algorithms are considered. In addition, the proposed system adopts the MFC method to address the uncertainty in system performance caused by different query patterns (or portions). The experimental results show successful genre classification performance of over 97% for both the k-NN and SVM classifiers; however, the SVM classifier classifies almost three times faster than k-NN.
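The three STFT-based features named in this abstract can be illustrated on a single magnitude spectrum. The sketch below is not the authors' implementation: the function names, the 85% rolloff fraction, and the unnormalized squared-difference form of flux are assumptions, and other definitions of these features are in common use.

```python
def spectral_centroid(mag):
    """Magnitude-weighted mean bin index of one spectrum frame."""
    total = sum(mag)
    if total == 0:
        return 0.0
    return sum(k * m for k, m in enumerate(mag)) / total


def spectral_rolloff(mag, fraction=0.85):
    """Lowest bin index below which `fraction` of the total magnitude lies.

    The 0.85 default is an assumed, commonly used value.
    """
    threshold = fraction * sum(mag)
    cumulative = 0.0
    for k, m in enumerate(mag):
        cumulative += m
        if cumulative >= threshold:
            return k
    return len(mag) - 1


def spectral_flux(prev_mag, mag):
    """Sum of squared bin-wise differences between consecutive frames."""
    return sum((a - b) ** 2 for a, b in zip(mag, prev_mag))


# Two toy magnitude spectra standing in for consecutive STFT frames.
frame_a = [0.0, 1.0, 2.0, 1.0, 0.0]
frame_b = [0.0, 0.0, 2.0, 2.0, 0.0]
centroid = spectral_centroid(frame_a)
rolloff = spectral_rolloff(frame_a)
flux = spectral_flux(frame_a, frame_b)
```

Bin indices can be converted to hertz by multiplying by the sample rate divided by the FFT size, neither of which is given in the abstract.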

Drought risk assessment considering regional socio-economic factors and water supply system (지역의 사회·경제적 인자와 용수공급체계를 고려한 가뭄 위험도 평가)

  • Kim, Ji Eun;Kim, Min Ji;Choi, Sijung;Lee, Joo-Heon;Kim, Tae-Woong
    • Journal of Korea Water Resources Association / v.55 no.8 / pp.589-601 / 2022
  • Although drought is a natural phenomenon, its damage arises in combination with regional physical and social factors. In particular, drought causes great socio-economic damage related to the supply of and demand for various types of water. Even when meteorological droughts occur with similar severity, their impact varies depending on regional characteristics and the water supply system. Therefore, this study assessed regional drought risk considering regional socio-economic factors and the water supply system. Drought hazard was assessed by grading the joint drought management index (JDMI), which represents water shortage. Drought vulnerability was assessed by a weighted average of 10 socio-economic factors using Entropy, Principal Component Analysis (PCA), and a Gaussian Mixture Model (GMM). Drought response capacity, which represents regional water supply factors, was assessed using Bayesian networks. Drought risk was determined as the cube root of the product of hazard, vulnerability, and response capacity. For drought hazard, meaning the possibility of failure to supply water, Goesan-gun was the highest at 0.81. For drought vulnerability, Daejeon was the most vulnerable at 0.61. Considering the regional water supply system, Sejong had the lowest drought response capacity. Finally, drought risk was the highest in Cheongju-si. This study identified regional drought risk and the causes of drought vulnerability, which will be useful in preparing drought mitigation policies that consider regional characteristics.
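The risk combination in this abstract is read here as the geometric mean of the three indices (the cube root of their product, which equals the product of their cube roots). The sketch below assumes each index is a non-negative number and that the response-capacity index is already scaled so that larger values mean higher risk; the abstract does not state that scaling, and the function name and sample values are hypothetical.

```python
def drought_risk(hazard, vulnerability, response_capacity):
    """Geometric mean of three non-negative indices (assumed formulation)."""
    values = (hazard, vulnerability, response_capacity)
    if any(v < 0 for v in values):
        raise ValueError("indices must be non-negative")
    return (hazard * vulnerability * response_capacity) ** (1.0 / 3.0)


# Hypothetical index values for one region, for illustration only.
risk = drought_risk(0.8, 0.6, 0.5)
```

A geometric mean drops to zero whenever any one index is zero, so a region with no hazard is assigned no risk regardless of its vulnerability.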

Confidence Measure of Forensic Speaker Identification System According to Pitch Variances (과학수사용 화자 식별 시스템의 피치 차이에 따른 신뢰성 척도)

  • Kim, Min-Seok;Kim, Kyung-Wha;Yang, IL-Ho;Yu, Ha-Jin
    • Phonetics and Speech Sciences / v.2 no.3 / pp.135-139 / 2010
  • Forensic speaker identification requires high accuracy and reliability. However, current speaker identification technology does not yet meet this demand. Therefore, evaluating the confidence of results is one of the key issues in forensic speaker identification. In this paper, we propose a new confidence measure for forensic speaker identification systems. It is based on pitch differences between the registered utterances of the identified speaker and the test utterance. In the experiments, we evaluate this confidence measure on speech identification tasks in various environments. The results show that the proposed measure is a good indicator of whether an identification result is reliable.

On the Simple Speaker Verification System Using Tolerance Interval Analysis Without Background Speaker Models (Tolerance Interval Analysis를 이용한 배경화자 없는 간단한 화자인증시스템에 관한 연구)

  • Choi, Hong-Sub
    • MALSORI / no.56 / pp.147-158 / 2005
  • In this paper, we focus on developing a simplified speaker verification algorithm without background speaker models, intended for speaker verification systems embedded in portable terminals such as mobile phones and PMPs. According to tolerance interval analysis, the population of a speaker's model can be represented by a suitable number of selected independent samples of that speaker model. We can therefore construct a representative speaker model and threshold under a specified confidence level and coverage. Using the proposed algorithm with 40 samples, the experiments show a false rejection rate of 3.0% and a false acceptance rate of 4.3%, which compare favorably with the conventional method's results of 5.4% and 5.5%, respectively. The next step of this research will address suitable adaptation methods to overcome speech variation caused by aging and operating environments.
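The abstract does not say which tolerance-interval formulation the author uses. As one illustration of how a sample count relates to confidence level and coverage, the sketch below uses the standard distribution-free one-sided bound: with n independent samples, the sample extreme bounds at least a fraction p of the population with confidence 1 - p^n. The function names are hypothetical, and this may differ from the paper's method.

```python
def one_sided_confidence(n, coverage):
    """Confidence that the sample extreme bounds `coverage` of the population."""
    if n < 1 or not 0.0 < coverage < 1.0:
        raise ValueError("need n >= 1 and 0 < coverage < 1")
    return 1.0 - coverage ** n


def min_samples(coverage, confidence):
    """Smallest n whose one-sided bound reaches the requested confidence."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("need 0 < confidence < 1")
    n = 1
    while one_sided_confidence(n, coverage) < confidence:
        n += 1
    return n


# Confidence obtained with 40 samples at 90% coverage under this bound.
confidence_at_40 = one_sided_confidence(40, 0.90)
```

Under this particular bound, 40 samples at 90% coverage give a confidence of roughly 0.985; whether the paper's 40 samples were chosen this way is not stated.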

An Intelligent Automatic Early Detection System of Forest Fire Smoke Signatures using Gaussian Mixture Model

  • Yoon, Seok-Hwan;Min, Joonyoung
    • Journal of Information Processing Systems / v.9 no.4 / pp.621-632 / 2013
  • The most important requirements for a forest fire detection system are to extract smoke accurately from the image and to distinguish it clearly from phenomena with similar qualities, such as clouds and fog. This research presents an intelligent forest fire detection algorithm based on image processing with the Gaussian Mixture Model (GMM), which can be applied to detect smoke in a forest at the earliest possible time. For smoke plume segmentation in the forest, the GMM is typically made adaptive so that its parameters can track changing illumination, and made more complex so that it can represent multimodal backgrounds more accurately. In this paper, we also suggest a way to classify smoke plumes via feature extraction using HSL (Hue, Saturation, and Lightness or Luminance) color space analysis.
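To make the HSL analysis step concrete, the sketch below converts an RGB pixel to HSL with Python's standard `colorsys` module and applies a simple grayish-pixel test. The thresholds and the `looks_like_smoke` rule are hypothetical placeholders, not values from the paper, which does not describe its classification rule in the abstract.

```python
import colorsys


def rgb_to_hsl(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, lightness)."""
    # colorsys returns the components in H, L, S order.
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, l


def looks_like_smoke(r, g, b, max_saturation=0.2,
                     min_lightness=0.5, max_lightness=0.9):
    """Hypothetical rule: low-saturation, moderately bright pixels."""
    _, s, l = rgb_to_hsl(r, g, b)
    return s <= max_saturation and min_lightness <= l <= max_lightness


gray_pixel = looks_like_smoke(180, 180, 180)   # grayish, smoke-like candidate
foliage_pixel = looks_like_smoke(0, 128, 0)    # saturated green
```

A color test of this kind would only screen candidate pixels; separating smoke from clouds and fog, as the abstract requires, needs additional features.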

Emotional Speaker Recognition using Emotional Adaptation (감정 적응을 이용한 감정 화자 인식)

  • Kim, Weon-Goo
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.7 / pp.1105-1110 / 2017
  • Speech with various emotions degrades the performance of speaker recognition systems. In this paper, a speaker recognition method using emotional adaptation is proposed to improve the performance of speaker recognition systems on affective speech. For emotional adaptation, an emotional speaker model is generated from the emotion-free speaker model using a small amount of affective training speech and a speaker adaptation method. Since it is not easy to obtain sufficient affective speech from a speaker for training, using only a small amount of affective speech is very practical in real situations. The proposed method was evaluated using a Korean database containing four emotions. Experimental results show that the proposed method performs better than conventional methods in speaker verification and speaker recognition.

VTG based Moving Target Tracking Performance Improvement Method using MITL System in a Maritime Environment (해상환경에서 MITL 시스템을 활용한 VTG 기반 기동표적 추적성능 개선 기법)

  • Baek, Inhye;Woo, S.H. Arman
    • Journal of Korea Multimedia Society / v.22 no.3 / pp.357-365 / 2019
  • In this paper, we suggest a method for tracking multiple moving objects in maritime environments. Images are acquired using IR (InfraRed) camera sensors on an airborne platform. In maritime conditions, the quality of IR images can be significantly degraded by clutter, which directly gives rise to tracking loss. To reduce the effects of clutter, we introduce a technical approach based on a Man-In-The-Loop (MITL) system to enhance tracking performance. To demonstrate the robustness of the proposed approach, which is based on the VTG (Valid Tracking Gate), simulations are conducted using airborne IR video sequences. The tracking performance is then compared with existing Kalman filter tracking techniques.

Machine Learning Model for Low Frequency Noise and Bias Temperature Instability (저주파 노이즈와 BTI의 머신 러닝 모델)

  • Kim, Yongwoo;Lee, Jonghwan
    • Journal of the Semiconductor & Display Technology / v.19 no.4 / pp.88-93 / 2020
  • Based on the capture-emission energy (CEE) maps of CMOS devices, a physics-informed machine learning model for the bias temperature instability (BTI)-induced threshold voltage shifts and low frequency noise is presented. In order to incorporate physics theories into the machine learning model, the integration of artificial neural network (IANN) is employed for the computation of the threshold voltage shifts and low frequency noise. The model combines the computational efficiency of IANN with the optimal estimation of Gaussian mixture model (GMM) with soft clustering. It enables full lifetime prediction of BTI under various stress and recovery conditions and provides accurate prediction of the dynamic behavior of the original measured data.
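The "soft clustering" attributed to the GMM here means that each data point is assigned a probability of belonging to every mixture component rather than a single hard label. The sketch below computes these responsibilities for a one-dimensional mixture with known parameters; the parameter values are illustrative and unrelated to the CEE-map data in the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def responsibilities(x, weights, means, stds):
    """Posterior probability of each mixture component given the point x."""
    weighted = [w * gaussian_pdf(x, m, s)
                for w, m, s in zip(weights, means, stds)]
    total = sum(weighted)  # mixture density at x; assumed non-zero here
    return [v / total for v in weighted]


# Illustrative two-component mixture.
weights, means, stds = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
at_midpoint = responsibilities(0.0, weights, means, stds)
near_second = responsibilities(1.0, weights, means, stds)
```

In full GMM fitting, this computation is the expectation step, alternated with re-estimation of the weights, means, and standard deviations.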

Performance assessments of feature vectors and classification algorithms for amphibian sound classification (양서류 울음 소리 식별을 위한 특징 벡터 및 인식 알고리즘 성능 분석)

  • Park, Sangwook;Ko, Kyungdeuk;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.36 no.6 / pp.401-406 / 2017
  • This paper presents a performance assessment of several key algorithms for amphibian species sound classification. First, 9 target species, including endangered species, are defined and a database of their sounds is built. For the performance assessment, three feature vectors, MFCC (Mel Frequency Cepstral Coefficient), RCGCC (Robust Compressive Gammachirp filterbank Cepstral Coefficient), and SPCC (Subspace Projection Cepstral Coefficient), and three classifiers, GMM (Gaussian Mixture Model), SVM (Support Vector Machine), and DBN-DNN (Deep Belief Network - Deep Neural Network), are considered. In addition, an i-vector based classification system, which is widely used for speaker recognition, is also assessed on this task. Experimental results indicate that SPCC-SVM achieved the best performance at 98.81%, while the other methods also performed well, at above 90%.

A Study on the Signal Processing for Content-Based Audio Genre Classification (내용기반 오디오 장르 분류를 위한 신호 처리 연구)

  • 윤원중;이강규;박규식
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.6 / pp.271-278 / 2004
  • In this paper, we propose a content-based audio genre classification algorithm that uses a digital signal processing approach to automatically classify query audio into five genres: Classic, Hiphop, Jazz, Rock, and Speech. From a 20-second query audio file, the audio signal is segmented into 23 ms frames with a non-overlapping Hamming window, and 54-dimensional feature vectors, including Spectral Centroid, Rolloff, Flux, LPC, and MFCC, are extracted from each query audio. For the classification algorithm, k-NN, Gaussian, and GMM classifiers are used. To choose the optimum features from the 54-dimensional feature vectors, the SFS (Sequential Forward Selection) method is applied to derive 10-dimensional optimum features, which are then used for the genre classification algorithm. The experimental results verify the superior performance of the proposed method, which provides a success rate of nearly 90% for genre classification, an improvement of 10% to 20% over previous methods. For an actual user system environment, feature vectors are extracted from a random interval of the query audio, and the method shows an overall success rate of 80%, except in the extreme cases of the beginning and ending portions of the query audio file.
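Sequential Forward Selection, as named in this abstract, builds a feature subset greedily: starting from an empty set, it repeatedly adds the single feature that most improves a scoring criterion until the target size is reached. The sketch below is a generic version; the `score` callback and the toy additive criterion are assumptions, since the abstract does not state which criterion the authors optimized.

```python
def sequential_forward_selection(n_features, score, k):
    """Greedily pick k feature indices that maximize score(subset)."""
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k and remaining:
        # Try adding each remaining feature and keep the best one.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy criterion: each feature has a fixed, hypothetical usefulness value.
usefulness = [0.1, 0.9, 0.5, 0.7]


def toy_score(subset):
    return sum(usefulness[f] for f in subset)


chosen = sequential_forward_selection(len(usefulness), toy_score, k=3)
```

In practice the criterion would be something like cross-validated classification accuracy on the selected features, which makes each step far more expensive than this toy example; in the abstract's setting, `n_features` would be 54 and `k` would be 10.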