• Title/Abstract/Keywords: GMM parameters

Search results: 60 items (processing time: 0.022 s)

Personal Information Extraction Using A Microphone Array

  • 김혜진;윤호섭
    • 로봇학회논문지
    • /
    • Vol. 3, No. 2
    • /
    • pp.131-136
    • /
    • 2008
  • This paper proposes a method for extracting personal information using a microphone array. The most useful personal information, particularly about customers, is age and gender. Based on this information, robot service applications can satisfy users by offering services adapted to the needs of specific user groups, such as adults and children or females and males. We use a Gaussian Mixture Model (GMM) as the classifier and Mel-Frequency Cepstral Coefficients (MFCCs) as the voice feature. The main aim of this paper is to identify the voice source parameters related to age and gender and to classify these two characteristics simultaneously. In a ubiquitous environment, voices captured through selected channels of a microphone array help reduce background noise.

Performance Comparison of Feature Parameters and Classifiers for Speech/Music Discrimination

  • 김형순;김수미
    • 대한음성학회지:말소리
    • /
    • No. 46
    • /
    • pp.37-50
    • /
    • 2003
  • In this paper, we evaluate and compare the performance of speech/music discrimination based on various feature parameters and classifiers. As feature parameters, we consider High Zero Crossing Rate Ratio (HZCRR), Low Short Time Energy Ratio (LSTER), Spectral Flux (SF), Line Spectral Pair (LSP) distance, entropy, and dynamism. We also examine three classifiers: k-Nearest Neighbor (k-NN), Gaussian Mixture Model (GMM), and Hidden Markov Model (HMM). According to our experiments, LSP distance and the phoneme-recognizer-based feature set (entropy and dynamism) perform well, while performance differences between classifiers are not significant. When all six feature parameters are used, an average speech/music discrimination accuracy of up to 96.6% is achieved.
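
The two frame-ratio features named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the thresholds, so the factors 1.5 (HZCRR) and 0.5 (LSTER) are assumed defaults, and the function names and the framing of the signal into equal-length frames are hypothetical.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def hzcrr(frames, factor=1.5):
    """Share of frames whose ZCR exceeds `factor` times the window mean.

    The factor 1.5 is an assumed default, not a value from the abstract.
    """
    rates = [zero_crossing_rate(f) for f in frames]
    mean_rate = sum(rates) / len(rates)
    return sum(1 for r in rates if r > factor * mean_rate) / len(rates)


def lster(frames, factor=0.5):
    """Share of frames whose energy is below `factor` times the window mean.

    The factor 0.5 is an assumed default, not a value from the abstract.
    """
    energies = [short_time_energy(f) for f in frames]
    mean_energy = sum(energies) / len(energies)
    return sum(1 for e in energies if e < factor * mean_energy) / len(energies)
```

Both functions take the list of frames that make up one analysis window and return a ratio between 0 and 1, which could then be passed to any of the classifiers the abstract compares.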

Segmentation of Color Image Using the Deterministic Annealing EM Algorithm

  • 박종현;박순영;조완현
    • 대한전자공학회: Conference Proceedings
    • /
    • Proceedings of the 대한전자공학회 1999 Fall General Conference
    • /
    • pp.569-572
    • /
    • 1999
  • In this paper, we present a color image segmentation algorithm based on statistical models. A novel deterministic annealing Expectation Maximization (EM) formula is derived to estimate the parameters of the Gaussian Mixture Model (GMM) that statistically represents the multi-colored objects. The experimental results show that the proposed deterministic annealing EM yields a globally optimal solution for ML parameter estimation and that the image field is segmented efficiently using the parameter estimates.
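
The abstract does not reproduce the paper's annealing formula, so the sketch below shows only the commonly used form of a deterministic annealing E-step, in which each component's joint density is raised to an inverse temperature `beta` before normalisation. The 1-D setting, the function names, and the parameter `beta` are assumptions for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def annealed_responsibilities(x, weights, means, variances, beta):
    """Posterior of each component for sample x, tempered by beta in (0, 1].

    beta = 1 gives the ordinary EM posterior; a small beta flattens the
    posterior, which is what lets annealing escape poor initialisations.
    """
    powered = [
        (w * gaussian_pdf(x, m, v)) ** beta
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(powered)
    return [p / total for p in powered]
```

In a full annealing schedule, `beta` would start small and be increased toward 1, with the usual M-step run to convergence at each value.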

A Hardware Implementation of Moving Object Detection Algorithm using Gaussian Mixture Model

  • 김경훈;안효식;신경욱
    • 한국정보통신학회: Conference Proceedings
    • /
    • 한국정보통신학회 2015 Spring Conference
    • /
    • pp.407-409
    • /
    • 2015
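  • A moving object detection (MOD) algorithm based on a Gaussian mixture model (GMM) and background subtraction was implemented in hardware. The implemented MOD processor generates and updates the background using EGML (Effective Gaussian Mixture Learning), reduces hardware complexity by approximating part of the EGML computation, and improves operating speed through pipelining. In addition, the Gaussian parameters are adjustable, so that moving object detection performance can be improved under various conditions. The designed circuit was verified in hardware using an FPGA-in-the-loop approach and was evaluated to operate at a clock frequency of up to 109 MHz on an XC5VSX95T FPGA device.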

Lip Shape Representation and Lip Boundary Detection Using Mixture Model of Shape

  • 장경식;이임건
    • 한국멀티미디어학회논문지
    • /
    • Vol. 7, No. 11
    • /
    • pp.1531-1539
    • /
    • 2004
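  • This paper proposes a method for effectively extracting the lip boundary. The lip shape is represented using a PDM (Point Distribution Model) and principal component analysis, and the lip boundary is represented based on a GLDM (Gray Level Distribution Model). Lip boundary extraction is reduced to the problem of optimizing an objective function that measures how well the input image fits the model, and the Down Hill Simplex algorithm is used for the optimization. To address convergence to local minima during the search, the shape coefficients of the lip shape model are represented using a GMM (Gaussian Mixture Model). The GMM over the shape coefficients is used to find the approximate lip shape, and the mixture component selected at that point is then used to adjust the lip shape during the search. This resolves the problem of converging to a local minimum and failing to find the exact lip position. Experiments on a number of images gave good results.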

Identification of the Movement of Underlying Asset in Real Option Analysis: Studies on Industrial Parametric Table

  • 이정동;강아리;정종욱
    • 기술경영경제학회: Conference Proceedings
    • /
    • Proceedings of the 기술경영경제학회 2004 24th Winter Conference
    • /
    • pp.222-245
    • /
    • 2004
  • This paper proposes parametric tables for each industry group in Korea. These tables can serve as useful criteria for those who need an accurate valuation of a company, technology, or industry through Real Option Analysis (ROA), since identifying the movement of the underlying asset is the first step of such an analysis. To provide accurate parameter estimates and the preferred model for each industry group, we cover ROA, stochastic processes, and parametric estimation methods such as the Generalized Method of Moments (GMM) and Maximum Likelihood Estimation (MLE). In addition, specific industry groups defined independently in this paper, such as the Internet service group and the mobile telecommunication service group, are examined in terms of how their values move, and the best-fitting stochastic model is suggested for each.
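
As an illustration of the MLE step, the sketch below estimates the drift and volatility of geometric Brownian motion from a price series. The choice of geometric Brownian motion is an assumption made here for the example; the abstract does not say which stochastic processes the paper fits, and the function name and arguments are hypothetical. Under this model the log-returns over a step `dt` are normal with mean `(mu - sigma**2 / 2) * dt` and variance `sigma**2 * dt`, which gives closed-form estimates.

```python
import math


def fit_gbm_mle(prices, dt):
    """Maximum likelihood estimates (mu, sigma) for geometric Brownian motion.

    prices: observations of the underlying asset at equal time steps.
    dt: length of one time step, in the unit used for mu and sigma.
    """
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    n = len(log_returns)
    mean_r = sum(log_returns) / n
    # The MLE of the variance divides by n, not n - 1.
    var_r = sum((r - mean_r) ** 2 for r in log_returns) / n
    sigma = math.sqrt(var_r / dt)
    mu = mean_r / dt + 0.5 * sigma ** 2
    return mu, sigma
```

For example, `fit_gbm_mle(monthly_prices, dt=1.0 / 12.0)` would return annualised estimates from a hypothetical list of monthly prices.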

Realization a Text Independent Speaker Identification System with Frame Level Likelihood Normalization

  • 김민정;석수영;김광수;정현열
    • 융합신호처리학회논문지
    • /
    • Vol. 3, No. 1
    • /
    • pp.8-14
    • /
    • 2002
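  • In this paper, we implement a real-time text-independent speaker identification system using a Gaussian mixture model and carry out recognition experiments. To improve the performance of the system, we apply the likelihood normalization method, which has shown good results in speaker verification systems, and evaluate it experimentally. The system consists of three main stages: preprocessing, speaker model generation, and speaker identification. In the preprocessing stage, CMN (cepstral mean normalization) and silence removal are applied to account for variation in the speaker's utterances. In the speaker model generation stage, speaker models are built with a GMM (Gaussian mixture model), which represents the acoustic characteristics of a speaker's utterances well, and the GMM parameters are optimized by MLE (maximum likelihood estimation). In the speaker identification stage, the likelihood is computed by ML (maximum likelihood) from the trained data and the test data; when likelihood normalization is applied, the likelihood is computed frame by frame. The computed likelihood is expressed as a score ($S_C$), and the speaker with the highest score is selected as the recognized speaker. Text-independent sentences were used as the utterances for speaker recognition. The ETRI445 DB and KLE452 DB were used for the experiments, and only cepstral coefficients and regression coefficients were used as feature parameters. Experiments were run with varying numbers of enrolled speakers, using both the conventional speaker identification method and the frame-level likelihood normalization method. The results show that frame-level likelihood normalization achieves a higher recognition rate than the conventional method as the number of speakers increases.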
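
The identification stage can be sketched roughly as below. The abstract does not state the normalization formula, so the variant used here (subtracting, per frame, the log of the mean likelihood of all other enrolled speakers) is an assumption, as are the diagonal-covariance GMM layout and the function names. Each GMM is a list of `(weight, mean_vector, variance_vector)` tuples.

```python
import math


def log_gmm(frame, gmm):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    terms = []
    for weight, mean, var in gmm:
        log_pdf = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(frame, mean, var)
        )
        terms.append(math.log(weight) + log_pdf)
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def identify(frames, speaker_gmms, normalize=False):
    """Return (best speaker, scores); each score sums per-frame log-likelihoods.

    With normalize=True at least two speakers must be enrolled.
    """
    scores = {name: 0.0 for name in speaker_gmms}
    for frame in frames:
        logs = {name: log_gmm(frame, g) for name, g in speaker_gmms.items()}
        for name, value in logs.items():
            if normalize:
                # Assumed variant: subtract the log of the mean likelihood
                # of all other enrolled speakers for this frame.
                others = [v for n, v in logs.items() if n != name]
                peak = max(others)
                value -= peak + math.log(
                    sum(math.exp(v - peak) for v in others) / len(others)
                )
            scores[name] += value
    return max(scores, key=scores.get), scores
```

The returned score plays the role of the per-speaker score in the abstract, and the speaker with the highest score is reported as the identified speaker.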

Multi-Level Segmentation of Infrared Images with Region of Interest Extraction

  • Yeom, Seokwon
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 16, No. 4
    • /
    • pp.246-253
    • /
    • 2016
  • Infrared (IR) imaging has been researched for various applications such as surveillance. IR radiation makes it possible to detect the thermal characteristics of objects under low-light conditions. However, automatic segmentation to find the object of interest is challenging because IR detectors often provide images with low spatial and contrast resolution and without color or texture information. Another hindrance is that the image can be degraded by noise and clutter. This paper proposes multi-level segmentation for extracting regions of interest (ROIs) and objects of interest (OOIs) in the IR scene. Each level of the multi-level segmentation is composed of a k-means clustering algorithm, an expectation-maximization (EM) algorithm, and a decision process. The k-means clustering initializes the parameters of the Gaussian mixture model (GMM), and the EM algorithm estimates those parameters iteratively. During the multi-level segmentation, the area extracted at one level becomes the input to the next level. Thus, segmentation is performed consecutively, narrowing the area to be processed at each level. The foreground objects are individually extracted from the final ROI windows. In the experiments, the effectiveness of the proposed method is demonstrated using several IR images in which human subjects are captured at a long distance. The average probability of error is shown to be lower than that obtained from conventional methods such as the Gonzalez, Otsu, k-means, and EM methods.
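
One level of the k-means plus EM step described above might look like the following for scalar pixel intensities. This is a minimal sketch under stated assumptions: a 1-D GMM over grayscale values, evenly spread initial centers, and hypothetical function names. The decision process and the multi-level loop from the abstract are not shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def kmeans_1d(values, k, iterations=20):
    """Plain k-means on scalars; centers start evenly spread over the range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center when a cluster receives no values.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


def fit_gmm_1d(values, k, iterations=50, min_var=1e-6):
    """k-means initialisation followed by standard EM for a 1-D GMM."""
    means, groups = kmeans_1d(values, k)
    n = len(values)
    weights = [max(len(g), 1) / n for g in groups]
    total_weight = sum(weights)
    weights = [w / total_weight for w in weights]
    variances = [
        max(sum((v - m) ** 2 for v in g) / len(g), min_var) if g else 1.0
        for g, m in zip(groups, means)
    ]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            joint = [w * gaussian_pdf(v, m, s) for w, m, s in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint] if total > 0.0 else [1.0 / k] * k)
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            variances[j] = max(
                sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj,
                min_var,
            )
    return weights, means, variances
```

A pixel could then be assigned to the component with the largest weighted density, and the pixels of the brightest component passed on as the input area for the next level.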

Effective Combination of Temporal Information and Linear Transformation of Feature Vector in Speaker Verification

  • 서창우;조미화;임영환;전성채
    • 말소리와 음성과학
    • /
    • Vol. 1, No. 4
    • /
    • pp.127-132
    • /
    • 2009
  • The feature vectors used in conventional speaker recognition (SR) systems may be highly correlated with their neighbors. To improve SR performance, many researchers have adopted linear transformation methods such as principal component analysis (PCA). In general, the linear transformation is applied to the concatenation of the static features and their dynamic features. However, a linear transformation based on both the static and the dynamic features is more complex than one based on the static features alone because of the higher feature dimension. To overcome this problem, we propose an efficient method that combines linear transformation with temporal information of the features to reduce complexity and improve performance in speaker verification (SV). The proposed method first performs a linear transformation using PCA coefficients. The delta parameters that carry temporal information are then obtained from the transformed features. The proposed method requires a covariance matrix only 1/4 the size of the one needed when PCA is computed on the combined static and dynamic features. In addition, the delta parameters are extracted from the linearly transformed features after the dimension of the static features has been reduced. Compared with PCA and conventional methods in terms of equal error rate (EER) in SV, the proposed method performs better while requiring less storage space and lower complexity.
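
The sketch below illustrates two points from the abstract: computing delta parameters from features that have already been linearly transformed, and the 1/4 covariance-size figure. The regression formula is the widely used delta definition, which the abstract does not spell out, so treat it, the `window` default, and the function names as assumptions. The PCA projection itself is assumed to have been applied beforehand. The size ratio assumes the dynamic features have the same dimension as the static ones.

```python
def delta_features(frames, window=2):
    """Regression-based delta of each coefficient over +/- `window` frames.

    frames: list of equal-length feature vectors, e.g. PCA-projected
    static features (the projection is assumed to be done elsewhere).
    """
    norm = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    deltas = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[0])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, last)][d]  # clamp at the edges
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / norm)
        deltas.append(row)
    return deltas


def covariance_entries(dim):
    """Number of entries in a full dim x dim covariance matrix."""
    return dim * dim
```

With, say, 12 static coefficients, PCA on static features alone needs a 12 x 12 covariance matrix (144 entries), whereas PCA on static plus delta features needs 24 x 24 (576 entries), which is the factor of four mentioned in the abstract.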

An Intelligent Automatic Early Detection System of Forest Fire Smoke Signatures using Gaussian Mixture Model

  • Yoon, Seok-Hwan;Min, Joonyoung
    • Journal of Information Processing Systems
    • /
    • Vol. 9, No. 4
    • /
    • pp.621-632
    • /
    • 2013
  • The most important requirements for a forest fire detection system are to extract smoke accurately from an image and to distinguish smoke clearly from phenomena with a similar appearance, such as clouds and fog. This research presents an intelligent forest fire detection algorithm based on image processing with the Gaussian Mixture Model (GMM), which can be applied to detect smoke in a forest at the earliest possible time. GMM-based approaches usually make the model adaptive so that its parameters can track changing illumination, and make it more complex so that it can represent multimodal backgrounds more accurately for smoke plume segmentation in the forest. In this paper, we also suggest a way to classify smoke plumes through feature extraction using HSL (Hue, Saturation, and Lightness or Luminance) color space analysis.
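
As a rough illustration of the HSL analysis step, the sketch below converts an RGB pixel to HSL with Python's standard `colorsys` module and applies a simple grayish-pixel rule. The rule and all of its thresholds are hypothetical placeholders, not values from the paper, and a rule this simple would not by itself separate smoke from clouds or fog.

```python
import colorsys


def rgb_to_hsl(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, lightness)."""
    # colorsys returns the components in H, L, S order.
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, l


def looks_like_smoke(r, g, b, max_saturation=0.2, min_lightness=0.4, max_lightness=0.9):
    """Hypothetical rule: smoke pixels are grayish with mid-to-high lightness.

    All three thresholds are illustrative assumptions.
    """
    _, s, l = rgb_to_hsl(r, g, b)
    return s <= max_saturation and min_lightness <= l <= max_lightness
```

In a pipeline like the one the abstract describes, a check of this kind would presumably be applied only to regions that the GMM background model has already flagged as foreground.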