• Title/Abstract/Keywords: gaussian mixture model

Search Result 417, Processing Time 0.027 seconds

Stable Human Region Detection in Video Call Situations (영상 통화 상황에서 안정적인 사람 영역 검출 방법)

  • Heo, Seon;Gu, Hyeong-Il;Jo, Nam-Ik
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2013.06a / pp.244-247 / 2013
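  • This paper proposes a method for stably separating the human region from the background in video from video calls, webcams, or video conferencing. The method imposes no constraints such as a fixed camera, and it segments the human region automatically, without any user input, even in freely moving video. In the first frame, face detection gives a rough estimate of the person's location, the background and the human region are each modeled with a Gaussian Mixture Model, and the models are updated efficiently at every frame. The temporal continuity of video is also built into the design of the energy function, so the human region changes little between frames and the result is stable. Experiments confirm that, compared with existing methods, the proposed method has fewer constraints, requires no user input, and segments the human region stably.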

A Study of Continuous Speaker Recognition for Intelligent Responsive Space (지능형 반응공간을 위한 연속적 화자인식에 관한 연구)

  • Kwon, Soon-Il
    • HCI Society of Korea: Conference Proceedings (한국HCI학회:학술대회논문집) / 2007.02a / pp.293-297 / 2007
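  • In developing an Intelligent Responsive Space to realize Human Computer Interaction technology, speech information can be useful in many ways. One kind of information obtainable from a speech signal is the identity of the speaker, determined through speaker recognition. This paper proposes a new method that improves speaker recognition performance, even in environments where recognition is difficult, by selectively using the feature vectors extracted from the speech signal. Performance improves when feature vectors that are likely to cause recognition errors are excluded from the recognition decision. In experiments, the method achieved a relative performance improvement of 20% to 51% over the conventional method using only short speech segments of 0.25 to 2 seconds. With the proposed method, speakers can be recognized continuously, at a finer granularity and more accurately than with existing methods.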

A Classification Method Using Data Reduction

  • Uhm, Daiho;Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems / v.12 no.1 / pp.1-5 / 2012
  • Data reduction has been widely used in data mining to make analysis more convenient. Principal component analysis (PCA) and factor analysis (FA) are popular techniques. PCA and FA reduce the number of variables to avoid the curse of dimensionality, in which computing time grows exponentially with the number of variables. Many methods have therefore been published for dimension reduction. Data augmentation is another approach to analyzing data efficiently. The support vector machine (SVM) algorithm is a representative technique for dimension augmentation: the SVM maps the original data to a high-dimensional feature space to obtain the optimal decision plane. Both data reduction and augmentation have been used to solve diverse problems in data analysis. In this paper, we compare the strengths and weaknesses of dimension reduction and augmentation for classification and propose a classification method using data reduction. We carry out comparative experiments to verify the performance of the proposed method.

Detection of Abnormal Signals in Gas Pipes Using Neural Networks

  • Min, Hwang-Ki;Park, Cheol-Hoon
    • Proceedings of the IEEK Conference / 2008.06a / pp.669-670 / 2008
  • In this paper, we present a real-time system that detects abnormal events on gas pipes, based on signals observed through audio sensors attached to them. First, features are extracted from these signals so that they are robust to noise and invariant to the distance between a sensor and the spot at which an abnormal event, such as an attack on the gas pipes, occurs. Then, a classifier is constructed to detect abnormal events using neural networks. It combines two neural network models, a Gaussian mixture model and a multi-layer perceptron, to reduce missed alarms and false alarms. The former works to prevent missed alarms and the latter to prevent false alarms. Experimental results with real data from an actual gas system show that the proposed system is effective in detecting dangerous events in real time, with an accuracy of 92.9%.

A Phonetic Study of 'Sasang Constitution' (음성학적으로 본 사상체질)

  • Moon, Seung-Jae;Tak, Ji-Hyun;Hwang, Hye-Jeong
    • Proceedings of the KSPS conference / 2005.04a / pp.63-66 / 2005
  • Sasang Constitution, one branch of oriental medicine, claims that people can be classified into four different "constitutions": Taeyang, Taeum, Soyang, and Soeum. This study investigates whether the constitutions can be classified accurately from people's voices alone, by analyzing data from 46 different voices whose constitutions had already been determined. Seven source-related parameters and four filter-related parameters were analyzed phonetically, and a GMM (Gaussian mixture model) was applied to the data. Both the phonetic analyses and the GMM showed that all the parameters except one failed to distinguish the constitutions successfully. Even the single exception, the bandwidth of F2, did not give sufficient grounds to serve as the source of distinction. This result suggests one of two conclusions: either the Sasang Constitutions cannot be substantiated with reliable accuracy from the phonetic characteristics of people's voices, or we need to find other parameters that have not been conventionally proposed.

Detection using Optical Flow and EMD Algorithm and Tracking using Kalman Filter of Moving Objects (이동물체들의 Optical flow와 EMD 알고리즘을 이용한 식별과 Kalman 필터를 이용한 추적)

  • Lee, Jung Sik;Joo, Yung Hoon
    • The Transactions of The Korean Institute of Electrical Engineers / v.64 no.7 / pp.1047-1055 / 2015
  • We propose a method for improving the identification and tracking of moving objects in an intelligent video surveillance system. The proposed method consists of three parts: object detection, object recognition, and object tracking. First, we use a GMM (Gaussian Mixture Model) to eliminate the background and extract the moving objects. Next, we propose a labeling technique for recognizing the moving objects and a method for identifying the recognized objects using optical flow and the EMD algorithm. Lastly, we propose a method for tracking the locations of the identified moving object regions using the location information of the moving objects and a Kalman filter. Finally, we demonstrate the feasibility and applicability of the proposed algorithms through experiments.

Personal Information Extraction Using A Microphone Array (마이크로폰어레이를 이용한 사용자 정보추출)

  • Kim, Hye-Jin;Yoon, Ho-Sub
    • The Journal of Korea Robotics Society / v.3 no.2 / pp.131-136 / 2008
  • This paper proposes a method for extracting personal information using a microphone array. Personal information that is particularly useful about customers is age and gender. On the basis of this information, service applications for robots can satisfy users by offering services adapted to the special needs of specific user groups, such as adults and children or females and males. We applied a Gaussian Mixture Model (GMM) as the classifier and Mel-frequency cepstral coefficients (MFCCs) as the voice feature. The major aim of this paper is to discover the voice source parameters of age and gender and to classify these two characteristics simultaneously. In a ubiquitous environment, voices obtained from selected channels of a microphone array are useful for reducing background noise.

Noise Removal for Level Set based Flower Segmentation (레벨셋 기반 꽃 분할을 위한 노이즈 제거)

  • Park, Sang Cheol;Oh, Kang Han;Na, In Seop;Kim, Soo Hyung;Yang, Hyung Jeong;Lee, Guee Sang
    • Smart Media Journal / v.1 no.2 / pp.34-39 / 2012
  • In this paper, a post-processing step is presented that removes noise, yielding a fully automated scheme for segmenting flowers in natural scene images. Segmenting flowers in natural scene images with a level set algorithm produces unexpected, isolated noise because the level set relies only on color and edge information. The experimental results show that the proposed method successfully removes noise in the foreground and background.

Classification of Phornographic Videos Based on the Audio Information (오디오 신호에 기반한 음란 동영상 판별)

  • Kim, Bong-Wan;Choi, Dae-Lim;Lee, Yong-Ju
    • MALSORI / no.63 / pp.139-151 / 2007
  • As the Internet becomes prevalent in our lives, harmful content such as pornographic videos has been increasing on the Internet and has become a very serious problem. To address this, many filtering systems exist, based mainly on keyword- or image-based methods. The main purpose of this paper is to devise a system that classifies pornographic videos based on audio information. As the feature vector, we use the MFCC together with the mel-cepstrum modulation energy (MCME), a modulation energy calculated on the time trajectory of the mel-frequency cepstral coefficients (MFCC). For the classifier, we use the well-known Gaussian mixture model (GMM). The experimental results showed that the proposed system correctly classified 98.3% of pornographic data and 99.8% of non-pornographic data. We expect that the proposed method can be applied to a more accurate classification system that uses both video and audio information.

Speaker Recognition in the Intelligent Service Robot (지능형 서비스 로봇 환경에서의 화자 인식 연구)

  • Ban, Kyu-Dae;Kwak, Keun-Chang;Chung, Yun-Koo
    • Proceedings of the IEEK Conference / 2007.07a / pp.393-394 / 2007
  • Speaker recognition for an intelligent service robot is implemented in this paper. For this purpose, we perform speaker recognition based on a Gaussian Mixture Model (GMM) and use a robot platform called WEVER, a Ubiquitous Robotic Companion (URC) intelligent service robot developed at the Intelligent Robot Research Division of ETRI. The experimental results reveal that the approach presented in this paper yields good identification performance (89.00%) within a distance of 2 meters.
