• Title/Summary/Keyword: EM Clustering


D2D Based Advertisement Dissemination Using Expectation Maximization Clustering (기대최대화 기반 사용자 클러스터링을 통한 D2D 광고 확산)

  • Kim, Junseon;Lee, Howon
    • The Journal of Korean Institute of Communications and Information Sciences / v.42 no.5 / pp.992-998 / 2017
  • For local advertising based on D2D communications, sources want advertisement messages to be diffused to as many unspecified users as possible. Selecting target areas for advertising is a challenging issue if users are uniformly distributed. In this paper, we propose a D2D-based advertisement dissemination algorithm that uses user clustering with expectation-maximization. The user distribution of each cluster can be estimated from the principal components (PCs) obtained from that cluster. That is, PCs enable the target areas and routing paths to be determined properly according to the user distribution. Consequently, advertisement messages can be disseminated to many users. We evaluate the performance of the proposed algorithm with respect to coverage probability and average reception number per user.
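
The idea of estimating a cluster's user distribution from its principal components can be illustrated with a minimal sketch. It is not the authors' algorithm: it covers only the PC step for a single, already-formed cluster, and the function name `principal_axes` and the synthetic user positions are assumptions made for illustration.

```python
import numpy as np

def principal_axes(points):
    """Return (mean, variances, axes) of a cluster, largest variance first."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    cov = np.cov(pts - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    return mean, eigvals[order], eigvecs[:, order]

# Hypothetical user positions in one cluster, stretched along the line y = x
rng = np.random.default_rng(0)
t = rng.normal(0.0, 5.0, size=200)
noise = rng.normal(0.0, 0.5, size=(200, 2))
users = np.column_stack([t, t]) + noise

mean, variances, axes = principal_axes(users)
first_pc = axes[:, 0]   # direction along which the users are most spread out
```

In this sketch the first PC points along the direction in which the users are most spread out, which is the kind of information a dissemination scheme could use to orient a target area or a routing path.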

A Comparison of cluster analysis based on profile of LPGA player profile in 2009 (2009년 여자프로골프선수 프로파일을 이용한 군집방법비교)

  • Min, Dae-Kee
    • Journal of the Korean Data and Information Science Society / v.21 no.3 / pp.471-480 / 2010
  • Cluster analysis is a useful method for finding the number of groups and the group membership of each observation. With the rapid development of computer applications in statistics, a variety of new clustering methods have been studied, such as the EM algorithm and self-organizing maps. The goal of cluster analysis is to find the number of groupings that are meaningful to the analyst. If data are analyzed perfectly with cluster analysis, we can get the same results from discriminant analysis.

Comparative Study of Knowledge Extraction on the Industrial Application (산업분야에서의 지식 정보 추출에 대한 비교연구)

  • Woo, Young-Kwang;Kim, Sung-Sin;Bae, Hyun;Woo, Kwang-Bang
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.05a / pp.251-254 / 2003
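  • Data are expressions of linguistic or numerical values that represent some characteristic. Information is data organized for a purpose, and knowledge is the systematization, as rules, of the relationships among pieces of information for problem solving, pattern classification, or decision making. In most industrial fields today, knowledge is actively being extracted and applied to improve the understanding of systems and to enhance their performance. Knowledge extraction consists of the acquisition, representation, and implementation of knowledge, and the extracted knowledge is derived as rules. This paper surveys, by area, the knowledge extraction methods applied across various industrial fields, and compares and analyzes the results of applying methods such as clustering (CL), input space partitioning (ISP), neuro-fuzzy (NF), neural networks (NN), and the extension matrix (EM) to several test data sets and to real systems.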

Electric Power Consumption Forecasting Method using Data Clustering (데이터 군집화를 이용한 전력 사용량 예측 기법)

  • Park, Jinwoong;Moon, Jihoon;Kim, Yongsung;Hwang, Eenjun
    • Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.571-574 / 2016
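  • Smart grid systems, next-generation intelligent power grids that optimize energy efficiency, have recently become widespread at home and abroad. As a result, energy management system (EMS) technology, which is applied to operate the grid efficiently, is becoming increasingly important. An EMS requires highly accurate energy consumption forecasts, and the more accurate the forecast, the better the energy can be utilized. This paper proposes a new method for improving the accuracy of electric power consumption forecasting. Specifically, we first analyze the environmental factors that affect consumption. Using these factors, we pre-cluster power consumption data that share similar environments. We then forecast consumption from the power consumption data of the cluster most similar to the environmental information of the forecast day. To evaluate the proposed method, we forecast daily power consumption in a range of experiments and measured the accuracy. Compared with existing methods, the proposed method improved forecasting accuracy by up to 52.88%.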

Analysis on Temporal Pattern of Location Data with Time Series Model (시계열 모델을 활용한 위치 데이터의 시간적 패턴 분석)

  • Song, Ha Yoon;Lee, Da Som;Jung, Jun Woo
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.768-771 / 2021
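  • Time series analysis provides techniques for predicting future data from data at earlier time points, and SARIMA is one of the statistical models used in such analysis. In this study, we built an end-to-end process that applies SARIMA to real-time location data we collected ourselves, extracts an individual's movement patterns, and uses them for prediction. First, the location data uploaded to the DB were clustered by distance from key visited places using EM-clustering, an unsupervised learning method. Second, SARIMA was applied to the time intervals between entering and leaving each place to extract periodicity. Finally, these periodicities were analyzed sequentially in order of cluster importance to derive meaningful prediction results.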

Probabilistic reduced K-means cluster analysis (확률적 reduced K-means 군집분석)

  • Lee, Seunghoon;Song, Juwon
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.905-922 / 2021
  • Cluster analysis is one of the unsupervised learning techniques used for discovering clusters when there is no prior knowledge of group membership. K-means, one of the most commonly used cluster analysis techniques, may fail when the number of variables becomes large. In such high-dimensional cases, it is common to perform tandem analysis, that is, K-means cluster analysis after reducing the number of variables with a dimension reduction method. However, there is no guarantee that the reduced dimensions reveal the cluster structure properly. Principal component analysis may mask the cluster structure, especially when variables unrelated to the cluster structure have large variances. To overcome this, techniques that perform dimension reduction and cluster analysis simultaneously have been suggested. This study proposes probabilistic reduced K-means, which recasts reduced K-means (De Soete and Carroll, 1994) in a probabilistic framework. Simulation shows that the proposed method performs better than tandem clustering or clustering without any dimension reduction. When the number of variables is larger than the number of samples in each cluster, probabilistic reduced K-means forms clusters better than non-probabilistic reduced K-means. In an application to a real data set, it revealed a similar or better cluster structure compared to other methods.

Depth Map Pre-processing using Gaussian Mixture Model and Mean Shift Filter (혼합 가우시안 모델과 민쉬프트 필터를 이용한 깊이 맵 부호화 전처리 기법)

  • Park, Sung-Hee;Yoo, Ji-Sang
    • Journal of the Korea Institute of Information and Communication Engineering / v.15 no.5 / pp.1155-1163 / 2011
  • In this paper, we propose a new pre-processing algorithm that is applied to depth maps to improve coding efficiency. The 3DV/FTV group in MPEG is currently working on a standard for 3DVC (3D video coding), but the compression method for depth map images has not yet been settled. In the proposed algorithm, we divide the histogram distribution of a given depth map using a GMM-based EM clustering method and classify the depth map into several layered images. We then apply a different mean shift filter to each classified image according to whether it contains background or foreground. In other words, we try to maximize coding efficiency by preserving the boundary of each object while averaging the region inside the boundary. Experiments on many test images show that the proposed algorithm achieves a bit reduction of 19% to 20% and also reduces computation time.
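
As a rough illustration of splitting depth values into layers with a Gaussian mixture fitted by EM, the sketch below fits a one-dimensional mixture to synthetic depth samples. It is a minimal sketch under stated assumptions, not the paper's method: the function `em_gmm_1d`, the quantile-based initialisation, the number of layers, and the synthetic data are all hypothetical, and the mean shift filtering stage is not shown.

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=100):
    """Fit a k-component 1-D Gaussian mixture with EM; return parameters and labels."""
    x = np.asarray(x, dtype=float)
    # Initialise means at evenly spaced quantiles, with shared variance and equal weights
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var() + 1e-6)
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for numerical stability
        diff = x[:, None] - means[None, :]
        log_pdf = -0.5 * (np.log(2 * np.pi * variances) + diff ** 2 / variances)
        log_w = np.log(weights) + log_pdf
        log_w -= log_w.max(axis=1, keepdims=True)
        resp = np.exp(log_w)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances from the responsibilities
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    # Labels come from the responsibilities of the final E-step
    return weights, means, variances, resp.argmax(axis=1)

# Hypothetical depth samples: a far background layer and a near foreground layer
rng = np.random.default_rng(0)
depth = np.concatenate([rng.normal(50, 5, 300), rng.normal(200, 10, 200)])

weights, means, variances, labels = em_gmm_1d(depth, k=2)
```

Each label assigns a depth sample to one layer, so a per-layer filter could then be applied to the pixels of each layer separately.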

Unsupervised Image Classification through Multisensor Fusion using Fuzzy Class Vector (퍼지 클래스 벡터를 이용하는 다중센서 융합에 의한 무감독 영상분류)

  • 이상훈
    • Korean Journal of Remote Sensing / v.19 no.4 / pp.329-339 / 2003
  • In this study, a decision-level image fusion approach is proposed for unsupervised image classification using images acquired from multiple sensors with different characteristics. For each sensor separately, the proposed method applies an unsupervised image classification scheme based on spatial region-growing segmentation, which makes use of hierarchical clustering, and iteratively computes the maximum likelihood estimates of fuzzy class vectors for the segmented regions with the EM (expectation-maximization) algorithm. The fuzzy class vector is treated as an indicator vector whose elements represent the probabilities that the region belongs to each of the existing classes. The method then combines the classification results of the sensors using the fuzzy class vectors. This approach does not require the high precision in spatial coregistration between the images of different sensors that pixel-level image fusion does. The proposed method was applied to multispectral SPOT and AIRSAR data observed over the north-eastern area of Jeollabuk-do, and the experimental results show that it provides more correct information for classification than the scheme using an augmented vector technique, which is the most conventional approach to pixel-level image fusion.

Bayesian analysis of finite mixture model with cluster-specific random effects (군집 특정 변량효과를 포함한 유한 혼합 모형의 베이지안 분석)

  • Lee, Hyejin;Kyung, Minjung
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.57-68 / 2017
  • Clustering algorithms attempt to find a partition of a finite set of objects into a potentially predetermined number of nonempty subsets. Gibbs sampling of a normal mixture of linear mixed regressions with a Dirichlet prior distribution calculates posterior probabilities when the number of clusters is known. Our approach provides simultaneous partitioning and parameter estimation with the computation of classification probabilities. A Monte Carlo study of curve estimation showed that the model is useful for function estimation. Examples are given to show how these models perform on real data.

Extensions of X-means with Efficient Learning the Number of Clusters (X-means 확장을 통한 효율적인 집단 개수의 결정)

  • Heo, Gyeong-Yong;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.4 / pp.772-780 / 2008
  • K-means is one of the simplest unsupervised learning algorithms for solving the clustering problem. However, K-means has a basic shortcoming: the number of clusters k has to be known in advance. In this paper, we propose extensions of X-means, which can estimate the number of clusters using the Bayesian information criterion (BIC). We introduce two versions of the algorithm, modified X-means (MX-means) and generalized X-means (GX-means), which use one full covariance matrix per cluster and can therefore estimate the number of clusters efficiently without the severe over-fitting that X-means suffers from because of its spherical cluster assumption. The algorithms start with one cluster and iteratively try to split a cluster to maximize the BIC score. The former uses the K-means algorithm to find a set of optimal clusters for the current k, which makes it simple and fast, but it produces wrongly estimated centers when the clusters overlap. The latter uses the EM algorithm to estimate the parameters and produces more stable clusters even when the clusters overlap. Experiments with synthetic data show that the proposed methods provide a robust estimate of the number of clusters and the cluster parameters compared to other existing top-down algorithms.
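
The split-by-BIC idea can be sketched as a single split test: fit one full-covariance Gaussian to a set of points, split the points in two with 2-means, fit one full-covariance Gaussian per half, and keep the split only if the BIC score increases. This is a simplified sketch under stated assumptions, not the MX-means or GX-means algorithm: the helper names (`gaussian_loglik`, `bic`, `two_means`, `prefers_split`), the hard-assignment likelihood, the "higher is better" BIC convention, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def gaussian_loglik(pts):
    """Log-likelihood of pts under a full-covariance Gaussian fitted by ML."""
    n, d = pts.shape
    cov = np.cov(pts, rowvar=False, bias=True) + 1e-9 * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    # At the ML estimate the quadratic terms sum to n * d
    return -0.5 * n * (d * np.log(2 * np.pi) + logdet + d)

def bic(parts):
    """BIC score (higher is better) of a hard-assigned mixture, one Gaussian per part."""
    n = sum(len(p) for p in parts)
    d = parts[0].shape[1]
    k = len(parts)
    loglik = sum(gaussian_loglik(p) + len(p) * np.log(len(p) / n) for p in parts)
    n_params = k * (d + d * (d + 1) / 2) + (k - 1)  # means, covariances, weights
    return loglik - 0.5 * n_params * np.log(n)

def two_means(pts, iters=20):
    """Plain 2-means, initialised at the extremes of the first principal axis."""
    centred = pts - pts.mean(axis=0)
    _, vecs = np.linalg.eigh(np.cov(centred, rowvar=False))
    proj = centred @ vecs[:, -1]
    centres = np.stack([pts[proj.argmin()], pts[proj.argmax()]])
    for _ in range(iters):
        dist = ((pts[:, None, :] - centres[None]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        centres = np.stack([pts[labels == j].mean(axis=0) for j in range(2)])
    return labels

def prefers_split(pts):
    """True if splitting pts into two clusters raises the BIC score."""
    labels = two_means(pts)
    return bool(bic([pts[labels == 0], pts[labels == 1]]) > bic([pts]))

# Hypothetical data: one set with two separated blobs, one with a single blob
rng = np.random.default_rng(0)
two_blobs = np.vstack([rng.normal(0, 1, (150, 2)), rng.normal(8, 1, (150, 2))])
one_blob = rng.normal(0, 1, (300, 2))

split_two_blobs = prefers_split(two_blobs)
split_one_blob = prefers_split(one_blob)
```

A top-down procedure would apply a test of this kind repeatedly, starting from one cluster and stopping when no split improves the score.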