• Title/Abstract/Keywords: K-medoids


A New K-medoids Clustering Method and Performance Comparison (Performance Comparison of Some K-medoids Clustering Algorithms)

  • 박해상;이상호;전치혁
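    • Korean Operations Research and Management Science Society (한국경영과학회): Conference Proceedings
    • /
    • Korean Operations Research and Management Science Society 2006 Fall Conference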
    • /
    • pp.421-426
    • /
    • 2006
  • We propose a new algorithm for K-medoids clustering that runs like the K-means clustering algorithm, and we test several methods for selecting initial medoids. The proposed algorithm calculates the similarity matrix once and reuses it to find new medoids at every iteration. To evaluate the proposed algorithm, we use real and artificial data and compare its clustering results with those of other algorithms in terms of three performance measures. Experimental results show that the proposed algorithm requires less computation time than Partitioning Around Medoids (PAM) while achieving comparable performance.
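The K-means-like iteration described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the Euclidean distance, the caller-supplied initial medoids (`init`), the tie-breaking by lowest index, and all function names are assumptions made for the example.

```python
def pairwise_distances(points):
    """Euclidean distance matrix, computed once and reused in every iteration."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
            dist[i][j] = dist[j][i] = d
    return dist


def assign(dist, medoids):
    """Assign every object to the cluster of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(points, init, max_iter=100):
    """K-means-like K-medoids. `init` is a list of initial medoid indices
    (hypothetical parameter; any initial selection method could supply it)."""
    dist = pairwise_distances(points)
    medoids = list(init)
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if duplicate points left the cluster empty.
                new_medoids.append(old)
                continue
            # Update step: the member with the smallest total distance
            # to the other members becomes the new medoid.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)
```

Like K-means, this kind of iteration can stop in a local optimum, so the result depends on the initial medoids passed in.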


Automatic identification of Java Method Naming Patterns Using Cascade K-Medoids

  • Kim, Tae-young;Kim, Suntae;Kim, Jeong-Ah;Choi, Jae-Young;Lee, Jee-Huong;Cho, Youngwha;Nam, Young-Kwang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 12, No. 2
    • /
    • pp.873-891
    • /
    • 2018
  • This paper suggests an automatic approach to extracting Java method implementation patterns associated with method identifiers using Cascade K-Medoids. Java method implementation patterns indicate recurring implementations for achieving the purpose described in the method identifier with the given parameters and return type. If the implementation differs from that purpose, readers of the code tend to take more time to comprehend the method, which eventually increases software maintenance cost. To automatically identify implementation patterns and their representative sample code, we first propose three groups of feature vectors for characterizing the Java method signature, the method body, and their relation. Then, we apply Cascade K-Medoids, which enhances the K-Medoids algorithm with the Calinski and Harabasz algorithm. To evaluate our approach, we identified 16,768 implementation patterns of 7,169 method identifiers from 50 open source projects. The implementation patterns were validated by 30 industrial practitioners with 1 to 6 years of industrial experience, resulting in a precision of 86%.
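The abstract does not say how Cascade K-Medoids combines K-Medoids with the Calinski-Harabasz criterion. As a hedged illustration, the sketch below only computes the standard Calinski-Harabasz index for a given labeling, which is commonly used to compare clusterings with different numbers of clusters (higher is better). The function name and data layout are assumptions.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index: (between-cluster dispersion / (k - 1))
    divided by (within-cluster dispersion / (n - k)).
    Requires at least 2 clusters, n > k, and nonzero within-cluster dispersion."""
    n = len(points)
    dim = len(points[0])
    clusters = sorted(set(labels))
    k = len(clusters)
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0
    within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Between-cluster term: cluster size times squared centroid offset.
        between += len(members) * sum(
            (centroid[d] - overall[d]) ** 2 for d in range(dim))
        # Within-cluster term: squared distances of members to their centroid.
        within += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members)
    return (between / (k - 1)) / (within / (n - k))
```

One plausible use, stated as an assumption rather than the paper's procedure, is to run K-Medoids for several values of k and keep the k with the highest index.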

A K-means-like Algorithm for K-medoids Clustering

  • 이종석;박해상;전치혁
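    • Korean Operations Research and Management Science Society (한국경영과학회): Conference Proceedings
    • /
    • Korean Operations Research and Management Science Society 2005 Fall Conference and General Meeting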
    • /
    • pp.51-54
    • /
    • 2005
  • Clustering analysis is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes. In this paper, we propose a new algorithm for K-medoids clustering that runs like the K-means algorithm. The new algorithm calculates the distance matrix once and reuses it to find new medoids at every iteration. We evaluate the proposed method using real and synthetic data and compare its results with those of other algorithms. The proposed algorithm requires less computation time and performs better than the others.


Determining the Number and the Locations of RBF Centers Using Enhanced K-Medoids Clustering and Bi-Section Search Method

  • 이대원;이재욱
    • Journal of the Korean Institute of Industrial Engineers (대한산업공학회지)
    • /
    • Vol. 29, No. 2
    • /
    • pp.172-178
    • /
    • 2003
  • In recent research, a variety of ways of determining the locations of RBF centers have been proposed under the assumption that the number of RBF centers is known. However, these methods also have many numerical drawbacks. We propose a new method to overcome such drawbacks. The strength of our method is that it determines the locations and the number of RBF centers at the same time, without any assumption about the number of RBF centers. The proposed method consists of two phases. The first phase determines the number and the locations of RBF centers using a bi-section search method and enhanced k-medoids clustering, which overcomes drawbacks of the clustering algorithm. In the second phase, the network weights are computed and the design of the RBF network is completed. The new method is applied to several benchmark data sets. Benchmark results show that the proposed method is competitive with previously reported approaches for center selection.
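The abstract does not give the criterion that the bi-section search optimizes. Purely as an illustration of the search idea, the sketch below finds the smallest number of centers whose error meets a tolerance. The callback `error_for_k`, the tolerance, and the assumption that the error does not increase as k grows are all hypothetical.

```python
def smallest_k_by_bisection(error_for_k, k_min, k_max, tolerance):
    """Return the smallest k in [k_min, k_max] with error_for_k(k) <= tolerance.

    Assumes error_for_k is non-increasing in k (a hypothetical property).
    Returns None if even k_max does not meet the tolerance.
    """
    if error_for_k(k_max) > tolerance:
        return None
    lo, hi = k_min, k_max
    while lo < hi:
        mid = (lo + hi) // 2
        if error_for_k(mid) <= tolerance:
            hi = mid          # mid is good enough; try fewer centers
        else:
            lo = mid + 1      # mid is too coarse; more centers are needed
    return lo
```

In a setting like the one described, `error_for_k(k)` might cluster the data into k groups and return the resulting network error, so each probe costs one clustering run and the search needs only about log2(k_max - k_min) probes.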

Medoid Determination in Deterministic Annealing-based Pairwise Clustering

  • Lee, Kyung-Mi;Lee, Keon-Myung
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 11, No. 3
    • /
    • pp.178-183
    • /
    • 2011
  • The deterministic annealing-based clustering algorithm is an EM-based algorithm that behaves like the simulated annealing method, yet is less sensitive to the initialization of parameters. Pairwise clustering is a clustering technique that uses inter-entity distance information without requiring detailed attribute information. The pairwise deterministic annealing-based clustering algorithm repeatedly alternates between estimating the mean fields and updating the membership degrees of data objects to clusters until a termination condition holds. Lacking attribute value information, pairwise clustering algorithms do not explicitly determine the centroids or medoids of clusters, either during the clustering process or at its end. This paper proposes a method to identify the medoids as the centers of the formed clusters for the pairwise deterministic annealing-based clustering algorithm. Experimental results show that the proposed method locates meaningful medoids.
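The abstract does not describe how the medoids are identified. One generic way to pick a medoid when only pairwise distances and membership degrees are available is to choose, for each cluster, the object with the smallest membership-weighted total distance to all objects. The sketch below shows that heuristic only; it is an assumption for illustration, not the paper's method, and the names are hypothetical.

```python
def medoids_from_memberships(dist, memberships):
    """Pick one medoid index per cluster.

    dist[i][j]        : pairwise distance between objects i and j
    memberships[j][c] : membership degree of object j in cluster c
    """
    n = len(dist)
    k = len(memberships[0])
    medoids = []
    for c in range(k):
        # Weighted total distance from candidate i to every object j,
        # where objects count in proportion to their membership in cluster c.
        best = min(
            range(n),
            key=lambda i: sum(memberships[j][c] * dist[i][j] for j in range(n)),
        )
        medoids.append(best)
    return medoids
```

With hard (0/1) memberships this reduces to the usual medoid definition: the cluster member with the smallest total distance to the other members.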

Performance Evaluation of k-means and k-medoids in WSN Routing Protocols

  • SeaYoung, Park;Dai Yeol, Yun;Chi-Gon, Hwang;Daesung, Lee
    • Journal of information and communication convergence engineering
    • /
    • Vol. 20, No. 4
    • /
    • pp.259-264
    • /
    • 2022
  • In wireless sensor networks, sensor nodes are often deployed in large numbers in places that are difficult for humans to access. However, the energy of each sensor node is limited. Therefore, one of the most important considerations when designing routing protocols for wireless sensor networks is minimizing the energy consumption of each sensor node. When the energy of a wireless sensor node is exhausted, the node can no longer be used. Various protocols are being designed to minimize energy consumption and prolong network lifetime. We proposed KOCED, an optimal cluster K-means algorithm that considers the distances between cluster centers, nodes, and residual energies. This study evaluates the performance of the KOCED protocol in terms of energy efficiency and validation. Its purpose is to present performance evaluation factors by comparing the K-means algorithm and the K-medoids algorithm, one of the recently introduced machine learning techniques, with the KOCED protocol.

A Personalization System Using Ontology and the k-medoids Method in an IPTV Environment (Personalized Recommendation System for IPTV using Ontology and K-medoids)

  • 윤병대;김종우;조용석;강상길
    • Journal of Intelligence and Information Systems (지능정보연구)
    • /
    • Vol. 16, No. 3
    • /
    • pp.147-161
    • /
    • 2010
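  • With the recent convergence of broadcasting and telecommunications, communication technology has been integrated into TV, bringing many changes to the way people watch TV. This change broadens the choice of services, but viewers must spend considerable time selecting a program. To address this drawback, in an IPTV broadcasting environment that offers users diverse content, this paper builds a customer usage-information ontology based on customers' viewing information and clusters the customers accordingly using the k-medoids method. Based on this, we propose a method for recommending the content that customers prefer. In the experiments, we demonstrate the superiority of the proposed method by comparing it with existing methods.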

A Computational Intelligence Based Online Data Imputation Method: An Application For Banking

  • Nishanth, Kancherla Jonah;Ravi, Vadlamani
    • Journal of Information Processing Systems
    • /
    • Vol. 9, No. 4
    • /
    • pp.633-650
    • /
    • 2013
  • All the data imputation techniques proposed so far in the literature are offline techniques, as they require a number of iterations to learn the characteristics of the data during training and also consume a lot of computational time. Hence, these techniques are not suitable for applications that require imputation to be performed on demand and in near real time. The paper proposes a computational intelligence based architecture for online data imputation, as well as extended versions of an existing offline data imputation method. The proposed online imputation technique has 2 stages. In Stage 1, the Evolving Clustering Method (ECM) is used to replace the missing values with cluster centers, as part of the local learning strategy. Stage 2 refines the resultant approximate values using a General Regression Neural Network (GRNN), as part of the global approximation strategy. The extended offline imputation techniques employ K-Means or K-Medoids in Stage 1 and a Multi Layer Perceptron (MLP) or GRNN in Stage 2. Several experiments were conducted on 8 benchmark datasets and 4 bank-related datasets to assess the effectiveness of the proposed online and offline imputation techniques. In terms of Mean Absolute Percentage Error (MAPE), the results indicate that the difference between the best proposed offline imputation method, viz., K-Medoids+GRNN, and the proposed online imputation method, viz., ECM+GRNN, is statistically insignificant at a 1% level of significance. Consequently, the proposed online technique, being less expensive and faster, can be employed for imputation instead of the existing and proposed offline imputation techniques. This is the significant outcome of the study. Furthermore, GRNN in Stage 2 uniformly reduced MAPE values in both offline and online imputation methods on all datasets.
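The abstract compares methods by Mean Absolute Percentage Error (MAPE) but does not state the formula used. The sketch below uses the common definition, the mean of |actual - predicted| / |actual| expressed in percent, which is assumed here rather than taken from the paper.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent (common definition).

    Every actual value must be nonzero, since each error is divided by it.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For imputation, `actual` would hold the true values that were deliberately hidden and `predicted` the imputed values for the same cells.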

A Machine Learning based Methodology for Selecting Optimal Location of Hydrogen Refueling Stations

  • 김수환;류준형
    • Korean Chemical Engineering Research
    • /
    • Vol. 58, No. 4
    • /
    • pp.573-580
    • /
    • 2020
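  • Interest in hydrogen as a transportation energy source to replace petroleum has been growing recently. To maximize the advantages of hydrogen, hydrogen refueling stations must be widely deployed. This paper proposes a methodology for selecting optimal locations so that hydrogen refueling stations are closer to their users. The locations of gas stations and natural gas stations, which are existing energy supply points, are used as a first reference, and data such as population and the number of registered vehicles are additionally reflected to calculate the expected refueling demand of hydrogen vehicles. Using k-medoids clustering, a machine learning technique, the optimal hydrogen refueling station locations corresponding to the expected demand are calculated. The superiority of the proposed method is illustrated numerically through a case study of Seoul. Data-driven methods such as this methodology may help build an environmentally friendly economic system by accelerating the adoption of hydrogen.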
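The abstract does not specify how demand enters the k-medoids computation. As a hedged sketch, the code below treats the problem as a demand-weighted k-medoids objective: choose k candidate sites that minimize the sum of demand times distance to the nearest chosen site. The weighting scheme, the exhaustive search, and all names are assumptions, and the exhaustive search is only practical for small instances.

```python
from itertools import combinations


def weighted_cost(dist, demand, sites):
    """Total demand-weighted distance from each location to its nearest site."""
    return sum(w * min(dist[i][s] for s in sites) for i, w in enumerate(demand))


def best_sites(dist, demand, k):
    """Exhaustively search all k-subsets of candidate locations.

    dist[i][j] : distance between candidate locations i and j
    demand[i]  : expected refueling demand at location i (hypothetical input)
    """
    return min(
        combinations(range(len(dist)), k),
        key=lambda sites: weighted_cost(dist, demand, sites),
    )
```

Because the chosen sites are always existing candidate locations (medoids) rather than averaged coordinates, the result maps directly onto places such as existing gas stations.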

Comparison between k-means and k-medoids Algorithms for a Group-Feature based Sliding Window Clustering

  • 양주연;심준호
    • The Journal of Society for e-Business Studies (한국전자거래학회지)
    • /
    • Vol. 23, No. 3
    • /
    • pp.225-237
    • /
    • 2018
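  • As the generation and processing of large-scale data become commonplace, the demand for large-scale data stream processing is increasing rapidly. In response to this demand, various large-scale data processing technologies are being developed. One approach attracting attention is data stream clustering using a sliding window, which generates new clusters each time the window moves. Existing sliding-window clustering techniques implement data stream clustering based on coresets. In this study, we changed the clustering algorithm used inside an algorithm that employs coreset-based group features. We then conducted experiments comparing the performance of the proposed algorithm and the existing algorithm as the parameter values vary. We discuss the improvements, compare the two algorithms, and suggest to experimenters how to choose between them depending on the parameters.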