• Title/Summary/Keyword: Clustering analysis


Hierarchical Clustering of Symbolic Objects based on Asymmetric Proximity (비대칭적 유사도 기반의 심볼릭 객체의 계층적 클러스터링)

  • Oh, Seung-Joon;Park, Chan-Woong
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.729-734 / 2012
  • Clustering analysis has been widely used in numerous applications such as pattern recognition, data analysis, intrusion detection, image processing, and bioinformatics. Much previous work has been based on numeric data only. However, symbolic data analysis has emerged to deal with variables that can take intervals, histograms, and even functions as values. In this paper, we propose an asymmetric-proximity-based clustering approach for symbolic objects. A method for clustering symbolic patterns based on the average similarity value (ASV) is explored. The results of the proposed clustering method differ from those of existing methods and are very encouraging.

Improvement on Fuzzy C-Means Using Principal Component Analysis

  • Choi, Hang-Suk;Cha, Kyung-Joon
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.301-309 / 2006
  • In this paper, we present an improved fuzzy c-means clustering method. To improve it, we apply a second clustering step, based on principal component analysis, to objects located in the common region of more than two clusters. In addition, we use the degree of membership (probability) provided by fuzzy c-means, which is one of its advantages. Simulation results show some improvement in accuracy for data with probability 0.7, both outside and inside the overlapped area.

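The degree of membership mentioned in the abstract above can be illustrated with the standard fuzzy c-means membership formula. This is a minimal sketch of the textbook formula only, not the authors' PCA-based refinement; the function name `fcm_memberships`, the one-dimensional data, the cluster centers, and the fuzzifier `m = 2` are all assumptions made for illustration.

```python
def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of a 1-D point in each cluster.

    m > 1 is the fuzzifier; m = 2 is a common default (an assumption here).
    """
    distances = [abs(point - c) for c in centers]
    # A point that coincides with a center belongs fully to that center.
    if 0.0 in distances:
        hits = distances.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


centers = [0.0, 10.0]  # hypothetical cluster centers
print(fcm_memberships(5.0, centers))  # point midway between the two centers
print(fcm_memberships(4.0, centers))  # point closer to the first center
```

The memberships of a point always sum to 1, so a point in an overlapped region receives intermediate values rather than a hard assignment.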

Finding the Number of Clusters and Various Experiments Based on ASA Clustering Method (ASA 군집화를 이용한 군집수 결정 및 다양한 실험)

  • Yoon Bok-Sik
    • Journal of the Korean Operations Research and Management Science Society / v.31 no.2 / pp.87-98 / 2006
  • In many cases of cluster analysis, we are forced to perform clustering without any prior knowledge of the number of clusters. However, some clustering methods, such as the k-means algorithm, require the number of clusters to be provided beforehand. In this study, we focus on the problem of determining the number of clusters in the given data. We follow the two-stage approach of the ASA clustering algorithm and mainly try to improve the performance of the first stage of the algorithm. We verify the usefulness of the method by applying it to various kinds of simulated data. We also apply the method to clustering two kinds of real-life qualitative data.

Clustering of Decision Making Units using DEA (DEA를 이용한 의사결정단위의 클러스터링)

  • Kim, Kyeongtaek
    • Journal of Korean Society of Industrial and Systems Engineering / v.37 no.4 / pp.239-244 / 2014
  • Conventional clustering approaches are mostly based on minimizing the total dissimilarity of inputs and outputs. However, such an approach may not be helpful when clustering decision making units (DMUs) whose production feature converts multiple inputs into multiple outputs, because it does not take the conversion functions into account. Data envelopment analysis (DEA) has been widely applied to estimate the efficiency of such DMUs because of its non-parametric characteristics. We propose a new clustering method to identify groups of DMUs that are similar in terms of their input-output profiles. A real-world example is given to explain the use and effectiveness of the proposed method. We calculate the similarity between its result and the result of a conventional clustering method applied to the same example. After adding the efficiency value to the input of the K-means algorithm, we calculate a new similarity value and compare it with the previous one.

Comparison of time series clustering methods and application to power consumption pattern clustering

  • Kim, Jaehwi;Kim, Jaehee
    • Communications for Statistical Applications and Methods / v.27 no.6 / pp.589-602 / 2020
  • The development of smart grids has enabled the easy collection of a large amount of power data. There are some common patterns that make it useful to cluster power consumption patterns when analyzing big power data. In this paper, clustering analysis is based on distance functions for time series and clustering algorithms to discover patterns in power consumption data. In clustering, we use 10 distance measures to find clusters that reflect the characteristics of time series data. A simulation study is done to compare the distance measures for clustering. Cluster validity measures such as error rate, similarity index, Dunn index, and silhouette values are also calculated and compared. Real power consumption data are then clustered using the five distance measures that performed better than the others in the simulation.
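Why the choice of distance measure matters for time series clustering can be sketched with two well-known measures. The abstract above does not name its 10 measures, so the Euclidean distance and dynamic time warping (DTW) used here are assumptions chosen for illustration, as are the function names and the toy load profiles.

```python
import math


def euclidean(a, b):
    """Point-by-point Euclidean distance; requires equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical load profiles: the same peak shifted by one time step,
# and a flat profile with no peak at all.
peak = [0, 0, 5, 0, 0, 0]
shifted_peak = [0, 0, 0, 5, 0, 0]
flat_load = [1, 1, 1, 1, 1, 1]

for name, dist in (("euclidean", euclidean), ("dtw", dtw)):
    print(name, dist(peak, shifted_peak), dist(peak, flat_load))
```

In this toy case the Euclidean distance places the shifted peak farther from the original than the flat profile, while DTW treats the two peaked profiles as identical in shape, so the two measures would group the profiles differently.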

Clustering Analysis with Spring Discharge Data and Evaluation of Groundwater System in Jeju Island (용천수 유출량 클러스터링 해석을 이용한 제주도 지하수 순환 해석)

  • Kim Tae-Hui;Mun Deok-Cheol;Park Won-Bae;Park Gi-Hwa;Go Gi-Won
    • Proceedings of the Korean Society of Soil and Groundwater Environment Conference / 2005.04a / pp.296-299 / 2005
  • Time series of spring discharge data on Jeju Island can provide abundant information on the spatial groundwater system. In this study, springs were classified by clustering analysis of the time series of two variables: discharge rate and EC. Peak discharges are mainly observed in August or September. However, double peaks and late peaks of discharge are also observed at many springs. Based on the results of the clustering analysis, it can be deduced that the GH model is not appropriate as the conceptual model of the groundwater system on Jeju Island. EC distributions in the dry season also support this conclusion.


The Experimental Study on the Relationship between Hierarchical Agglomerative Clustering and Compound Nouns Indexing (계층적 결합형 문서 클러스터링 시스템과 복합명사 색인방법과의 연관관계 연구)

  • Cho Hyun-Yang;Choi Sung-Pil
    • Journal of the Korean Society for Library and Information Science / v.38 no.4 / pp.179-192 / 2004
  • In this paper, we show that the result of document clustering can change dramatically depending on how compound nouns are indexed. First, the automatic indexing engine specialized for Korean word analysis, which also serves as the backbone engine of the automatic document clustering system, is introduced. Then, the details of the hierarchical agglomerative clustering (HAC) method, one of the most widely used clustering methodologies today, are described. The experiments carried out in the final part of this paper lead to the conclusion that the various modes of indexing compound nouns affect the outcome of HAC.

A novel clustering method for examining and analyzing the intellectual structure of a scholarly field (지적 구조 분석을 위한 새로운 클러스터링 기법에 관한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.23 no.4 s.62 / pp.215-231 / 2006
  • Recently, many bibliometric studies have attempted to use Pathfinder networks (PFNets) for examining and analyzing the intellectual structure of a scholarly field. Pathfinder network scaling has many advantages over traditional multidimensional scaling, including its ability to represent local details as well as global intellectual structure. However, PFNets have some limitations, including very high time complexity. Moreover, Pathfinder network scaling cannot be combined with cluster analysis, which has been combined well with traditional multidimensional scaling. In this paper, a new method named Parallel Nearest Neighbor Clustering (PNNC) is proposed to compensate for these weak points of PFNets. A comparison of its clustering performance with that of traditional hierarchical agglomerative clustering methods shows that PNNC is not only a complement to PFNets but also a fast and powerful clustering method for organizing information.

The effect of social capital on firm performance within industrial clusters: Mediating role of organizational learning of clustering SMEs (산업클러스터 내 사회적 자본이 기업성과에 미치는 영향: 조직학습의 역할을 중심으로)

  • Kim, Shin-Woo;Seo, Ribin;Yoon, Heon-Deok
    • Knowledge Management Research / v.17 no.3 / pp.65-91 / 2016
  • Although the success of industrial clusters largely depends on whether clustering firms can achieve economic performance, little attention has been paid to the factors and conditions that enhance the performance of clustering small and medium-sized enterprises (SMEs). In this vein, we adopt the theories of social capital and organizational learning as success factors for clustering SMEs. This study thus aims to examine what effect the social capital accrued in the relationships among actors within clusters has on the firm performance of clustering SMEs, and what role organizational learning plays in the linkage between social capital and firm performance. For the empirical analysis, we operationalized the variables and their measures to develop questionnaires through a theoretical review of the literature. Data collected from a sample of 227 clustering SMEs were analyzed by hierarchical regression analysis. The results confirmed that a high level of social capital, represented by network, trust, and norm, has a positive effect on the firm performance of clustering SMEs. We also found that clustering firms with high organizational learning, represented by absorptive and transformative capability, achieve better performance than those placing less value on organizational learning. Furthermore, the significant relationship between social capital and firm performance is partially mediated by organizational learning. These findings imply not only that the territorial agglomeration of an industrial cluster does not guarantee performance creation for clustering SMEs, but also that these firms need to develop social capital among the various actors within clusters to facilitate knowledge diffusion. In order to absorb and mobilize the shared knowledge and information into strategic resources, the firms should improve their capability associated with organizational learning. These findings expand our understanding of the importance of social capital and organizational learning for enhancing the performance of clustering firms. Unlike major studies that address the benefits and advantages of industrial clusters, this study is based on the perspective of firm-internal business processes and thereby contributes to the advancement of the literature. Strategic and policy implications of this study are discussed in detail.

Sparse Document Data Clustering Using Factor Score and Self Organizing Maps (인자점수와 자기조직화지도를 이용한 희소한 문서데이터의 군집화)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.2 / pp.205-211 / 2012
  • Retrieved documents have to be transformed into a proper data structure for the clustering algorithms of statistics and machine learning. A popular data structure for document clustering is the document-term matrix, which holds the occurrence frequency of each term in each document. This matrix has a sparsity problem because most of its frequencies are 0, and this sparseness decreases clustering performance. This research therefore uses factor scores obtained by factor analysis to solve the sparsity problem in document clustering. The document-term matrix is transformed into a document-factor score matrix using the factor scores, and the document-factor score matrix is used as input data for document clustering. To compare the clustering performance of the document-term matrix and the document-factor score matrix, this research applies both types of matrix to self organizing map (SOM) clustering.
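The sparsity problem described in the abstract above can be seen even on a tiny corpus. The following is a minimal sketch that builds a document-term matrix of raw term counts and measures its fraction of zero entries; it does not attempt the paper's factor-score transformation or SOM clustering. The helper names `document_term_matrix` and `sparsity`, the whitespace tokenization, and the three sample documents are assumptions made for illustration.

```python
from collections import Counter


def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


def sparsity(matrix):
    """Fraction of zero entries in the matrix."""
    cells = [value for row in matrix for value in row]
    return cells.count(0) / len(cells)


# Hypothetical documents, used only to illustrate the data structure.
docs = [
    "clustering of patent documents",
    "self organizing map clustering",
    "factor analysis of survey scores",
]
vocabulary, dtm = document_term_matrix(docs)
print(vocabulary)
for row in dtm:
    print(row)
print(f"sparsity: {sparsity(dtm):.2f}")
```

Each document uses only a few of the corpus terms, so most cells are zero, and the proportion of zeros typically grows as more documents and terms are added.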