• Title/Abstract/Keywords: Pattern clustering

543 search results (processing time: 0.024 s)

Twostep Clustering of Environmental Indicator Survey Data

  • Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 1 / pp. 1-11 / 2006
  • Data mining techniques are used to discover hidden knowledge, unexpected patterns, relations, and new rules in massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. It has been widely used in many applications, such as pattern analysis or recognition, data analysis, image processing, and off-line or on-line market research. We analyze the 2001 Gyeongnam social indicator survey data using the twostep clustering technique to obtain environmental information. Twostep clustering is classified as a partitional clustering method. The twostep clustering outputs can be applied to environmental preservation and improvement.

Comparative Analysis of Learning Methods of Fuzzy Clustering-based Neural Network Pattern Classifier

  • 김은후;오성권;김현기
    • 전기학회논문지 / Vol. 65, No. 9 / pp. 1541-1550 / 2016
  • In this paper, we introduce a novel learning methodology for a fuzzy clustering-based neural network pattern classifier. The classifier describes the patterns of the given classes using fuzzy rules and uses those rules to categorize patterns in unseen data. The least squares estimator (LSE) or weighted least squares estimator (WLSE) is typically used to estimate the coefficients of the polynomial function, but this study proposes a novel coefficient estimation method that combines the advantages of the existing methods. The premise part of each fuzzy rule describes the input space as the "If" clause through fuzzy c-means (FCM) clustering, while the consequent part describes the output space through a polynomial function, such as a linear or quadratic one, whose coefficients are estimated by the proposed local least squares estimator (LLSE)-based learning. To evaluate the performance of the proposed pattern classifier, a variety of machine learning data sets are used in experiments, and the comparative analysis shows that the proposed LLSE-based learning method is preferable to the learning methods conventionally used in the previous literature.

A Novel Image Segmentation Method Based on Improved Intuitionistic Fuzzy C-Means Clustering Algorithm

  • Kong, Jun;Hou, Jian;Jiang, Min;Sun, Jinhua
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 6 / pp. 3121-3143 / 2019
  • Segmentation plays an important role in image processing and computer vision. The intuitionistic fuzzy C-means (IFCM) clustering algorithm has emerged in recent years as an effective technique for image segmentation. However, the standard fuzzy C-means (FCM) and IFCM algorithms are sensitive to noise and to the initial cluster centers, and they ignore the spatial relationship between pixels. In view of these shortcomings, this paper proposes an improved algorithm based on IFCM. First, we propose a modified non-membership function to generate the intuitionistic fuzzy set and a method of determining the initial cluster centers based on grayscale features; these highlight the effect of uncertainty in the intuitionistic fuzzy set and improve robustness to noise. Second, an improved nonlinear kernel function is proposed to map the data into kernel space so that the distance between the data and the cluster centers is measured more accurately. Third, a local spatial-gray information measure is introduced, which considers membership degree, gray features, and spatial position information at the same time. Finally, we propose a new measure of intuitionistic fuzzy entropy that takes into account both the fuzziness and the intuition of the intuitionistic fuzzy set. The experimental results show that, compared with other IFCM-based algorithms, the proposed algorithm achieves better segmentation and clustering performance.

A Design of Fuzzy Classifier with Hierarchical Structure

  • 안태천;노석범;김용수
    • 한국지능시스템학회논문지 / Vol. 24, No. 4 / pp. 355-359 / 2014
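  • This paper proposes a fuzzy pattern classifier that combines fuzzy models with a simple consequent structure into a hierarchical structure. The basic building block of the hierarchical fuzzy pattern classifier is a fuzzy model with a simple consequent structure, chosen so that the structural complexity of the overall pattern classifier does not increase. To partition the input space hierarchically, we use Fuzzy C-Means clustering, a representative fuzzy clustering algorithm. To analyze the substructure of the partitioned fuzzy input space, we use conditional Fuzzy C-Means clustering. The hierarchical pattern classifier is designed by applying fuzzy pattern classifiers with a simple structure to the hierarchically partitioned fuzzy input space. Combining the fuzzy models hierarchically makes it possible to analyze the information in the input space starting from a macroscopic view and proceeding to finer detail. Various machine learning data sets were used to evaluate the performance of the proposed fuzzy pattern classifier.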
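
For readers unfamiliar with the Fuzzy C-Means step named in this abstract, the following is a minimal sketch of the standard (unconditional) algorithm in plain Python. It is not the authors' hierarchical or conditional variant; the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random choice of initial centers, and the stopping tolerance are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-9, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, u), where u[i][j] is the membership of point i
    in cluster j. Parameter names and defaults are illustrative.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Assumption: start from c distinct data points chosen at random.
    centers = rng.sample(points, c)
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2 / (m - 1)).
        u = []
        for p in points:
            dist = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            u.append([
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                          for k in range(c))
                for j in range(c)
            ])
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[d] for wi, p in zip(w, points)) / total
                for d in range(dim)
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships = fuzzy_c_means(data, c=2)
```

Each row of `memberships` sums to 1, so a point near a cluster boundary is shared between clusters rather than assigned to exactly one.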

Implementation of the Squared-Error Pattern Clustering Processor Using the Residue Number System

  • 김형민;조원경
    • 대한전자공학회논문지 / Vol. 26, No. 2 / pp. 87-93 / 1989
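  • The squared-error pattern clustering algorithm used in pattern recognition and image processing applications requires considerable processing time for operations on the feature-vector matrix. This paper therefore proposes a high-speed squared-error pattern clustering processor based on the residue number system, which lends itself to parallel processing and pipelining. In image segmentation experiments, the proposed processor shows a satisfactory error for the number of clusters needed to divide an image into meaningful regions, and it is about 200 times faster than the 80287 numeric coprocessor. The results confirm that it can be used effectively in applications that must process large volumes of data in real time.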
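
The parallelism mentioned in this abstract comes from the fact that, in a residue number system, additions and multiplications are carried out independently per modulus. The sketch below illustrates this for a squared Euclidean distance between integer feature vectors. The moduli set, the function names, and the software reconstruction via the Chinese remainder theorem are assumptions for illustration; the abstract does not state which moduli or hardware structure the processor uses.

```python
from math import prod

# Assumed moduli, chosen only because they are pairwise coprime.
MODULI = (251, 253, 255, 256)


def to_rns(x):
    """Encode an integer as one residue per modulus."""
    return tuple(x % m for m in MODULI)


def rns_sq_dist(a, b):
    """Squared Euclidean distance of two integer vectors, in residue form.

    Each channel only uses arithmetic modulo its own modulus, so the
    channels could run in parallel with no carries between them.
    """
    ra = [to_rns(x) for x in a]
    rb = [to_rns(y) for y in b]
    result = []
    for ch, m in enumerate(MODULI):
        acc = 0
        for xa, xb in zip(ra, rb):
            d = (xa[ch] - xb[ch]) % m
            acc = (acc + d * d) % m
        result.append(acc)
    return tuple(result)


def from_rns(residues):
    """Decode residues with the Chinese remainder theorem.

    The decoded value is correct only if the true result is smaller
    than the product of the moduli.
    """
    big_m = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        mi = big_m // m
        total += r * mi * pow(mi, -1, m)  # pow(..., -1, m): modular inverse
    return total % big_m


a = (12, 200, 7)
b = (250, 3, 90)
distance = from_rns(rns_sq_dist(a, b))
```

Here `distance` equals the ordinary integer result `sum((x - y) ** 2 for x, y in zip(a, b))`, because that value is well below the product of the assumed moduli.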

A Study on Partial Pattern Estimation for Sequential Agglomerative Hierarchical Nested Model

  • 장경원;안태천
    • 대한전기학회:학술대회논문집 / 대한전기학회 2005 Conference Proceedings, Information and Control Section / pp. 143-145 / 2005
  • In this paper, we present an empirical study of a pattern estimation method intended to reveal underlying data patterns at a relatively reduced computational cost. The presented method performs crisp clustering of n given data samples by means of the sequential agglomerative hierarchical nested (SAHN) model. Conventional SAHN-based clustering requires a large computation time in the initial step of the algorithm. To address this, we modify the overall process with a partial approach: we first divide the given data set into several subgroups by uniform sampling, and then apply the SAHN-based method to each subgroup. The advantage of this method is that it reduces the computation time of the original process while giving similar results. The proposed method is applied to several test data sets, and simulation results with a conceptual analysis are presented.
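
The partial approach described in this abstract can be sketched as follows, under several assumptions: single linkage is used as the merge criterion, "uniform sampling" is read as a random shuffle followed by an even split, and the function names are hypothetical. The abstract does not say how the subgroup results are combined, so the sketch stops at the per-subgroup clusterings.

```python
import math
import random


def sahn(points, k):
    """Naive single-linkage agglomerative clustering down to k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest single-link distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(p, q)
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def partial_sahn(points, groups, k, seed=0):
    """Split the data into `groups` subgroups and cluster each separately."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    # Every groups-th element goes to the same subgroup (even split).
    subgroups = [shuffled[g::groups] for g in range(groups)]
    return [sahn(sub, k) for sub in subgroups]


blob_a = [(x, y) for x in range(3) for y in range(2)]
blob_b = [(x + 50, y + 50) for x in range(3) for y in range(2)]
result = partial_sahn(blob_a + blob_b, groups=2, k=2)
```

Because the pairwise search in the first merge steps dominates the cost of this naive implementation, running it on smaller subgroups reduces the total work, which is consistent with the motivation given in the abstract.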

A Comparison of Clustering Algorithm in Data Mining

  • Lee, Yung-Seop;An, Mi-Young
    • Journal of the Korean Data and Information Science Society / Vol. 14, No. 4 / pp. 725-736 / 2003
  • To provide the information needed to make a decision, it is important to know the relationships or patterns among the variables in a database. Grouping objects that have similar characteristics or patterns is called cluster analysis, one of the data mining techniques. This study compares several partitioning clustering algorithms based on the statistical distance or total variance within each cluster.

A Pattern Consistency Index for Detecting Heterogeneous Time Series in Clustering Time Course Gene Expression Data

  • 손영숙;백장선
    • 응용통계연구 / Vol. 18, No. 2 / pp. 371-379 / 2005
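  • This paper proposes a pattern consistency index for cluster analysis of time course gene expression data based on the Pearson correlation coefficient. The index detects time series whose heterogeneous patterns deviate from the representative pattern of their cluster. Its usefulness is examined by applying it to serum time course gene expression data obtained from microarray experiments.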
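
The abstract does not define the pattern consistency index itself, so the sketch below is not that index. It only illustrates the general idea of scoring each series in a cluster by its Pearson correlation with a representative pattern, here assumed to be the pointwise cluster mean, and flagging series below an assumed threshold. The function names and the threshold of 0.5 are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_heterogeneous(cluster, threshold=0.5):
    """Return (indices of low-correlation series, all correlation scores).

    Assumption: the representative pattern is the pointwise mean of the
    cluster; the paper may define it differently.
    """
    length = len(cluster[0])
    mean_pattern = [sum(s[t] for s in cluster) / len(cluster)
                    for t in range(length)]
    scores = [pearson(s, mean_pattern) for s in cluster]
    flagged = [i for i, r in enumerate(scores) if r < threshold]
    return flagged, scores


cluster = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 2.1, 2.9, 4, 5.2],
    [5, 4, 3, 2, 1],  # runs against the trend of the other series
]
flagged, scores = flag_heterogeneous(cluster)
```

In this toy cluster only the last series is flagged, since it decreases while the mean pattern increases.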

Inverted Index based Modified Version of K-Means Algorithm for Text Clustering

  • Jo, Tae-Ho
    • Journal of Information Processing Systems / Vol. 4, No. 2 / pp. 67-76 / 2008
  • This research proposes a new strategy for text clustering in which documents are encoded into string vectors and the k-means algorithm is modified to be adaptable to string vectors. Traditionally, when the k-means algorithm is used for pattern classification, raw data must be encoded into numerical vectors. This encoding may be difficult, depending on the application area of pattern classification. For example, in text clustering, encoding full texts given as raw data into numerical vectors leads to two main problems: huge dimensionality and sparse distribution. In this research, we encode full texts into string vectors and modify the k-means algorithm to be adaptable to string vectors for text clustering.

Gene Expression Pattern Analysis via Latent Variable Models Coupled with Topographic Clustering

  • Chang, Jeong-Ho;Chi, Sung Wook;Zhang, Byoung Tak
    • Genomics & Informatics / Vol. 1, No. 1 / pp. 32-39 / 2003
  • We present a latent variable model-based approach to the analysis of gene expression patterns, coupled with topographic clustering. The aspect model, a latent variable model for dyadic data, is applied to extract the latent patterns underlying complex variations in gene expression levels. Topographic clustering is then performed to find coherent groups of genes, based on the extracted latent patterns as well as on individual gene expression behaviors. Applied to cell cycle-regulated genes of the yeast Saccharomyces cerevisiae, the proposed method discovered biologically meaningful patterns related to characteristic expression behavior in particular cell cycle phases. In addition, displaying the variation in the composition of these latent patterns on the cluster map made the resulting cluster structure easier to interpret. From this, we argue that latent variable models, coupled with topographic clustering, are a promising tool for exploratory analysis of gene expression data.