• Title/Summary/Keyword: Data Clustering

Clustering fMRI Time Series using Self-Organizing Map (자기 조직 신경망을 이용한 기능적 뇌영상 시계열의 군집화)

  • 임종윤;장병탁;이경민
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2001.12a / pp.251-254 / 2001
  • In this paper, we analyze fMRI data using a Self-Organizing Map (SOM). fMRI (functional Magnetic Resonance Imaging) is a non-invasive method for studying the human brain that has recently attracted considerable attention. We applied the SOM to image data acquired during performance of a motor task; in the resulting clustering, the motor cortex region emerged as a distinct cluster.

On the Categorical Variable Clustering

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.7 no.2 / pp.219-226 / 1996
  • The basic objective of cluster analysis is to discover natural groupings of items or variables. In general, variable clustering has been based on similarity measures between variables with binary characteristics. We propose a variable clustering method for variables that have more than two categories, ordered in some sense. We also consider several measures of association as similarities between variables. A numerical example is included.

The Design of GA-based TSK Fuzzy Classifier and Its application (GA기반 TSK 퍼지 분류기의 설계 및 응용)

  • 곽근창;김승석;유정웅;전명근
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2001.12a / pp.233-236 / 2001
  • In this paper, we propose a TSK-type fuzzy classifier using PCA (Principal Component Analysis), FCM (Fuzzy C-Means) clustering, and a hybrid GA (genetic algorithm). First, the input data are transformed by PCA to reduce correlation among the data components. FCM clustering is then applied to obtain an initial TSK-type fuzzy classifier. Parameter identification is performed by AGA (Adaptive Genetic Algorithm) and RLSE (Recursive Least Square Estimate). We applied the proposed method to the Iris data classification problem and obtained better performance than previous works.

K-means clustering using a center of gravity for grid-based sample (그리드 기반 표본의 무게중심을 이용한 케이-평균군집화)

  • Lee, Sun-Myung;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.21 no.1 / pp.121-128 / 2010
  • K-means clustering is an iterative algorithm in which items are moved among sets of clusters until the desired set is reached. K-means clustering has been widely used in many applications, such as market research, pattern analysis and recognition, and image processing. It can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm can require many hours to produce the desired k clusters, because it is a primitive, exploratory method. In this paper we propose a new method of k-means clustering using a center of gravity for grid-based samples. It is faster than traditional clustering methods while maintaining accuracy.
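
The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading: points are binned into grid cells, each cell is replaced by its center of gravity weighted by the number of points it holds, and k-means is then run on those representatives instead of the full data set. The function names `grid_centroids` and `weighted_kmeans`, the `cell_size` parameter, and the caller-supplied initial centers are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import defaultdict


def grid_centroids(points, cell_size):
    """Bin points into square grid cells; return (center of gravity, count) per cell."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(x / cell_size) for x in p)
        cells[key].append(p)
    reps = []
    for members in cells.values():
        n = len(members)
        centroid = tuple(sum(coord) / n for coord in zip(*members))
        reps.append((centroid, n))
    return reps


def weighted_kmeans(reps, init_centers, iters=20):
    """Lloyd-style k-means on weighted representatives (hypothetical helper)."""
    centers = [tuple(c) for c in init_centers]
    dim = len(centers[0])
    for _ in range(iters):
        sums = [[0.0] * dim for _ in centers]
        weights = [0] * len(centers)
        for point, w in reps:
            # Assign each representative to its nearest current center.
            j = min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))
            weights[j] += w
            for d, x in enumerate(point):
                sums[j][d] += w * x
        # Weighted mean per cluster; an empty cluster keeps its old center.
        new_centers = [
            tuple(s / weights[j] for s in sums[j]) if weights[j] else centers[j]
            for j in range(len(centers))
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (5.1, 5.2), (5.3, 5.4), (5.2, 5.0)]
    reps = grid_centroids(data, cell_size=1.0)
    print(weighted_kmeans(reps, init_centers=[(0.0, 0.0), (5.0, 5.0)]))
```

In this sketch the saving comes from k-means iterating over one representative per occupied cell rather than over every point; weighting by cell count keeps the cluster centers close to what the full data would give.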

The Design of Granular-based Radial Basis Function Neural Network by Context-based Clustering (Context-based 클러스터링에 의한 Granular-based RBF NN의 설계)

  • Park, Ho-Sung;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.6 / pp.1230-1237 / 2009
  • In this paper, we develop a design methodology for Granular-based Radial Basis Function Neural Networks (GRBFNN) by context-based clustering. In contrast with the plethora of existing approaches, we promote a development strategy in which the topology of the network is predominantly based on a collection of information granules formed from the available experimental data. The output space is granulated using K-Means clustering, while the input space is clustered with the aid of so-called context-based fuzzy clustering. The number of information granules produced for each context is adjusted to satisfy a reconstructability criterion, which helps minimize the error between the original data and the data reconstructed from the cluster prototypes and the corresponding membership values. In contrast to "standard" Radial Basis Function neural networks, the output neuron of the network exhibits a functional nature: its connections are realized as local linear functions whose location is determined by the values of the context and the prototypes in the input space. The other parameters of these local functions are subject to further parametric optimization. Numeric examples involve low-dimensional synthetic data and selected data from the Machine Learning repository.

Comparing Classification Accuracy of Ensemble and Clustering Algorithms Based on Taguchi Design (다구찌 디자인을 이용한 앙상블 및 군집분석 분류 성능 비교)

  • Shin, Hyung-Won;Sohn, So-Young
    • Journal of Korean Institute of Industrial Engineers / v.27 no.1 / pp.47-53 / 2001
  • In this paper, we compare the classification performance of ensemble and clustering algorithms (Data Bagging, Variable Selection Bagging, Parameter Combining, Clustering) with that of logistic regression under various characteristics of the input data. Four factors are used to simulate the logistic model: (1) correlation among input variables, (2) variance of observation, (3) training data size, and (4) input-output function. Because the relationship between input and output is unknown in practice, we use a Taguchi design that treats the input-output function as a noise factor, which improves the practicality of our results. The experimental results indicate the following. When the level of the variance is medium, Bagging and Parameter Combining perform worse than Logistic Regression, Variable Selection Bagging, and Clustering. However, the classification performances of Logistic Regression, Variable Selection Bagging, Bagging, and Clustering are not significantly different when the variance of the input data is either small or large. When there is strong correlation among the input variables, Variable Selection Bagging outperforms both Logistic Regression and Parameter Combining. In general, and to our disappointment, the Parameter Combining algorithm appears to be the worst.

Hierarchical Clustering of Symbolic Objects based on Asymmetric Proximity (비대칭적 유사도 기반의 심볼릭 객체의 계층적 클러스터링)

  • Oh, Seung-Joon;Park, Chan-Woong
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.729-734 / 2012
  • Clustering analysis has been widely used in numerous applications such as pattern recognition, data analysis, intrusion detection, image processing, and bioinformatics. Much of the previous work has been based on numeric data only. However, symbolic data analysis has emerged to deal with variables that can have intervals, histograms, and even functions as values. In this paper, we propose a non-symmetric proximity-based clustering approach for symbolic objects. A method for clustering symbolic patterns based on the average similarity value (ASV) is explored. The results of the proposed clustering method differ from those of the existing methods, and the results are very encouraging.

Model-based Clustering of DOA Data Using von Mises Mixture Model for Sound Source Localization

  • Dinh, Quang Nguyen;Lee, Chang-Hoon
    • International Journal of Fuzzy Logic and Intelligent Systems / v.13 no.1 / pp.59-66 / 2013
  • In this paper, we propose a probabilistic framework for model-based clustering of direction of arrival (DOA) data to obtain stable sound source localization (SSL) estimates. Model-based clustering has been shown to be capable of handling highly overlapped and noisy datasets, such as those involved in DOA detection. Although the Gaussian mixture model is commonly used for model-based clustering, we propose the von Mises mixture model as a better fit for circular DOA data than a Gaussian distribution. The EM framework for the von Mises mixture model on a unit hypersphere is reduced to the 2D case and used as such in the proposed method. We also use a histogram of the dataset to initialize the number of clusters and the initial parameter values, thereby saving calculation time and improving efficiency. Experiments using simulated and real-world datasets demonstrate the performance of the proposed method.
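
To illustrate why a circular distribution suits angular DOA data, here is a small sketch (not the paper's EM procedure) showing the circular mean and the standard von Mises density for a single component. The helper names `circular_mean`, `bessel_i0`, and `von_mises_pdf` are chosen for illustration, and the Bessel function is approximated by a truncated power series, which is an assumption that is adequate only for moderate concentration values.

```python
import math


def circular_mean(angles):
    """Mean direction of angles in radians, mapped into [0, 2*pi)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2 * math.pi)


def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2) ** 2 / (k * k)
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))


if __name__ == "__main__":
    doa = [math.radians(350), math.radians(10)]
    print("arithmetic mean (deg):", math.degrees(sum(doa) / len(doa)))
    print("circular mean (deg):", math.degrees(circular_mean(doa)))
    print("density at the mean, kappa=2:", von_mises_pdf(0.0, 0.0, 2.0))
```

For two arrivals at 350° and 10°, the arithmetic mean points at 180°, the opposite direction, while the circular mean lies at 0° (equivalently 360°). A Gaussian fitted on the raw angle values inherits the same wrap-around problem, which the von Mises density avoids because it depends on the angle only through cos(θ − μ).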

Automatic Clustering of Speech Data Using Modified MAP Adaptation Technique (수정된 MAP 적응 기법을 이용한 음성 데이터 자동 군집화)

  • Ban, Sung Min;Kang, Byung Ok;Kim, Hyung Soon
    • Phonetics and Speech Sciences / v.6 no.1 / pp.77-83 / 2014
  • This paper proposes a speaker and environment clustering method to overcome the degradation of speech recognition performance caused by varied noise and speaker characteristics. Instead of using the distance between Gaussian mixture model (GMM) weight vectors, as in Google's approach, the distance between adapted mean vectors based on modified maximum a posteriori (MAP) adaptation is used as the distance measure for vector quantization (VQ) clustering. In our experiments on simulation data generated by adding noise to clean speech, the proposed clustering method yields an error rate reduction of 10.6% compared with the baseline speaker-independent (SI) model, which is slightly better than Google's approach.

Clustering Algorithm using a Center Of Gravity for Grid-based Sample

  • Park, Hee-Chang;Ryu, Jee-Hyun
    • 한국데이터정보과학회:학술대회논문집 / 2003.05a / pp.77-88 / 2003
  • Cluster analysis has been widely used in many applications, such as data analysis, pattern recognition, and image processing. However, clustering can require many hours to produce the desired clusters, because it is a primitive, exploratory method and is often applied to large amounts of data. In this paper we propose a new clustering method, 'Clustering algorithm using a center of gravity for grid-based sample'. It is faster than traditional clustering methods while maintaining accuracy. It reduces running time by using a grid-based sample and preserves accuracy by using a representative point, the center of gravity.
