Integrated Search | Korea Science

Performance evaluation of principal component analysis for clustering problems

Kim, Jae-Hwan;Yang, Tae-Min;Kim, Jung-Tae
- Journal of Advanced Marine Engineering and Technology / Vol. 40, No. 8 / pp.726-732 / 2016
Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Over the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction method for clustering problems. However, previous studies analyzed the performance of PCA only on full data sets. In this paper, to evaluate the performance of PCA for clustering analysis specifically and robustly, we apply an improved FCBF (fast correlation-based filter), a feature selection method, to supervised clustering data sets, and employ two well-known clustering algorithms: k-means and k-medoids. Computational results on supervised data sets show that PCA performs very poorly when the number of features is large.
https://doi.org/10.5916/jkosme.2016.40.8.726
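
The abstract above names k-means and k-medoids without describing how they differ. As a minimal sketch (not the paper's implementation), the two algorithms can be contrasted by how they compute a cluster's representative: k-means uses the coordinate-wise mean, while k-medoids picks an actual data point. The function names and the sample cluster are illustrative assumptions.

```python
import math


def mean_center(points):
    """k-means style center: coordinate-wise mean (need not be a data point)."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def medoid_center(points):
    """k-medoids style center: the data point with the smallest total
    Euclidean distance to all points in the cluster."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))


# Hypothetical cluster; the last point acts as an outlier.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
print(mean_center(cluster))    # pulled toward the outlier
print(medoid_center(cluster))  # stays on one of the original points
```

Because the medoid must be a member of the data set, it tends to be less sensitive to outliers than the mean, which is one common reason for comparing the two algorithms.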

Automatic Switching of Clustering Methods based on Fuzzy Inference in Bibliographic Big Data Retrieval System

Zolkepli, Maslina;Dong, Fangyan;Hirota, Kaoru
- International Journal of Fuzzy Logic and Intelligent Systems / Vol. 14, No. 4 / pp.256-267 / 2014
An automatic switch among ensembles of clustering algorithms is proposed as a part of the bibliographic big data retrieval system. It utilizes a fuzzy inference engine as a decision support tool to select the fastest performing clustering algorithm among fuzzy C-means (FCM) clustering, Newman-Girvan clustering, and the combination of both. It aims to realize the best clustering performance while reducing computational complexity from $O(n^3)$ to $O(n)$. The automatic switch is developed using a fuzzy logic controller written in Java and accepts 3 inputs from each clustering result, i.e., number of clusters, number of vertices, and time taken to complete the clustering process. The experimental results on a PC (Intel Core i5-3210M at 2.50 GHz) demonstrate that the combination of both clustering algorithms is selected as the best performing algorithm in 20 out of 27 cases with the highest percentage of 83.99%, completed in 161 seconds. The self-adapted FCM is selected as the best performing algorithm in 4 cases and Newman-Girvan is selected in 3 cases. The automatic switch is to be incorporated into the bibliographic big data retrieval system, which focuses on visualization of fuzzy relationships using a hybrid approach combining the FCM and Newman-Girvan algorithms, and is planned for release to the public through the Internet.
https://doi.org/10.5391/IJFIS.2014.14.4.256

A Study on Performance Evaluation of Clustering Algorithms using Neural and Statistical Method

윤석환;신용백
- 기술사 / Vol. 29, No. 2 / pp.71-79 / 1996
This paper evaluates the clustering performance of a neural network and a statistical method. The algorithms used in this paper are GLVQ (Generalized Learning Vector Quantization) as the neural method and the k-means algorithm as the statistical clustering method. To compare the two methods, we calculate Rand's c statistic. As a result, the mean of the c values obtained with GLVQ is higher than that obtained with the k-means algorithm, while the standard deviation of the c values is lower. The experimental data sets were Fisher's IRIS data and patterns extracted from handwritten numerals.
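
The abstract does not define Rand's c statistic. Assuming it refers to the standard Rand index (an assumption, not confirmed by the source), a minimal sketch of how such an agreement score between two partitions can be computed is shown below; the function name `rand_index` is hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two points
    in the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if len(labels_a) < 2:
        raise ValueError("at least two points are required")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

The score depends only on which points are grouped together, not on the label names, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0.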

Design of improved Multi-FNN for Nonlinear Process modeling

Park, Hosung;Sungkwun Oh
- 제어로봇시스템학회: Conference Proceedings / 제어로봇시스템학회 ICCAS 2002 / pp.102.2-102 / 2002
In this paper, the improved Multi-FNN (Fuzzy-Neural Networks) model is identified and optimized using the HCM (Hard C-Means) clustering method and optimization algorithms. The proposed Multi-FNN is based on FNN and uses simplified and linear inference as the fuzzy inference method and the error back-propagation algorithm as the learning rule. We use HCM clustering and genetic algorithms (GAs) to identify both the structure and the parameters of a Multi-FNN model. Here, the HCM clustering method, which is carried out to preprocess the process data for system modeling, is utilized to determine the structure of the Multi-FNN according to the divisions of the input-output space using I/O process data. Also, the parame...

Clustering Algorithms for Data Records

문송천
- 정보과학회지 / Vol. 5, No. 2 / pp.90-93 / 1987
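Among advances in hardware, the progress of memory technology has been especially remarkable. In terms of the components used in storage devices, vacuum tubes were used in the first generation, transistors in the second, and integrated circuits in the third. Today, in the late third generation, the price of memory keeps falling as LSI (large-scale integration) technology develops. In addition, the concept of microprogramming emerged and led to the appearance of ROM (read-only memory), and cache memory was then introduced in the IBM System/360 Model 85, bringing a major innovation to the memory hierarchy.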

Design of Hierarchically Structured Clustering Algorithm and its Application

방영근;박하용;이철희
- 산업기술연구 / Vol. 29, No. B / pp.17-23 / 2009
In many cases, clustering algorithms have been used to extract and discover useful information from non-linear data, and they have greatly affected the performance of systems dealing with such data. This paper therefore presents a new approach, called the hierarchically structured clustering algorithm, and applies it to a prediction system for non-linear time series data. The proposed algorithm (called HCKA: Hierarchical Cross-correlation and K-means clustering Algorithms), which combines cross-correlation and k-means clustering, can account for the correlation of non-linear time series as well as their statistical characteristics. First, the optimal differences of the data are generated, which can suitably reveal the characteristics of the non-linear time series. Second, the generated differences are classified into upper clusters for their predictors by the cross-correlation clustering algorithm, and each classified difference is then classified again into lower fuzzy sets by the k-means clustering algorithm. As a result, the proposed method gives an efficient classification and improves performance. Finally, we demonstrate the effectiveness of the proposed HCKA on typical time series examples.

Evaluating the Performance of Four Selections in Genetic Algorithms-Based Multispectral Pixel Clustering

Kutubi, Abdullah Al Rahat;Hong, Min-Gee;Kim, Choen
- 대한원격탐사학회지 / Vol. 34, No. 1 / pp.151-166 / 2018
This paper compares the performance of four selection methods used when genetic algorithms (GAs) are applied to automatically optimize multispectral pixel clusters for unsupervised classification of KOMPSAT-3 data, since selection, one of the three main types of operators alongside crossover and mutation, is the driving force that determines the overall operation of a clustering GA. Experimental results demonstrate that tournament selection performs better than the other selections, especially in the number of generations and the convergence rate. However, it is computationally more expensive than elitism selection, which has the slowest convergence rate in the comparison and a lower probability of finding optimum cluster centers than the other selections. Rank-based selection and proportional roulette wheel selection show similar performance in the average Euclidean distance of the pixel clustering, even though rank-based selection is computationally much more expensive than proportional roulette. With respect to finding the global optimum, tournament selection has a higher potential to reach it than rank-based selection, which spends a lot of computational time on fitness smoothing. The tournament selection-based clustering GA is used to successfully classify the KOMPSAT-3 multispectral data, achieving sufficient thematic accuracy (namely, a Kappa coefficient of 0.923).
https://doi.org/10.7780/kjrs.2018.34.1.11
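
The four selection schemes named in the abstract can be sketched in their generic textbook forms. This is an illustrative sketch only: the paper's exact operators and parameters are not given in the abstract, all function names are hypothetical, and the code assumes that a higher fitness value is better (a clustering GA that minimizes distance would need its objective converted first).

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Draw k distinct individuals at random and return the fittest."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]


def roulette_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes non-negative fitness values with a positive total.
    """
    return rng.choices(population, weights=fitness, k=1)[0]


def rank_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its rank
    (worst individual has rank 1, best has rank n)."""
    order = sorted(range(len(population)), key=lambda i: fitness[i])
    ranks = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return rng.choices(population, weights=ranks, k=1)[0]


def elitism_select(population, fitness, n_elite=1):
    """Return the n_elite fittest individuals, best first."""
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    return [population[i] for i in order[:n_elite]]
```

Passing a seeded `random.Random` instance as `rng` makes the stochastic operators reproducible, which helps when comparing convergence across selection schemes.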

Energy Efficient Clustering Algorithm for Surveillance and Reconnaissance Applications in Wireless Sensor Networks

공준익;이재호;강지헌;엄두섭
- 한국통신학회논문지 / Vol. 37C, No. 11 / pp.1170-1181 / 2012
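Wireless sensor networks (WSNs), which are used in a wide range of applications, face hardware constraints in the battery, memory size, MCU, RF transceiver, and similar components in order to keep sensor nodes low-cost. In particular, because the limited energy of a sensor node is directly related to network lifetime, efficient algorithms are needed to extend that lifetime. Surveillance and reconnaissance applications for detecting intruders in military environments follow an event-driven transmission model in which events are rare, bursty, and local. In such applications, using a clustering algorithm that offers the benefit of data aggregation reduces the volume of transmitted data and improves energy efficiency compared with having each node transmit its data individually. However, existing clustering algorithms do not take the event characteristics of surveillance and reconnaissance applications into account, which causes several problems. This paper proposes an energy-efficient clustering algorithm for surveillance and reconnaissance applications that addresses these problems. In this algorithm, each node that detects a target creates a Cluster Head Election Window (CHEW), clusters are formed through local competition, and target mobility is taken into account. The simulation results analyze the trace of clusters formed as the target moves and show that energy efficiency increases.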
https://doi.org/10.7840/kics.2012.37C.11.1170

Online Clustering Algorithms for Semantic-Rich Network Trajectories

Roh, Gook-Pil;Hwang, Seung-Won
- Journal of Computing Science and Engineering / Vol. 5, No. 4 / pp.346-353 / 2011
With the advent of ubiquitous computing, a massive amount of trajectory data has been published and shared on many websites. This type of computing also motivates online mining of trajectory data to fit user-specific preferences or context (e.g., time of day). While many trajectory clustering algorithms have been proposed, they have typically focused on offline mining and do not consider the restrictions of the underlying road network or the selection conditions that represent user contexts. In clear contrast, we study an efficient clustering algorithm for Boolean + Clustering queries using a pre-materialized and summarized data structure. Our experimental results on real-life trajectory data demonstrate the efficiency and effectiveness of the proposed method.
https://doi.org/10.5626/JCSE.2011.5.4.346

Classification of Volatile Chemicals using Fuzzy Clustering Algorithm

변형기;김갑일
- 대한전기학회: Conference Proceedings / 대한전기학회 1996 Summer Conference Proceedings B / pp.1042-1044 / 1996
The use of fuzzy theory in pattern recognition tasks may be applicable to the classification and recognition of gases and odours. This paper reports results obtained by applying fuzzy c-means algorithms to patterns generated by an odour sensing system that uses an array of conducting polymer sensors for volatile chemicals. For the volatile chemicals clustering problem, three unsupervised fuzzy c-means algorithms were applied. Among these pattern clustering methods, the FCMAW algorithm, which updates the cluster centres more frequently, consistently outperformed the others. It was confirmed as an outstanding clustering algorithm throughout the experimental trials.
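
The abstract does not specify the three fuzzy c-means variants or how FCMAW schedules its centre updates. As a hedged illustration, the two update rules of standard fuzzy c-means (not FCMAW) are sketched below; the function names and the fuzzifier default `m=2.0` are assumptions.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u[i][j], the membership of point i in cluster j.

    Each row sums to 1; m > 1 is the fuzzifier.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a centre: share full membership
            # among the coinciding centres to avoid division by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Return each cluster centre as the mean of the points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(memberships[0])):
        weights = [u[j] ** m for u in memberships]
        total = sum(weights)
        centers.append(
            tuple(sum(w * x[d] for w, x in zip(weights, points)) / total
                  for d in range(dim))
        )
    return centers
```

Alternating `fcm_memberships` and `fcm_centers` until the centres stop moving gives the usual batch iteration; a variant that updates centres more frequently would presumably interleave these steps differently.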
