• Title/Summary/Keyword: Over-clustering

Support Vector Machine based Cluster Merging (Support Vector Machines 기반의 클러스터 결합 기법)

  • Choi, Byung-In; Rhee, Frank Chung-Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.14 no.3 / pp.369-374 / 2004
  • This paper proposes a cluster merging algorithm that combines the convex clusters produced by the Fuzzy Convex Clustering (FCC) method into non-convex clusters. The algorithm relies on a fast and reliable distance measure between two convex clusters, computed with Support Vector Machines (SVM), which improves accuracy and speed over existing conventional methods. As a result, the number of clusters can be reduced without losing their representation of the data. Results for several data sets are given to show the validity of the distance measure and the algorithm.

Fuzzy Controller Modeling for Electromagnetic Levitation Systems based on Clustering Algorithm (클러스터링에 기초한 자기부상시스템의 퍼지제어기 모델링)

  • Kim, Min-Soo; Byun, Yeun-Sub; Lee, Kwan-Sup
    • Proceedings of the KSR Conference / 2006.11a / pp.145-159 / 2006
  • This paper describes the development of a clustering-based fuzzy controller for an electromagnetic suspension vehicle, using a gain scheduling method and a Kalman filter for a simplified single-magnet system. Electromagnetic suspension vehicle systems are highly nonlinear and inherently unstable. To achieve levitation control of the DC electromagnetic suspension system, we consider a fuzzy system modeling method based on a clustering algorithm, in which a set of input/output data is collected from a well-defined Linear Quadratic Gaussian (LQG) controller. Simulation results show that the proposed clustering-based fuzzy controller robustly yields uniform performance with adequate gap response over the mass variation range.

$F_n$-Measure : An External Cluster Evaluation Measure (클러스터 평가 외부기준 척도 $F_n$-Measure)

  • Kim, Kyeongtaek
    • Journal of Korean Society of Industrial and Systems Engineering / v.35 no.4 / pp.244-248 / 2012
  • F-Measure is one of the external measures for evaluating the validity of clustering results. Although it has clear advantages over other widely used external measures such as Purity and Entropy, F-Measure is inherently less sensitive than other validity measures. This insensitivity stems from its definition, which counts only the most influential portions. In this research, we present $F_n$-Measure, an external cluster evaluation measure based on F-Measure. $F_n$-Measure is sensitive enough to detect differences between clustering results that F-Measure cannot distinguish. We compare $F_n$-Measure with F-Measure on a few clustering results and show which measure gives the better result in terms of homogeneity and completeness.
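
The insensitivity described in this abstract can be illustrated with the classical external F-Measure, in which each true class contributes only its best-matching cluster. The sketch below is an assumption about that classical definition, not the paper's $F_n$-Measure (whose formula is not given in the abstract); the function name `clustering_f_measure` and the toy labels are hypothetical.

```python
from collections import Counter

def clustering_f_measure(labels_true, labels_pred):
    """Classical external F-Measure: for every true class, keep only the
    best F-score over all clusters, then average weighted by class size."""
    n = len(labels_true)
    class_sizes = Counter(labels_true)
    cluster_sizes = Counter(labels_pred)
    overlap = Counter(zip(labels_true, labels_pred))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap.get((cls, clu), 0)
            if n_ij == 0:
                continue  # no shared points, F-score is zero
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best  # only the best cluster counts
    return total

truth = ["a", "a", "a", "a", "b", "b", "b", "b"]
mixed = [0, 0, 0, 1, 2, 2, 2, 1]   # the two leftover points share one cluster
split = [0, 0, 0, 1, 2, 2, 2, 3]   # the two leftover points are kept apart
print(clustering_f_measure(truth, mixed), clustering_f_measure(truth, split))
```

Both toy clusterings receive the same score because the leftover points never belong to a best-matching cluster, which is the kind of difference a more sensitive measure would need to pick up.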

Discovering Community Interests Approach to Topic Model with Time Factor and Clustering Methods

  • Ho, Thanh; Thanh, Tran Duy
    • Journal of Information Processing Systems / v.17 no.1 / pp.163-177 / 2021
  • Many methods for discovering social networking communities or clustering features are based on the network structure or the content network. This paper proposes a community discovery method based on topic models using a time factor and an unsupervised clustering method. Online community discovery enables organizations and businesses to thoroughly understand trends in users' interests in their products and services. In addition, insight into customer experience on social networks is a tremendous competitive advantage in this era of ecommerce and Internet development. The objective of this work is to find clusters (communities) such that each cluster's nodes contain topics and individuals that are similar in the attribute space. In terms of social media analytics, the method seeks communities whose members have similar features. The method is tested and evaluated using a Vietnamese corpus of comments and messages collected on social networks and ecommerce sites in various sectors from 2016 to 2019. The experimental results demonstrate the effectiveness of the proposed method over other methods.

Research on the Development of Distance Metrics for the Clustering of Vessel Trajectories in Korean Coastal Waters (국내 연안 해역 선박 항적 군집화를 위한 항적 간 거리 척도 개발 연구)

  • Seungju Lee; Wonhee Lee; Ji Hong Min; Deuk Jae Cho; Hyunwoo Park
    • Journal of Navigation and Port Research / v.47 no.6 / pp.367-375 / 2023
  • This study developed a new distance metric for vessel trajectories, applicable to marine traffic control services in Korean coastal waters. The proposed metric is a weighted sum of the traditional Hausdorff distance, which measures the similarity between spatiotemporal data, and the differences in the average Speed Over Ground (SOG) and in the variance of Course Over Ground (COG) between two trajectories. To validate the effectiveness of the new metric, a comparative analysis was conducted using actual Automatic Identification System (AIS) trajectory data in conjunction with an agglomerative clustering algorithm. Data visualizations confirmed that trajectory clustering with the new metric reflects geographical distances and the distribution of vessel behavioral characteristics more accurately than conventional metrics such as the Hausdorff distance and the Dynamic Time Warping distance. Quantitatively, based on the Davies-Bouldin index, the clustering results were superior or comparable, and the metric demonstrated high computational efficiency in distance calculation.
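
A weighted-sum trajectory distance of the kind this abstract describes can be sketched as follows. This is a minimal illustration, not the authors' metric: the weights `w_h`, `w_sog`, `w_cog`, the `(x, y, sog, cog)` point layout, and the plain (non-circular) variance of COG are all assumptions made for the example.

```python
import math
from statistics import mean, pvariance

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two lists of (x, y) points."""
    def directed(p, q):
        return max(min(math.dist(u, v) for v in q) for u in p)
    return max(directed(a, b), directed(b, a))

def trajectory_distance(t1, t2, w_h=1.0, w_sog=1.0, w_cog=1.0):
    """Weighted sum of spatial and behavioral differences.

    Each trajectory is a list of (x, y, sog, cog) tuples (assumed layout).
    COG variance is computed naively and ignores the 0/360 degree wraparound.
    """
    xy1 = [(p[0], p[1]) for p in t1]
    xy2 = [(p[0], p[1]) for p in t2]
    sog_diff = abs(mean(p[2] for p in t1) - mean(p[2] for p in t2))
    cog_diff = abs(pvariance([p[3] for p in t1]) - pvariance([p[3] for p in t2]))
    return w_h * hausdorff(xy1, xy2) + w_sog * sog_diff + w_cog * cog_diff

# Two parallel tracks one unit apart, the second vessel 2 knots faster.
track_a = [(0, 0, 10, 90), (1, 0, 10, 90)]
track_b = [(0, 1, 12, 90), (1, 1, 12, 90)]
print(trajectory_distance(track_a, track_b))
```

In practice the three terms have different units, so the weights would also need to absorb a normalization step before the result is passed to an agglomerative clustering routine.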

Min-Distance Hop Count based Multi-Hop Clustering In Non-uniform Wireless Sensor Networks

  • Kim, Eun-Ju; Kim, Dong-Joo; Park, Jun-Ho; Seong, Dong-Ook; Lee, Byung-Yup; Yoo, Jae-Soo
    • International Journal of Contents / v.8 no.2 / pp.13-18 / 2012
  • In wireless sensor networks, an energy-efficient data gathering scheme is one of the core technologies for processing a query. Cluster-based data gathering methods minimize the energy consumption of sensor nodes by maximizing the efficiency of data aggregation. However, since existing clustering methods consider only uniform network environments, they are not suitable for real-world applications in which sensor nodes can be distributed unevenly. To solve this problem, we propose a balanced multi-hop clustering scheme for non-uniform wireless sensor networks. The proposed scheme constructs a cluster based on the logical distance to the cluster head using a min-distance hop count. To show the superiority of the proposed scheme, we compare it with existing clustering schemes for sensor networks. Our experimental results show that the proposed scheme prolongs network lifetime by about 48% on average over the existing methods.
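
The idea of grouping nodes by their logical (hop-count) distance to a cluster head can be sketched with a multi-source breadth-first search. This is only an illustration under assumptions: the adjacency-dict input, the function name `assign_by_hop_count`, and the tie-breaking by head order are hypothetical, and the paper's balancing step is not reproduced here.

```python
from collections import deque

def assign_by_hop_count(adjacency, heads):
    """Assign every reachable node to the cluster head with the fewest hops.

    adjacency: dict mapping node -> list of neighbor nodes (assumed format).
    heads: list of cluster-head nodes; earlier heads win ties.
    Returns a dict mapping node -> (head, hop_count).
    """
    assignment = {head: (head, 0) for head in heads}
    queue = deque(heads)
    while queue:
        node = queue.popleft()
        head, hops = assignment[node]
        for neighbor in adjacency[node]:
            if neighbor not in assignment:  # first visit is the shortest path
                assignment[neighbor] = (head, hops + 1)
                queue.append(neighbor)
    return assignment

# A six-node chain 0-1-2-3-4-5 with cluster heads at both ends.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(assign_by_hop_count(chain, heads=[0, 5]))
```

Because all heads start in the queue at hop 0, each node is reached first by its nearest head, so the search runs in time linear in the number of links.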

A Clustering for Ground Nodes of HAPS Network (HAP 네트워크 지상 노드의 클러스터링)

  • Song, Ha-Yoon
    • Journal of Digital Contents Society / v.9 no.1 / pp.87-99 / 2008
  • High Altitude Platform (HAP) network systems use Unmanned Aerial Vehicles (UAVs) as routers for ground node communication. For this purpose, geographical clustering of ground nodes is required. In this paper, we assume mobile ground nodes spread over a wide area and identify the clusters they form. UAVs can be positioned at the centroids of the clusters. The number of UAVs is derived from the size of the area and the number of ground nodes deployed in it. Using simulation and the application of clustering algorithms, we show visual clustering results as the number of ground nodes varies dynamically.
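
Placing one UAV at the centroid of each ground-node cluster can be sketched with a few iterations of Lloyd's k-means. The abstract does not say which clustering algorithm was used, so k-means, the function name `kmeans`, and the fixed initial centers are assumptions chosen to keep the example deterministic.

```python
import math

def kmeans(points, centers, iterations=10):
    """Plain Lloyd's k-means on 2-D points with caller-supplied initial centers.

    Returns (centers, groups): the final centroids, which stand in for the
    UAV positions, and the last assignment of ground nodes to clusters.
    """
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)  # an empty cluster keeps its old center
        ]
    return centers, groups

ground_nodes = [(0, 0), (0, 2), (10, 0), (10, 2)]
uav_positions, clusters = kmeans(ground_nodes, centers=[(1, 1), (9, 1)])
print(uav_positions)
```

For mobile ground nodes the same routine could be rerun periodically, starting from the previous centroids so that UAV positions change gradually.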

Development of Datamining Roadmap and Its Application to Water Treatment Plant for Coagulant Control (데이터마이닝 로드맵 개발과 수처리 응집제 제어를 위한 데이터마이닝 적용)

  • Bae, Hyeon; Kim, Sung-Shin; Kim, Ye-Jin
    • Journal of the Korea Institute of Information and Communication Engineering / v.9 no.7 / pp.1582-1587 / 2005
  • Rule extraction, one category of data mining, was performed for coagulant control of a water treatment plant. Clustering methods were applied to extract control rules from data. These control rules can be used for full automation of water treatment plants in place of the operator's knowledge of plant control. Fuzzy clustering requires several coefficients to be determined, and related topics such as clustering indices have been studied for decades. In this study, statistical indices were used to calculate the number of clusters, and seed points were found by hierarchical clustering. These statistical approaches provide information about the features of clusters, so they can reduce computing cost and increase clustering accuracy. The proposed algorithm can play an important role in data mining and knowledge discovery.

A Classification Algorithm Based on Data Clustering and Data Reduction for Intrusion Detection System over Big Data

  • Wang, Qiuhua; Ouyang, Xiaoqin; Zhan, Jiacheng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.7 / pp.3714-3732 / 2019
  • With the rapid development of networks, the Intrusion Detection System (IDS) plays an increasingly important role in network applications. Many data mining algorithms are used to build IDSs. However, with the advent of the big data era, massive amounts of data are generated. When dealing with large-scale data sets, most data mining algorithms suffer from a high computational burden, which makes an IDS much less efficient. To build an efficient IDS over big data, we propose a classification algorithm based on data clustering and data reduction. In the training stage, the training data are divided into clusters of similar size by the Mini Batch K-Means algorithm, and the center of each cluster is used as its index. Then, we select representative instances for each cluster to perform data reduction and use the clusters of representative instances to build a K-Nearest Neighbor (KNN) detection model. In the detection stage, we sort the clusters by the distance between the test sample and the cluster indexes and obtain the k nearest clusters, in which we find the k nearest neighbors. Experimental results show that searching for neighbors by cluster index significantly reduces the computational complexity, and that classification with the reduced data of representative instances improves efficiency while maintaining high accuracy.
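
The detection stage described in this abstract (rank clusters by the distance from the test sample to each cluster index, then search for neighbors only inside the k nearest clusters) can be sketched as below. The clusters are assumed to be precomputed, for example by Mini Batch K-Means; the data layout, the function name `classify`, and the toy labels are hypothetical, and the selection of representative instances is not shown.

```python
import math
from collections import Counter

def classify(clusters, sample, k):
    """KNN restricted to the k clusters whose centers are nearest to the sample.

    clusters: list of (center, members), where members is a list of
    (point, label) pairs already reduced to representative instances.
    """
    # Step 1: rank clusters by distance between the sample and each cluster index.
    nearest_clusters = sorted(clusters, key=lambda c: math.dist(sample, c[0]))[:k]
    # Step 2: gather candidate instances from those clusters only.
    candidates = [inst for _, members in nearest_clusters for inst in members]
    # Step 3: ordinary k-nearest-neighbor majority vote among the candidates.
    neighbors = sorted(candidates, key=lambda inst: math.dist(sample, inst[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

clusters = [
    ((0, 0), [((0, 0), "normal"), ((0, 1), "normal")]),
    ((10, 10), [((10, 10), "attack"), ((10, 11), "attack")]),
    ((20, 20), [((20, 20), "attack")]),
]
print(classify(clusters, (1, 1), k=1), classify(clusters, (9, 9), k=3))
```

The saving comes from step 1: distances are computed against a handful of cluster centers first, so most training instances are never compared with the test sample.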

Constraints on cosmology and baryonic feedback by the combined analysis of weak lensing and galaxy clustering with the Deep Lens Survey

  • Yoon, Mijin; Jee, M. James; Tyson, Tony
    • The Bulletin of The Korean Astronomical Society / v.43 no.2 / pp.41.1-41.1 / 2018
  • We constrain cosmological parameters by combining three different power spectra measured from galaxy clustering, galaxy-galaxy lensing, and cosmic shear using the Deep Lens Survey (DLS). Two lens bins (centered at $z\sim0.27$ and $0.54$) and two source bins (centered at $z\sim0.64$ and $1.1$) containing more than one million galaxies are selected to measure the power spectra. We re-calibrate the initial photo-z estimation of the lens bins by matching with SHELS and PRIMUS and confirm its fidelity by measuring a cross-correlation between the bins. We also check the reliability of the lensing signals through null tests: lens-source flipping and cross shear measurement. Residual systematic errors from photometric redshift and shear calibration uncertainties are marginalized over in the nested sampling during our parameter constraint process. For the flat $\Lambda$CDM model, we determine $S_8=\sigma_8(\Omega_m/0.3)^{0.5}=0.832\pm0.028$, which is in excellent agreement with the Planck data. We also verify that the two independent constraints from the cosmic shear and the galaxy clustering+galaxy-galaxy lensing measurements are consistent with each other. To address baryonic feedback effects on small scales, we marginalize over a baryonic feedback parameter, which we are able to constrain with the DLS data alone and more tightly when combined with Planck data. The constrained value hints that the AGN feedback in the current OWLS simulations might not be strong enough.
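
The derived parameter quoted in this abstract is defined as $S_8=\sigma_8(\Omega_m/0.3)^{0.5}$. The short sketch below only evaluates that definition; the function name `s8` and the input values are illustrative assumptions and are not results from the paper.

```python
def s8(sigma_8, omega_m):
    """Evaluate S_8 = sigma_8 * (Omega_m / 0.3) ** 0.5."""
    return sigma_8 * (omega_m / 0.3) ** 0.5

# Illustrative inputs only: at Omega_m = 0.3 the scaling factor is exactly 1,
# so S_8 equals sigma_8; a larger Omega_m raises S_8 for the same sigma_8.
print(s8(0.8, 0.3), s8(0.8, 0.4))
```

The combination is useful because weak lensing mainly constrains this product rather than $\sigma_8$ and $\Omega_m$ separately.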
