• Title/Summary/Keyword: Clustering Technique


A Study of Designing the Han-Guel Thesaurus Browser for Automatic Information Retrieval (자동정보검색을 위한 한글 시소러스 브라우저 구축에 관한 연구)

  • Seo, Whee
    • Journal of Korean Library and Information Science Society
    • /
    • v.31 no.2
    • /
    • pp.279-302
    • /
    • 2000
  • This study develops a new automatic Korean thesaurus browser that automatically controls the whole query-search process, including the representation, generation, and extension of queries, the construction of the search strategy, and feedback searching. The system is programmed in Delphi 4.0 (Pascal) and consists of a database system, automatic indexing, a clustering technique, thesaurus construction and display, and an automatic information retrieval technique. The results are as follows: 1) With the new automatic thesaurus browser, built on the new algorithm, we can perform information retrieval, automatic indexing, clustering, thesaurus construction and display, and retrieval feedback. Thus even a beginner can easily access the specialized terms of a specific subject field. 2) The thesaurus browser is easy to build and convenient to use, and it gives good retrieval results in terms of the rate of speed, degree, and regeneration. Thus it turns out to be very pragmatic.


Range-Doppler Clustering of Radar Data for Detecting Moving Objects (이동물체 탐지를 위한 레이다 데이터의 거리-도플러 클러스터링 기법)

  • Kim, Seongjoon;Yang, Dongwon;Jung, Younghun;Kim, Sujin;Yoon, Joohong
    • Journal of the Korea Institute of Military Science and Technology
    • /
    • v.17 no.6
    • /
    • pp.810-820
    • /
    • 2014
  • Recently, many studies have been reported on radar systems mounted on ground vehicles for autonomous driving, SLAM (simultaneous localization and mapping), and collision avoidance. In the near field, signal processing of the radar data generates several hits per object. Hence, clustering is an essential technique for estimating object shapes and positions precisely. This paper proposes a method that groups hits in the range-Doppler domain into clusters, each representing one object, according to predefined rules. The rules are based on perceptual cues that separate hits by object: the morphological connectedness between hits and the characteristics of the SNR distribution of hits. In various simulations for performance assessment, the proposed method performed more effectively than other techniques.
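
As a rough illustration of the connectedness cue only, the sketch below groups hits that occupy neighboring cells of a range-Doppler grid. It is a minimal sketch, not the paper's method: the grid representation, the 8-neighbor rule, and the name `cluster_hits` are assumptions, and the SNR-distribution cue is omitted.

```python
from collections import deque

def cluster_hits(hits):
    """Group (range_bin, doppler_bin) hits into 8-connected clusters.

    Hypothetical helper: two hits belong to the same cluster when a chain
    of hits in adjacent cells (including diagonals) links them.
    """
    remaining = set(hits)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            r, d = queue.popleft()
            # Visit the eight neighboring cells of the current hit.
            for dr in (-1, 0, 1):
                for dd in (-1, 0, 1):
                    neighbor = (r + dr, d + dd)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        cluster.append(neighbor)
                        queue.append(neighbor)
        clusters.append(sorted(cluster))
    return sorted(clusters)

# Made-up hits: three separate objects in the range-Doppler plane.
hits = [(10, 3), (10, 4), (11, 4), (25, -2), (26, -2), (40, 7)]
clusters = cluster_hits(hits)
for c in clusters:
    print(c)
```

A real detector would likely merge or split these connected groups further using signal strength, which is the role the abstract assigns to the SNR cue.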

An Efficient Large Graph Clustering Technique based on Min-Hash (Min-Hash를 이용한 효율적인 대용량 그래프 클러스터링 기법)

  • Lee, Seok-Joo;Min, Jun-Ki
    • Journal of KIISE
    • /
    • v.43 no.3
    • /
    • pp.380-388
    • /
    • 2016
  • Graph clustering is widely used to analyze a graph and identify its properties by generating clusters of similar vertices. Recently, large graph data have been generated in diverse applications such as social network services (SNS), the World Wide Web (WWW), and telephone networks. The importance of graph clustering algorithms that process large graph data efficiently has therefore increased. In this paper, we propose a clustering algorithm that generates clusters for large graph data efficiently. The proposed algorithm uses Min-Hash to estimate similarities between clusters in graph data and constructs clusters according to the computed similarities. In experiments with real-world data sets, we demonstrate the efficiency of the proposed algorithm by comparing it with existing algorithms.
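
The Min-Hash idea the abstract relies on can be sketched as follows: the fraction of matching positions in two Min-Hash signatures estimates the Jaccard similarity of the underlying sets. This is a minimal sketch under stated assumptions, not the paper's algorithm: the sets are taken to be vertex sets, and `vertex_hash`, `minhash_signature`, and the choice of 256 hash functions are illustrative.

```python
import hashlib

def vertex_hash(vertex, i):
    """i-th hash function: a 64-bit digest of the vertex id, salted with i."""
    data = f"{i}:{vertex}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minhash_signature(vertices, num_hashes=256):
    """Keep, for each hash function, the smallest hash value over the set."""
    return [min(vertex_hash(v, i) for v in vertices) for i in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two sets agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

def exact_jaccard(a, b):
    return len(a & b) / len(a | b)

# Two made-up vertex sets sharing 50 of 150 distinct vertices.
a = set(range(0, 100))
b = set(range(50, 150))
estimate = estimate_jaccard(minhash_signature(a), minhash_signature(b))
exact = exact_jaccard(a, b)
print(f"exact={exact:.3f} estimate={estimate:.3f}")
```

The signatures have a fixed length regardless of set size, which is what makes the comparison cheap on large graphs; more hash functions tighten the estimate at the cost of more hashing.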

An eigenspace projection clustering method for structural damage detection

  • Zhu, Jun-Hua;Yu, Ling;Yu, Li-Li
    • Structural Engineering and Mechanics
    • /
    • v.44 no.2
    • /
    • pp.179-196
    • /
    • 2012
  • An eigenspace projection clustering method is proposed for structural damage detection by combining a projection algorithm with a fuzzy clustering technique. The integrated procedure includes data selection, data normalization, projection, damage feature extraction, and a clustering algorithm for structural damage assessment. The frequency response functions (FRFs) of the healthy and the damaged structure are used as initial data, the median values of the projections are taken as damage features, and the fuzzy c-means (FCM) algorithm is used to categorize these features. The performance of the proposed method has been validated using a three-story frame structure built and tested by Los Alamos National Laboratory, USA. Two projection algorithms, principal component analysis (PCA) and kernel principal component analysis (KPCA), are compared for better extraction of damage features. In addition, six kinds of distances adopted in the FCM process are studied and discussed. The results reveal that the choice of distance depends on the distribution of the features. For the optimal choice of projections, the Cosine distance is recommended for PCA, while the Seuclidean and Cityblock distances are suitable for KPCA. The PCA method is recommended when a large amount of data needs to be processed, because it yields more correct decisions at lower computational cost.
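
For readers unfamiliar with FCM, the standard update rules can be sketched as below. This is a minimal sketch that assumes the plain Euclidean distance and a fuzzifier of m = 2; the paper compares six distances, and the function name, toy features, and initial centers here are illustrative only.

```python
import math

def fuzzy_c_means(points, init_centers, m=2.0, iters=50):
    """Standard fuzzy c-means with Euclidean distance (an assumption here).

    Returns the final centers and the membership rows computed in the
    last iteration; each row sums to 1 across clusters.
    """
    centers = [list(c) for c in init_centers]
    n_clusters, n_dims = len(centers), len(points[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1)).
        memberships = []
        for p in points:
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append([
                1.0 / sum((d[i] / d[k]) ** exponent for k in range(n_clusters))
                for i in range(n_clusters)
            ])
        # Center update: mean of the points weighted by u_ij^m.
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            centers[i] = [
                sum(w * p[dim] for w, p in zip(weights, points)) / total
                for dim in range(n_dims)
            ]
    return centers, memberships

# Made-up two-dimensional features forming two well-separated groups.
features = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, memberships = fuzzy_c_means(features, init_centers=[(0.0, 0.0), (5.0, 5.0)])
for row in memberships:
    print([round(u, 3) for u in row])
```

Swapping `math.dist` for another distance function is the point at which a comparison like the one described in the abstract would enter.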

A Multiple Sequence Alignment Algorithm using Clustering Divergence (클러스터링 분기를 이용한 다중 서열 정렬 알고리즘)

  • Lee Byung-ll;Lee Jong-Yun;Jung Soon-Key
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.5 s.37
    • /
    • pp.1-10
    • /
    • 2005
  • Multiple sequence alignment (MSA) is a fundamental technique in DNA and protein sequence analysis. Biological sequences are aligned vertically to show the similarities and differences among them. In this paper, we propose an efficient group alignment method, based on clustering divergence, that performs the alignment between two groups of sequences. The proposed algorithm is a clustering divergence (CDMS)-based multiple sequence alignment and takes a top-down approach. The algorithm builds the tree topology for merging, based on the idea that the two sequences with the longest distance between them should be split into two clusters. We expect our sequence alignment algorithm to improve alignment quality and to run faster than the traditional algorithm Clustal-W.

SVM based Clustering Technique for Processing High Dimensional Data (고차원 데이터 처리를 위한 SVM기반의 클러스터링 기법)

  • Kim, Man-Sun;Lee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.14 no.7
    • /
    • pp.816-820
    • /
    • 2004
  • Clustering is the process of dividing the similar data objects in a data set into clusters and extracting meaningful information from the data. The main issues in clustering are the effective clustering of high-dimensional data and optimization. This study proposes a method of measuring similarity based on SVM and a new, efficient method of calculating the number of clusters. The high-dimensional data are mapped to a feature space using kernel functions, and the similarity between neighboring clusters is then measured. For the created clusters, the desired number of clusters can be obtained from the measured similarity value and the value of Δd. To verify the proposed methods, the authors used six data sets from the UCI Machine Learning Repository and obtained the expected number of clusters as well as improved cohesiveness compared with the results of previous research.
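
One generic way to compare two clusters after a kernel mapping, without ever computing the mapped vectors, is the squared distance between their feature-space centroids. The sketch below is not the paper's SVM-based similarity measure, which the abstract does not specify; the RBF kernel, the `gamma` value, and all function names are assumptions chosen for illustration.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared)

def mean_kernel(A, B, kernel):
    """Average kernel value over all pairs (a, b) with a in A and b in B."""
    return sum(kernel(a, b) for a in A for b in B) / (len(A) * len(B))

def feature_space_distance_sq(A, B, kernel=rbf_kernel):
    """Squared distance between the feature-space centroids of A and B.

    Uses the kernel trick:
    ||mu_A - mu_B||^2 = mean K(A, A) - 2 mean K(A, B) + mean K(B, B).
    """
    return (mean_kernel(A, A, kernel)
            - 2.0 * mean_kernel(A, B, kernel)
            + mean_kernel(B, B, kernel))

# Made-up clusters: B lies close to A, C lies far away.
A = [(0.0, 0.0), (0.0, 1.0)]
B = [(0.5, 0.5), (0.5, 1.5)]
C = [(10.0, 10.0), (10.0, 11.0)]
near = feature_space_distance_sq(A, B)
far = feature_space_distance_sq(A, C)
print(f"A-B: {near:.3f}  A-C: {far:.3f}")
```

A small distance marks two neighboring clusters as candidates for merging, which is one plausible way a similarity value could feed into choosing the number of clusters.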

Scalable Search based on Fuzzy Clustering for Interest-based P2P Networks

  • Mateo, Romeo Mark A.;Lee, Jae-Wan
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.5 no.1
    • /
    • pp.157-176
    • /
    • 2011
  • An interest-based P2P network builds peer connections based on similarities so that resources can be searched efficiently. A clustering technique that uses peer similarities as data is an effective way to group the most relevant peers. However, the separation of the groups produced by clustering lowers the scalability of a P2P network. Moreover, the interest-based approach is concerned only with user-level grouping and does not consider topology awareness of the physical network. This paper proposes an efficient scalable search for interest-based P2P systems. A scalable multi-ring (SMR) based on fuzzy clustering handles the grouping of relevant peers, and the proposed scalable search uses the SMR to scale peer queries. In forming the multi-ring, a minimized route function determines the shortest route for connecting peers on the physical network. Performance evaluation showed that the SMR grouped peers accurately and improved the connectivity rate of the P2P network. The proposed scalable search also found more replicated files throughout the peer network than traditional P2P approaches.

An Incremental Clustering Technique of XML Documents using Cluster Histograms (클러스터의 히스토그램을 이용한 XML 문서의 점진적 클러스터링 기법)

  • Hwang, Jeong-Hee
    • Journal of KIISE:Databases
    • /
    • v.34 no.3
    • /
    • pp.261-269
    • /
    • 2007
  • As basic research toward integrating and retrieving XML documents efficiently, this paper proposes a method that clusters XML documents by their structure. We apply an algorithm designed for processing large volumes of transaction data to the clustering of XML documents, which differs considerably from previous algorithms that measure structural similarity. Our method clusters XML documents using cluster histograms, which represent the distribution of items in clusters, while also considering the global cluster cohesion. We compare the proposed method with existing techniques experimentally. The experiments show that our method not only creates good-quality clusters but also improves processing time.

High-Dimensional Clustering Technique using Incremental Projection (점진적 프로젝션을 이용한 고차원 클러스터링 기법)

  • Lee, Hye-Myung;Park, Young-Bae
    • Journal of KIISE:Databases
    • /
    • v.28 no.4
    • /
    • pp.568-576
    • /
    • 2001
  • Most clustering algorithms tend to degenerate rapidly in high-dimensional spaces. Moreover, high-dimensional data often contain a significant amount of noise, which further reduces the effectiveness of these algorithms. It is therefore necessary to develop algorithms adapted to the structure and characteristics of high-dimensional data. In this paper, we propose a clustering algorithm, CLIP, that uses projection. CLIP is designed to overcome the efficiency and/or effectiveness problems of high-dimensional clustering. It is based on clustering in each one-dimensional subspace, but it uses incremental projection to recover high-dimensional clusters and, at the same time, to reduce the computational cost significantly. To evaluate the performance of CLIP, we demonstrate its efficiency and effectiveness through a series of experiments on synthetic data sets.

Clustering Approach for Topology Control in Multi-Radio Wireless Mesh Networks (Multi-Radio 무선 메쉬 네트워크에서의 토폴로지 제어를 위한 클러스터링 기법)

  • Que, Ma. Victoria;Hwang, Won-Joo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.11 no.9
    • /
    • pp.1679-1686
    • /
    • 2007
  • Clustering is a topology control approach often used in wireless ad hoc networks to improve scalability and prolong network lifetime. It is also employed to provide semi-management functionality and capacity enhancement. The clustering topology control technique can also be applied to multi-radio wireless mesh networks, where it can exploit the advantages of the multi-radio implementation. The aggregation would result in a more stable, connected, scalable, and energy-efficient network. In this paper, we design a clustering algorithm for multi-radio wireless mesh networks that uses these advantages and takes into account both the mobility and the heterogeneity of the network entities. We also show that the algorithm terminates at a definite time t and that the control message overhead is of constant order, O(1), per node.