• Title/Summary/Keyword: hierarchical structure clustering

A Multidimensional Nested-Attribute Indexing for Queries on Nested Objects (중포된 객체에 대한 질의처리를 위한 다차원 중포 속성 색인기법)

  • 이종학;대구효
    • Proceedings of the Korean Information Science Society Conference / 1999.10a / pp.352-354 / 1999
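  • This paper proposes a multidimensional nested-attribute indexing technique that efficiently supports query processing on nested objects in object-oriented databases. Existing indexing techniques for nested objects use one-dimensional index structures, so they cannot efficiently process the various kinds of queries that involve both the attributes of nested objects and class hierarchies. The multidimensional nested-attribute index uses a multidimensional file structure: in addition to the key-value domain of the nested attribute, it assigns a class-identifier domain to each class hierarchy in which an attribute on the path expressing the nested attribute is defined, so that index entries are clustered in the multidimensional domain space. As a result, the technique efficiently supports queries that existing indexing techniques handle poorly, namely queries whose target range is restricted to arbitrary classes in a class hierarchy, and queries whose complex attributes have domains restricted to arbitrary classes in a class hierarchy.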

A Study on Weighted Hierarchical Color Clustering Using Color Distribution (컬러 분포를 가중치로 이용한 컬러 클러스터링에 관한 연구)

  • 윤위영;범수균;탁우현;이종환;김경석
    • Proceedings of the Korean Information Science Society Conference / 1998.10b / pp.250-252 / 1998
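  • Color histograms are widely used to represent color features in content-based image retrieval. However, because histograms are high-dimensional, efficient retrieval with an index structure is difficult and the similarity computation is expensive. To address this, color clustering methods have been proposed that reduce the dimensionality of the histogram while minimizing the loss of an image's color information. This paper proposes a color clustering method that exploits the color-distribution characteristics of the image data in a given image-retrieval application. We describe a hierarchical color clustering method that uses the color distribution as weights, and apply it to image retrieval with a two-stage color histogram to test how well it preserves color information.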

Comparison of Software Clustering using Split Based Tree Analysis (분기점 기반 트리 분석을 통한 소프트웨어 클러스터링 결과 비교)

  • Um, Jaechul;Lee, Chan-gun
    • Journal of Software Engineering Society / v.25 no.3 / pp.59-62 / 2012
  • We propose a novel metric for quantitatively comparing the clustering results produced by different software clustering algorithms. Quantitative evaluation of software clustering helps in understanding architectural changes in software. The concept of a split, which has been used to analyze genetic characters in bioinformatics, is applied to the analysis of software architecture.

Clustering Scheme using Memory Restriction for Wireless Sensor Network (무선센서네트워크에서 메모리 속성을 이용한 클러스터링 기법)

  • Choi, Hae-Won;Yoo, Kee-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.1B / pp.10-15 / 2009
  • Wireless sensor networks are increasingly regarded as an important technology for the future IT industry, and their application areas are growing. Among protocols for wireless sensor networks, those based on a hierarchical network topology are regarded as energy efficient. LEACH is the best-known routing protocol for the hierarchical topology. However, LEACH-related protocols share a problem with the message broadcasting range, which has to be expanded to cover the whole network. This paper therefore proposes a new clustering scheme to solve this shared problem. The basic idea of the scheme is to use the inherent memory restrictions of sensor nodes. The results show that the proposed scheme supports load balancing by forming clusters with a reasonable number of member nodes, and thereby extends the network lifetime to about 1.8 times longer than LEACH.

Web Document Clustering Using Statistical Techniques & Tag Information on the Specific-Domain Web site (전문 웹 사이트에서의 통계적 기법과 태그 정보를 이용한 문서 분류)

  • 조은휘;변영태
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.297-302 / 2002
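  • We are developing an information agent for a service that provides users with information related to a specific domain. The system collects documents from the web on the basis of a knowledge base for that domain, and much of the relevant information turns out to be contained in a few specialized sites. Collecting the relevant documents within these specialized sites is therefore important. In this paper, we cluster documents on the basis of inter-document similarity in order to collect the specialized documents within these sites. Specifically, we take the terms in the documents, the HTML tags, and the WordNet hierarchy of the knowledge base as data, and use SVD (Singular Value Decomposition) to reveal the relationships among documents.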

Realignment of Clients in Client-server Database System (클라이언트-서버 데이터베이스에서 의 온라인 클라이언트 재배치)

  • Park, Young-B.;Park, J.
    • The KIPS Transactions:PartD / v.10D no.4 / pp.639-646 / 2003
  • Conventional two-tier databases show performance limitations in the presence of many concurrent clients. To address this, a three-tier architecture that exploits similarities in clients' object access behavior has been proposed. In this system, clients are partitioned into clusters, and object requests can then be served in an inter-cluster manner. This is enabled by introducing an intermediate layer between the server(s) and the clients. In this paper, we introduce the problem of client realignment, which arises when access behavior changes, and propose on-line client clustering. This system facilitates adaptive reconfiguration and redistribution of sites. The core issue in this paper is to demonstrate the effectiveness of on-line client clustering. We experimentally investigate the performance of the scheme and the costs it requires.

An Efficient Algorithm for Clustering XML Schemas (XML 스키마 클러스터링을 위한 효율적인 알고리즘)

  • Rhim Tae-Woo;Lee Kyong-Ho
    • Journal of Korea Multimedia Society / v.8 no.7 / pp.857-868 / 2005
  • Schema clustering is important as a prerequisite to the integration of XML schemas. This paper presents an efficient method for clustering XML schemas. The proposed method first computes similarities among schemas. The similarity is defined by the size of the common structure between two schemas, under the assumption that schemas that cost less to integrate are more similar. Specifically, we extract one-to-one matchings between paths with the largest number of corresponding elements. Finally, a hierarchical clustering method is applied to the similarity values. Experimental results with many XML schemas show that the method performs better than previous works, achieving a precision of $99\%$ and a clustering rate of $93\%$ on average.
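
The final step of this abstract, hierarchical clustering over pairwise similarity values, might be sketched as follows. This is a minimal illustration and not the authors' algorithm: the average-linkage rule, the `threshold` stopping criterion, the function name `agglomerate`, and the toy similarity matrix `SIM` are all assumptions introduced here.

```python
from itertools import combinations


def agglomerate(sim, threshold):
    """Average-linkage agglomerative clustering over a similarity matrix.

    sim is a symmetric list of lists of similarities. Merging stops once
    no pair of clusters has average similarity >= threshold. Both the
    linkage rule and the threshold are illustrative assumptions.
    """
    clusters = [[i] for i in range(len(sim))]

    def linkage(a, b):
        # Mean similarity over all cross-cluster item pairs.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Pick the most similar pair of current clusters.
        a, b = max(combinations(clusters, 2), key=lambda p: linkage(*p))
        if linkage(a, b) < threshold:
            break
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Hypothetical similarities among four schemas (not data from the paper).
SIM = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
```

Calling `agglomerate(SIM, 0.5)` groups schemas 0 and 1 together and schemas 2 and 3 together, because the average similarity between those two groups falls below the threshold.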

The Clustering Scheme for Load-Balancing in Mobile Ad-hoc Network (이동 애드혹 네트워크에서 로드 밸런싱을 위한 클러스터링 기법)

  • Lim, Won-Taek;Kim, Gu-Su;Kim, Moon-Jeong;Eom, Young-Ik
    • The KIPS Transactions:PartC / v.13C no.6 s.109 / pp.757-766 / 2006
  • A Mobile Ad-hoc Network (MANET) is an autonomous network consisting of mobile hosts. A considerable number of studies have been conducted on MANETs alongside studies of ubiquitous computing. Several studies have addressed clustering schemes that manage the network hierarchically to improve on the flat architecture of MANETs. However, the conventional schemes lack support for multi-hop clustering and load balancing. This paper proposes a clustering scheme that supports multi-hop clustering and considers load balancing between cluster heads. We define the split of clusters and the states of a cluster, and propose join, merge, divide, and cluster-head election schemes for load balancing between cluster heads.

A Technique of Cluster Detection to Self-Organized Network (자율 군집 네트워크에서 군집 탐지 기법)

  • Kim, Paul;Kim, Kyungdeok;Kim, Sangwook
    • Proceedings of the Korea Information Processing Society Conference / 2012.04a / pp.115-118 / 2012
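  • Analyzing clusters in various networks and discovering their structure is important for lowering the complexity of a network and for understanding and managing the overall system. In particular, in self-organized networks, where devices capable of basic computing communicate with one another autonomously to form clusters, detecting clusters accurately is an important technology for realizing collective-behavior services. This study therefore proposes a cluster detection technique for self-organized networks. To discover and identify clusters, the proposed technique performs hierarchical clustering on pairs of edges that share a node in the network, computing and comparing edge similarities between levels. The edges produced by hierarchical clustering can be represented as a tree, and the nodes are clustered using the optimal partition density and then separated into the final clusters.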
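
The abstract does not state how the similarity of two edges sharing a node is computed. One common choice in edge-based (link) clustering is the Jaccard index of the inclusive neighborhoods of the two non-shared endpoints. The sketch below assumes that choice; the function name `edge_similarity` and the toy graph `ADJ` are hypothetical.

```python
def edge_similarity(adj, i, j, k):
    """Jaccard similarity of edges (i, k) and (j, k), which share node k.

    adj maps each node to the set of its neighbours. The inclusive
    neighbourhood of a node is the node itself plus its neighbours.
    This similarity measure is an assumption, not taken from the paper.
    """
    if k not in adj[i] or k not in adj[j]:
        raise ValueError("edges (i, k) and (j, k) must both exist")
    n_i = adj[i] | {i}
    n_j = adj[j] | {j}
    return len(n_i & n_j) / len(n_i | n_j)


# Hypothetical 5-node graph: a triangle a-b-c plus a tail c-d-e.
ADJ = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
```

In this toy graph, the edges a-c and b-c lie in the same triangle and get the maximum similarity, while a-c and d-c share only node c and score low, so a hierarchical clustering of edges would merge the former pair first.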

Hierarchical Overlapping Clustering to Detect Complex Concepts (중복을 허용한 계층적 클러스터링에 의한 복합 개념 탐지 방법)

  • Hong, Su-Jeong;Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.111-125 / 2011
  • Clustering is a process of grouping similar or relevant documents into a cluster and assigning a meaningful concept to the cluster. By this process, clustering facilitates fast and correct search for the relevant documents by narrowing down the range of searching only to the collection of documents belonging to related clusters. For effective clustering, techniques are required for identifying similar documents and grouping them into a cluster, and discovering a concept that is most relevant to the cluster. One of the problems often appearing in this context is the detection of a complex concept that overlaps with several simple concepts at the same hierarchical level. Previous clustering methods were unable to identify and represent a complex concept that belongs to several different clusters at the same level in the concept hierarchy, and also could not validate the semantic hierarchical relationship between a complex concept and each of simple concepts. In order to solve these problems, this paper proposes a new clustering method that identifies and represents complex concepts efficiently. We developed the Hierarchical Overlapping Clustering (HOC) algorithm that modified the traditional Agglomerative Hierarchical Clustering algorithm to allow overlapped clusters at the same level in the concept hierarchy. The HOC algorithm represents the clustering result not by a tree but by a lattice to detect complex concepts. We developed a system that employs the HOC algorithm to carry out the goal of complex concept detection. This system operates in three phases; 1) the preprocessing of documents, 2) the clustering using the HOC algorithm, and 3) the validation of semantic hierarchical relationships among the concepts in the lattice obtained as a result of clustering. The preprocessing phase represents the documents as x-y coordinate values in a 2-dimensional space by considering the weights of terms appearing in the documents. 
First, the documents are refined by applying stopword removal and stemming to extract index terms. Then, each index term is assigned a TF-IDF weight value, and the x-y coordinate value for each document is determined by combining the TF-IDF values of the terms in it. The clustering phase uses the HOC algorithm, in which the similarity between documents is calculated using the Euclidean distance. Initially, a cluster is generated for each document by grouping those documents that are closest to it. Then, the distance between any two clusters is measured, and the closest clusters are grouped as a new cluster. This process is repeated until the root cluster is generated. In the validation phase, a feature selection method is applied to validate the appropriateness of the cluster concepts built by the HOC algorithm, to see whether they have meaningful hierarchical relationships. Feature selection is a method of extracting key features from a document by identifying important and representative terms in the document and assigning weight values to them. To select key features correctly, a method is needed to determine how each term contributes to the class of the document. Among several methods for achieving this goal, this paper adopts the $\chi^2$ statistic, which measures the degree of dependency of a term t on a class c and represents the relationship between t and c as a numerical value. To demonstrate the effectiveness of the HOC algorithm, a series of performance evaluations was carried out using the well-known Reuters-21578 news collection. The results showed that the HOC algorithm contributes greatly to detecting and producing complex concepts by generating the concept hierarchy in a lattice structure.
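
The abstract names the $\chi^2$ statistic for a term t and a class c but does not give its formula. A minimal sketch, assuming the standard 2x2 contingency form used in text categorization, $\chi^2(t,c) = \frac{N(AD - CB)^2}{(A+C)(B+D)(A+B)(C+D)}$, is shown below. The function name `chi_square` and its argument names are introduced here for illustration.

```python
def chi_square(a, b, c, d):
    """Chi-square dependency between a term t and a class c.

    a: documents in class c that contain t
    b: documents outside class c that contain t
    c: documents in class c that do not contain t
    d: documents outside class c that do not contain t
    Assumes the standard 2x2 contingency formulation; the paper's exact
    variant is not given in the abstract.
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): report no dependency.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom
```

A term that appears in every document of the class and in no other document yields a large value, whereas a term spread evenly across classes yields zero, which is what makes the score usable for ranking candidate key features.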