• Title/Summary/Keyword: 클리스터링

An Unsupervised Clustering Technique of XML Documents based on Function Transform and FFT (함수 변환과 FFT에 기반한 조정자가 없는 XML 문서 클러스터링 기법)

  • Lee, Ho-Suk
    • The KIPS Transactions: Part D, v.14D no.2, pp.169-180, 2007
  • This paper discusses a new unsupervised XML document clustering technique based on a function transform and the FFT (Fast Fourier Transform). An XML document is transformed into a discrete function based on the hierarchical nesting structure of its elements. The discrete function is then transformed into a vector using the FFT. The vectors of two documents are compared using a weighted Euclidean distance metric. If the distance is below a prespecified threshold, the two documents are considered structurally similar and are grouped into the same cluster. XML clustering can be useful for storing and searching XML documents. The experiments were conducted with 800 synthetic documents and 520 real documents. They showed that the function transform and FFT are effective for the incremental, unsupervised clustering of structurally similar XML documents.
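
The pipeline in this abstract (document to discrete function, function to frequency-domain vector, weighted Euclidean distance, threshold) can be illustrated with a minimal sketch. The abstract does not give the actual function transform, weights, or threshold, so several parts here are assumptions: the signal is simply the nesting depth of each element in document order, the weights are uniform, the threshold is arbitrary, each cluster is represented by its first member, and a naive DFT stands in for the FFT (it yields the same values, only more slowly).

```python
import cmath
import math
import xml.etree.ElementTree as ET


def depth_signal(xml_text, length=16):
    """Encode a document as the nesting depth of each element in document order.

    This encoding is an assumption; the paper's own transform is not given.
    """
    signal = []

    def walk(node, depth):
        signal.append(float(depth))
        for child in node:
            walk(child, depth + 1)

    walk(ET.fromstring(xml_text), 1)
    # Pad with zeros or truncate so every document yields the same length.
    return (signal + [0.0] * length)[:length]


def dft_magnitudes(signal):
    """Naive O(n^2) discrete Fourier transform magnitudes (an FFT gives the same values)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]


def weighted_distance(u, v, weights):
    """Weighted Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))


def cluster_incrementally(vectors, weights, threshold):
    """Put each vector in the first cluster whose first member is closer than threshold."""
    clusters = []  # each cluster is a list of indices into vectors
    for i, vec in enumerate(vectors):
        for members in clusters:
            if weighted_distance(vec, vectors[members[0]], weights) < threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


docs = [
    "<a><b><c/></b><b><c/></b></a>",
    "<x><y><z/></y><y><z/></y></x>",  # same structure, different tag names
    "<r><s/><s/><s/><s/><s/></r>",    # flat structure
]
vectors = [dft_magnitudes(depth_signal(d)) for d in docs]
weights = [1.0] * len(vectors[0])     # uniform weights, purely illustrative
clusters = cluster_incrementally(vectors, weights, threshold=1.0)
print(clusters)  # [[0, 1], [2]]
```

Because only nesting depth enters the signal, the first two documents map to identical vectors despite their different tag names, while the flat document lands in its own cluster.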

An Efficient Clustering Algorithm Considering Node Density in Wireless Sensor Networks (무선 센서 네트워크에서 노드 밀도를 고려한 효율적인 클러스터링 알고리즘)

  • Kim, Chang-Hyeon;Kim, Kun-Woo;Lee, Won-Joo;Jeon, Chang-Ho
    • Proceedings of the Korean Society of Computer Information Conference, 2009.01a, pp.301-304, 2009
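  • Because a wireless sensor network consists of many sensor nodes that operate on limited energy, using energy efficiently is important. Existing cluster-based algorithms reduce energy consumption by grouping geographically adjacent nodes into clusters and aggregating the data received from member nodes before transmitting it. However, because they do not consider node density during clustering, they can produce clusters that gain nothing from data aggregation when nodes are unevenly distributed. This paper therefore proposes a new clustering algorithm that considers node density during clustering, maximizing the effect of data aggregation and reducing energy consumption.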
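
The abstract does not describe the proposed algorithm itself, so the following is only a hypothetical illustration of what "considering node density" could look like: density is taken to be the number of neighbors within a radio radius, and cluster heads are chosen greedily from the densest nodes. The function names, the radius, and the greedy rule are all assumptions, not the authors' method.

```python
import math


def neighbor_counts(positions, radius):
    """Density of a node = number of other nodes within `radius` (an assumed definition)."""
    return [
        sum(1 for j, q in enumerate(positions) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(positions)
    ]


def density_first_clusters(positions, radius):
    """Greedy sketch: the densest unassigned node becomes a head and absorbs nodes in range."""
    density = neighbor_counts(positions, radius)
    order = sorted(range(len(positions)), key=lambda i: (-density[i], i))
    head_of = {}  # node index -> index of its cluster head
    for head in order:
        if head in head_of:
            continue
        head_of[head] = head
        for j, q in enumerate(positions):
            if j not in head_of and math.dist(positions[head], q) <= radius:
                head_of[j] = head
    return density, head_of


# Four nodes packed in a unit square plus one isolated node.
positions = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
density, head_of = density_first_clusters(positions, radius=1.5)
print(density)  # [3, 3, 3, 3, 0]
print(head_of)  # {0: 0, 1: 0, 2: 0, 3: 0, 4: 4}
```

The isolated node ends up as a single-member cluster, which is exactly the kind of cluster that gains nothing from data aggregation.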

Creation of Frequent Patterns using Clustering in Large Database (대용량 데이터베이스에서 클러스터링을 이용한 빈발 패턴 생성)

  • Kim, Eui-Chan;Hwang, Byung-Yeon
    • Proceedings of the Korean Information Science Society Conference, 2005.11b, pp.100-102, 2005
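  • Data mining is the discovery of meaningful information in data stored in databases. Among the many data mining techniques, association rules have been studied extensively. Association rule techniques are themselves studied in various forms, and research on finding frequent patterns with the frequent pattern tree (FP-Tree) is particularly active. The FP-Tree performs better than Apriori, the well-known association rule generation technique. However, the FP-Tree also has several problems. This paper aims to reduce one of them, the excessive generation of FP-Trees. To outperform the existing FP-Tree by reducing the number of conditional FP-Trees built from conditional pattern bases, we apply appropriate clustering. The clustering technique used is a clustering method based on bit transactions.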
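
The abstract names bit transactions but not how they are built or clustered. As a hedged sketch, a bit transaction can be read as a bitmask with one bit per item, which makes support counting and a similarity measure for clustering cheap. The Jaccard similarity and the sample transactions below are assumptions for illustration only.

```python
def encode(transaction, item_index):
    """Turn a set of items into an integer bitmask, one bit per item."""
    mask = 0
    for item in transaction:
        mask |= 1 << item_index[item]
    return mask


def support(itemset_mask, bit_transactions):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in bit_transactions if (t & itemset_mask) == itemset_mask)


def jaccard(a, b):
    """Similarity of two bit transactions: shared items divided by all items (assumed measure)."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


item_index = {"a": 0, "b": 1, "c": 2, "d": 3}
transactions = [{"a", "b"}, {"a", "b", "c"}, {"c", "d"}]  # made-up sample data
bits = [encode(t, item_index) for t in transactions]

print(bits)                                             # [3, 7, 12]
print(support(encode({"a", "b"}, item_index), bits))    # 2
print(jaccard(bits[0], bits[2]))                        # 0.0
```

Transactions with high pairwise similarity could then be grouped before pattern mining, though the abstract does not say whether the paper clusters in this way.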

Scene Change Detection with 3-Step Process (3단계 과정의 장면 전환검출)

  • Yoon, Shin-Seong;Won, Rhee-Yang
    • Journal of the Korea Society of Computer and Information, v.13 no.6, pp.147-154, 2008
  • First, this paper computes the difference value between frames using a method that combines the $\chi^2$ histogram and the color histogram, with normalization. Next, the representative frame of each cluster is determined using distance-based clustering and k-means grouping. Finally, the representative frame of each group is determined using the likelihood ratio. Experiments show that the proposed method detects scene changes better than other methods, owing to its computation of difference values, its clustering and grouping, and its detection of representative frames.
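
The first step, a histogram-based difference value between consecutive frames, can be sketched as follows. This covers only a chi-square histogram difference; the combination with the color histogram and the normalization mentioned in the abstract are not specified there and are omitted. The particular chi-square variant, the threshold, and the four-bin histograms are assumptions for illustration.

```python
def chi_square_difference(h1, h2):
    """One common chi-square histogram difference: sum of (a - b)^2 / (a + b) over bins.

    Bins that are empty in both histograms are skipped to avoid dividing by zero.
    """
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def detect_cuts(histograms, threshold):
    """Return indices of frames whose difference from the previous frame exceeds threshold."""
    return [
        i
        for i in range(1, len(histograms))
        if chi_square_difference(histograms[i - 1], histograms[i]) > threshold
    ]


# Made-up 4-bin histograms: frames 0-1 look alike, frames 2-3 look alike.
frames = [
    [8, 2, 0, 0],
    [7, 3, 0, 0],
    [0, 1, 6, 3],
    [0, 1, 5, 4],
]
cuts = detect_cuts(frames, threshold=5.0)
print(cuts)  # [2]
```

Within a scene the difference stays well below 1 for this data, while the jump between frames 1 and 2 evaluates to 17, so any threshold between those values isolates the cut.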

A Study on the Choice of Fuzzy Rule Genetic Algorithm Using Similarity Check Method (유사성 체크 방법을 이용한 Fuzzy Rule선택 Genetic Algorithm에 관한 연구)

  • Kang, Jeon-Geun;Kim, Myeong-Soon
    • Proceedings of the Korea Information Processing Society Conference, 2017.11a, pp.731-734, 2017
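  • The GA (Genetic Algorithm) models the genetic encoding and processing behind survival of the fittest in natural evolution. It is widely used to optimize problems that are hard to handle analytically, and it is also applied to rule selection in fuzzy control. This paper introduces a data similarity check into the general GA method, applies it to fuzzy rule selection, and verifies it through simulation. The simulation results show that the proposed SFRGA (Similarity Fuzzy Rule Genetic Algorithm) method achieves a shorter delay time than the general GA method and, as side benefits, reduces premature convergence and opens the possibility of automatically assigned fuzzy clustering.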

Efficient Schemes for Scaling Ring Bandwidth in Ring-based Multiprocessor System (링 구조 다중프로세서 시스템에서 링 대역폭 확장을 위한 효율적인 방안)

  • Jang, Byoung-Soon;Chung, Sung-Woo;Jhang, Seong-Tae;Jhon, Chu-Shik
    • Journal of KIISE: Computer Systems and Theory, v.27 no.2, pp.177-187, 2000
  • In the past several years, many systems have adopted a ring topology with high-speed unidirectional point-to-point links to overcome the limits of the bus as the interconnection network of clustered multiprocessor systems. However, the rapid increase in processor speed and the improved performance of local buses and memory systems limit the scalability of systems that use point-to-point links of standard bandwidth, so the need to extend bandwidth has grown. In this paper, we adopt the PANDA system, a clustering-based multiprocessor system, as the base model. By simulating a model that adopts commercial processor and local bus specifications, we show that the point-to-point link is the bottleneck of system performance and that bandwidth expansion by more than 200% is needed. Developing a new point-to-point link with doubled bandwidth requires excessive design cost and time. As an alternative, we propose several ways to implement a dual ring (simple dual ring, transaction-separated dual ring, and direction-separated dual ring) using off-the-shelf point-to-point links with IEEE standard bandwidth. We analyze the pros and cons of each model by simulation, compared with a doubled-bandwidth single ring.

Patent Search System Using IPC Clustering (국제특허분류 클러스터링을 이용한 특허 검색 시스템)

  • Kim, Han-Gi;Lee, Seok-Hyoung;Yoon, Hwa-Mook
    • Proceedings of the Korea Contents Association Conference, 2007.11a, pp.103-106, 2007
  • As intellectual property rights grow in importance, the number of people using patent search is increasing. Given that general users typically enter only one or two search terms, finding the desired results in a massive collection of patent documents is not easy. We therefore present a patent search system based on IPC clustering, which helps users narrow the search results using the International Patent Classification (IPC) codes that accompany all patent documents. With this system, general users can find patent search results more effectively.
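
One simple reading of "IPC clustering" is grouping search hits by a prefix of their IPC code, so a user can narrow a broad result list to one technical area. The sketch below assumes that reading; the function name, the choice of the four-character subclass level, and the sample hits are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def group_by_ipc(hits, level=4):
    """Group (document id, IPC code) pairs by the first `level` characters of the code.

    With level=4 the key is the IPC subclass: section letter, two-digit class, subclass letter.
    """
    groups = defaultdict(list)
    for doc_id, ipc_code in hits:
        groups[ipc_code[:level]].append(doc_id)
    return dict(groups)


# Made-up search hits for illustration.
hits = [
    ("P1", "G06F 17/30"),
    ("P2", "G06F 16/35"),
    ("P3", "H04L 29/06"),
]
groups = group_by_ipc(hits)
print(groups)  # {'G06F': ['P1', 'P2'], 'H04L': ['P3']}
```

Passing `level=1` or `level=3` would group at the section or class level instead, giving coarser clusters.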

Generation of Decision Rules Based on Concept Ascension and Optimal Reduction of Attributes (개념 상승과 속성의 최적 감축에 의한 결정 규칙의 생성)

  • 정환묵
    • Journal of the Korean Institute of Intelligent Systems, v.9 no.4, pp.367-374, 1999
  • This paper suggests an integrated method based on concept ascension and attribute reduction for the efficient induction of decision rules from a large database. We study an automatic scheme for generating concept trees with a clustering technique, a method for generalizing databases with the concept ascension technique, an optimal attribute reduction method that uses the significance of attributes, and an efficient way to reduce attribute values by applying the discernibility matrix and discernibility functions. The method can be used for decision-making tasks such as investment planning or price evaluation, for constructing knowledge bases for defect or medical diagnosis, for analyzing marketing or experimental data, for information retrieval with high-level queries, and so on.
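
The discernibility matrix mentioned here is a standard rough-set construct: for each pair of objects with different decisions, it records the condition attributes on which they differ. The sketch below builds it for a tiny made-up decision table and extracts the core, that is, the attributes that appear alone in some entry and therefore cannot be dropped. It illustrates the general construct only, not the paper's full reduction procedure.

```python
from itertools import combinations


def discernibility_matrix(rows, condition_attrs, decision_attr):
    """Map each pair of rows with different decisions to the attributes that tell them apart."""
    matrix = {}
    for (i, r), (j, s) in combinations(enumerate(rows), 2):
        if r[decision_attr] != s[decision_attr]:
            matrix[(i, j)] = {a for a in condition_attrs if r[a] != s[a]}
    return matrix


def core_attributes(matrix):
    """Attributes that are the only difference for some pair belong to every reduct."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Made-up decision table: the decision depends on size only.
rows = [
    {"size": "small", "color": "red", "buy": "no"},
    {"size": "large", "color": "red", "buy": "yes"},
    {"size": "large", "color": "blue", "buy": "yes"},
    {"size": "small", "color": "blue", "buy": "no"},
]
matrix = discernibility_matrix(rows, ["size", "color"], "buy")
core = core_attributes(matrix)
print(matrix[(0, 1)])  # {'size'}
print(core)            # {'size'}
```

Here `color` never appears alone in an entry and every entry already contains `size`, so `color` can be removed without losing the ability to separate the two decisions.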

Partially Evaluated Genetic Algorithm based on Fuzzy Clustering (퍼지 클러스터링 기반의 국소평가 유전자 알고리즘)

  • Yoo Si-Ho;Cho Sung-Bae
    • Journal of KIISE: Software and Applications, v.31 no.9, pp.1246-1257, 2004
  • To find an optimal solution with a genetic algorithm, it is desirable to keep the population size as large as possible. In some cases, however, the cost of evaluating each individual is relatively high, and maintaining a large population is difficult. To solve this problem, we propose a novel genetic algorithm based on fuzzy clustering, which considerably reduces the number of evaluations without significant loss of performance by evaluating only one representative for each cluster. The fitness values of the other individuals are estimated indirectly from the representatives' fitness values. We use the fuzzy c-means algorithm and distribute fitness using the membership matrix, since hard clustering methods cannot easily distribute precise fitness values to individuals that belong to multiple groups. Nine benchmark functions were investigated, and the results are compared with six hard clustering algorithms that use Euclidean distance and Pearson correlation coefficients as the fitness distribution method.
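
The idea of evaluating one representative per cluster and spreading its fitness through the membership matrix can be sketched as below. The membership formula is the standard fuzzy c-means one for fixed centers; the membership-weighted average used to estimate fitness is an assumption, since the abstract does not give the exact distribution rule. Individuals are one-dimensional and all numbers are made up for illustration.

```python
def memberships(points, centers, m=2.0):
    """Fuzzy c-means membership of each point in each cluster, for fixed centers.

    u_ic = 1 / sum_k (d_ic / d_ik)^(2 / (m - 1)), where d is the distance to a center.
    """
    power = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # A point lying on a center belongs fully to that center's cluster.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [u / total for u in row]
        else:
            row = [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
        rows.append(row)
    return rows


def distribute_fitness(membership_rows, representative_fitness):
    """Estimate each individual's fitness as a membership-weighted average (assumed rule)."""
    return [
        sum(u * f for u, f in zip(row, representative_fitness))
        for row in membership_rows
    ]


population = [0.0, 1.0, 3.0, 4.0]   # one-dimensional individuals
centers = [0.0, 4.0]                # cluster centers, also used as representatives
representative_fitness = [10.0, 2.0]  # only these two individuals are actually evaluated

u = memberships(population, centers)
estimated = distribute_fitness(u, representative_fitness)
print([round(f, 2) for f in estimated])  # [10.0, 9.2, 2.8, 2.0]
```

Only two fitness evaluations are spent on a population of four; the individuals between the centers receive blended estimates according to how strongly they belong to each cluster.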