• Title/Abstract/Keywords: Graph Clustering


CLUSTERING DNA MICROARRAY DATA BY STOCHASTIC ALGORITHM

  • Shon, Ho-Sun;Kim, Sun-Shin;Wang, Ling;Ryu, Keun-Ho
    • The Korean Society of Remote Sensing: Conference Proceedings
    • /
    • The Korean Society of Remote Sensing, Proceedings of ISRS 2007
    • /
    • pp.438-441
    • /
    • 2007
  • Advances in molecular biology and engineering have made it possible to use DNA microarrays to observe thousands of genes and their variation in tissue samples from living organisms. With DNA microarrays, genes with similar expression patterns can be grouped, and the progression and variation of genes can be tracked. This paper applies cluster analysis to gene expression data in order to discover biological subgroups or classes. The aim is to predict new, previously unknown classes; publicly available leukaemia data are used for the experiment, and the MCL (Markov CLustering) algorithm is applied as the analysis method. The MCL algorithm is based on probability and graph flow theory: it simulates random walks on a graph, using Markov matrices to determine the transition probabilities among the nodes. In detail, distances are first computed with the Euclidean distance and MCL is applied to them; next, the inflation and diagonal factors, which are the tuning parameters, are adjusted; finally, a threshold given by the average of each column is used to distinguish one class from another. Using this threshold, namely the average of each column, improved the accuracy of our method. Our experiments show an average accuracy of about 70% relative to the previously known classes. For comparison with other algorithms, the proposed method was evaluated against the SOM (Self-Organizing Map) clustering algorithm, a neural network method, and against hierarchical clustering. The method gives better results than hierarchical clustering. Future work should examine whether similar results are obtained when the inflation parameter found in our experiment is applied to other gene expression data. We are also working on a systematic method to improve accuracy by regulating the factors mentioned above.
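
The expansion and inflation loop at the core of MCL can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the input is already a non-negative similarity matrix (the abstract starts from Euclidean distances, and the conversion to similarities is not specified), the `self_loop` argument stands in for the "diagonal factor", and clusters are read off by assigning each node to its strongest attractor row rather than by the column-average threshold described in the abstract. All function and parameter names are hypothetical.

```python
def normalize_columns(m):
    """Scale every column to sum to 1 so the matrix is column-stochastic."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix, i.e. take two random-walk steps."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation: raise entries to a power, then renormalize the columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def mcl(similarity, inflation=2.0, self_loop=1.0, iterations=50, tol=1e-9):
    """Return clusters (sorted lists of node indices) for a similarity matrix."""
    n = len(similarity)
    # Add the diagonal term (assumed meaning of the "diagonal factor").
    m = [[similarity[i][j] + (self_loop if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        nxt = inflate(expand(m), inflation)
        change = max(abs(nxt[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = nxt
        if change < tol:
            break
    # Simplification: assign each node (column) to its largest attractor row.
    clusters = {}
    for j in range(n):
        attractor = max(range(n), key=lambda i: m[i][j])
        clusters.setdefault(attractor, []).append(j)
    return sorted(clusters.values())
```

Larger `inflation` values tend to produce finer clusters, which is why the abstract treats it as a parameter to tune per data set.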


Gated Multi-channel Network Embedding for Large-scale Mobile App Clustering

  • Yeo-Chan Yoon;Soo Kyun Kim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 6
    • /
    • pp.1620-1634
    • /
    • 2023
  • This paper studies the task of embedding nodes using multiple graphs that represent multiple information channels, which is useful in many network clustering tasks. By learning a node from multiple graphs, various characteristics of the node can be represented and embedded stably. Existing studies on multi-channel networks either integrate heterogeneous graphs or constrain common nodes that appear in multiple graphs to have similar embeddings. Although these methods represent nodes effectively, they are limited by the assumption that all networks provide the same amount of information. This paper proposes a method to overcome this limitation: when embedding a node, the method assigns different weights according to the source graph, so that the characteristics of graphs carrying more important information are reflected more strongly in the node. To this end, a novel method incorporating a multi-channel gate layer is proposed, which weights the more important channels and ignores unnecessary data when embedding a node with multiple graphs. Empirical experiments demonstrate the effectiveness of the proposed multi-channel embedding method.
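
One way to picture a gate that weights channels differently is a softmax over per-channel scores, followed by a weighted sum of the channel embeddings. The sketch below is only an illustration of that general idea; the abstract does not describe the actual gate layer, so the scoring rule (a dot product with a hypothetical `gate_vector`, which would be learned in practice) and all names are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(channel_embeddings, gate_vector):
    """Fuse one node's per-channel embeddings into a single vector.

    channel_embeddings: list of equal-length vectors, one per source graph.
    gate_vector: hypothetical gate parameters (fixed here, learned in practice).
    Returns (fused_embedding, channel_weights).
    """
    # Score each channel, then turn the scores into weights that sum to 1.
    scores = [sum(g * x for g, x in zip(gate_vector, emb))
              for emb in channel_embeddings]
    weights = softmax(scores)
    dim = len(channel_embeddings[0])
    fused = [sum(w * emb[d] for w, emb in zip(weights, channel_embeddings))
             for d in range(dim)]
    return fused, weights
```

A channel whose score is much lower than the others receives a weight close to zero, which corresponds to the abstract's notion of ignoring unnecessary data.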

Comparisons of MMR, Clustering and Perfect Link Graph Summarization Methods

  • 유준현;변동률;박순철
    • The Institute of Electronics Engineers of Korea: Conference Proceedings
    • /
    • The Institute of Electronics Engineers of Korea, Proceedings of the 2003 Summer Conference III
    • /
    • pp.1319-1322
    • /
    • 2003
  • We present a web document summarizer for a search engine that is simpler and more condensed than existing ones. The summarizer generates summaries with a statistics-based summarization method that uses Clustering or the MMR technique to reduce redundancy in the results, and it also generates summaries using the Perfect Link Graph. We compare the results with summaries generated by human subjects, using FScore for the comparison. Our experimental results verify the accuracy of the summarization methods.
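
For readers unfamiliar with MMR (maximal marginal relevance), the sketch below shows the standard greedy selection rule: each step picks the sentence that is most relevant to the query while being least similar to the sentences already chosen. It is a generic illustration, not the summarizer from the paper; the bag-of-words cosine similarity, whitespace tokenization, and the `lam` trade-off value are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_summary(sentences, query, k=2, lam=0.7):
    """Greedily select up to k sentences by maximal marginal relevance."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    query_vec = Counter(query.lower().split())
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        # Relevance to the query minus redundancy with already chosen sentences.
        redundancy = max((cosine(vectors[i], vectors[j]) for j in selected),
                         default=0.0)
        return lam * cosine(vectors[i], query_vec) - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]
```

With `lam` close to 1 the selection favors relevance alone; lowering it penalizes near-duplicate sentences more heavily.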


Traffic Speed Prediction Based on Graph Neural Networks for Intelligent Transportation System

  • 김성훈;박종혁;최예림
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • Vol. 20, No. 1
    • /
    • pp.70-85
    • /
    • 2021
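  • Deep learning methods, an active area of recent research, have rapidly improved the performance of artificial intelligence, and systems using deep learning are accordingly being proposed in a variety of industries. In transportation systems, spatio-temporal graph modeling with GNNs has been shown to be effective for traffic speed prediction, but it has the drawback of causing a memory bottleneck, so the model is trained inefficiently. This study therefore aims to alleviate the memory bottleneck while achieving strong performance by splitting the road network with a graph partitioning method. To validate the proposed method, the similarity of road speed distributions was measured using the Jensen-Shannon divergence, based on an analysis of Incheon UTIC data. Spectral clustering was then performed on the measured similarities to cluster the road network. In the performance evaluation, when the road network was partitioned into seven networks, the method achieved the best accuracy among the compared models, with an error of 5.52 km/h in terms of MAE, and the memory bottleneck was also alleviated.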
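
The similarity step can be illustrated with a small Jensen-Shannon divergence routine over speed histograms. This is a sketch under stated assumptions: each road is represented by a hypothetical histogram of speed counts with at least one observation, the divergence uses base-2 logarithms so that it lies in [0, 1], and `1 - JSD` is used as the similarity fed to spectral clustering (the abstract does not say how divergence is converted to similarity). The spectral clustering step itself is not shown.

```python
import math


def normalize(histogram):
    """Turn non-negative counts into a probability distribution."""
    total = sum(histogram)
    return [count / total for count in histogram]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; symmetric and bounded by 1."""
    mixture = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)


def similarity_matrix(speed_histograms):
    """Pairwise similarity (assumed here to be 1 - JSD) between roads."""
    dists = [normalize(h) for h in speed_histograms]
    return [[1.0 - js_divergence(a, b) for b in dists] for a in dists]
```

Roads whose speed distributions do not overlap at all get similarity 0, and roads with identical distributions get similarity 1.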

Spectral clustering based on the local similarity measure of shared neighbors

  • Cao, Zongqi;Chen, Hongjia;Wang, Xiang
    • ETRI Journal
    • /
    • Vol. 44, No. 5
    • /
    • pp.769-779
    • /
    • 2022
  • Spectral clustering has become a typical and efficient clustering method used in a variety of applications. The critical step of spectral clustering is the similarity measurement, which largely determines the performance of the spectral clustering method. In this paper, we propose a novel spectral clustering algorithm based on the local similarity measure of shared neighbors. This similarity measurement exploits the local density information between data points based on the weight of the shared neighbors in a directed k-nearest neighbor graph with only one parameter k, that is, the number of nearest neighbors. Numerical experiments on synthetic and real-world datasets demonstrate that our proposed algorithm outperforms other existing spectral clustering algorithms in terms of the clustering performance measured via the normalized mutual information, clustering accuracy, and F-measure. As an example, the proposed method can provide an improvement of 15.82% in the clustering performance for the Soybean dataset.
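
The idea of a shared-neighbor similarity on a directed k-nearest-neighbor graph can be sketched as below. This is a simplified stand-in, not the measure proposed in the paper: it counts shared neighbors and divides by k, whereas the paper weights the shared neighbors in a way the abstract does not spell out. Function names are hypothetical.

```python
import math


def knn_graph(points, k):
    """Directed k-NN graph: for each point, the set of its k nearest others."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def shared_neighbor_similarity(points, k):
    """Symmetric similarity: fraction of the k neighbors that two points share."""
    nbrs = knn_graph(points, k)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nbrs[i] & nbrs[j]) / k
            sim[i][j] = sim[j][i] = shared
    return sim
```

The resulting matrix could serve as the affinity input to a spectral clustering routine; as in the abstract, `k` is the only parameter.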

An Efficient Conceptual Clustering Scheme

  • 양기철
    • Journal of the Korea Entertainment Industry Association
    • /
    • Vol. 14, No. 4
    • /
    • pp.349-354
    • /
    • 2020
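  • This paper proposes CBC (Clustering scheme Based on Conceptual graphs), a new clustering scheme based on conceptual graphs that allows objects to be described freely and clustered efficiently. Conceptual clustering is a machine learning technique. In conceptual clustering, the similarity between objects is determined by their membership in concepts, unlike general clustering schemes, which determine similarity without considering the meaning or context of the objects. This paper introduces CBC, a new conceptual clustering scheme that performs efficient conceptual clustering by freely describing diverse objects as conceptual graphs.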

Social Network Analysis using Common Neighborhood Subgraph Density

  • 강윤섭;최승진
    • Journal of KIISE: Computing Practices and Letters
    • /
    • Vol. 16, No. 4
    • /
    • pp.432-436
    • /
    • 2010
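  • Discovering communities in networks, including social networks, requires clustering the nodes of the network into groups that are densely connected internally and sparsely connected to one another. For a clustering algorithm to perform well, the similarity measure on which the clustering is based must be well defined. This paper defines a similarity measure for community discovery in networks and compares a method that combines this similarity with the affinity propagation algorithm against existing methods.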

A Heuristic Task Allocation Scheme Based on Clustering

  • 김석일;전중남;김관유
    • The Transactions of the Korea Information Processing Society
    • /
    • Vol. 6, No. 10
    • /
    • pp.2659-2669
    • /
    • 1999
  • This paper proposes a heuristic, clustering-based task allocation scheme applicable to undirected task graphs on a distributed system. The scheme first builds a task-machine graph and then applies a clustering process in which the pair of tasks connected by the highest-cost edge is merged into a larger task, or a task is allocated to a machine. During this process, the scheme identifies a machine on which allocating the task greatly reduces the communication overhead incurred between that task and the tasks already allocated to the machine, while only slightly increasing the machine's computation cost. Simulations on various task graphs show that scheduling with the proposed scheme gives far better results than scheduling with traditional schemes. A comparison with optimal task scheduling also indicates that our scheme yields optimal results more often than the traditional schemes do.
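
The merging step can be illustrated with a much simpler greedy procedure: repeatedly merge the two task clusters joined by the most expensive communication edge until there are as many clusters as machines. This sketch is a deliberate simplification and not the scheme from the paper: it ignores computation costs and the task-machine graph entirely, and the function name and input format are hypothetical.

```python
def cluster_tasks(num_tasks, comm_edges, num_machines):
    """Greedy highest-cost-edge merging of tasks into at most num_machines groups.

    comm_edges: list of (cost, task_a, task_b) tuples for an undirected task graph.
    Returns the clusters as sorted lists of task indices.
    """
    parent = list(range(num_tasks))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    clusters = num_tasks
    # Visit edges from most to least expensive communication cost.
    for cost, a, b in sorted(comm_edges, reverse=True):
        if clusters <= num_machines:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # co-locate the two clusters
            clusters -= 1

    groups = {}
    for task in range(num_tasks):
        groups.setdefault(find(task), []).append(task)
    return sorted(groups.values())
```

Placing the endpoints of expensive edges on the same machine removes that communication cost, which is the intuition behind merging along the highest-cost edge first.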


Document Summarization Method using Complete Graph

  • 유준현;박순철
    • Journal of the Korea Society of Industrial Information Systems
    • /
    • Vol. 10, No. 2
    • /
    • pp.26-31
    • /
    • 2005
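  • This paper studies the document summarization commonly used in web search engines and proposes a statistical summarization technique that makes summaries more concise and condensed by introducing a complete graph method, which connects the sentences in a document as vertices. The technique was compared with the clustering and MMR techniques, which have so far been regarded as strong statistical summarization methods. In particular, to evaluate summarization performance, the FScore values of each technique were compared, taking human-written summaries as the reference. Among these techniques, the complete graph method showed a performance improvement of about 30%.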


A Horizontal Partition of the Object-Oriented Database for Efficient Clustering

  • Chung, Chin-Wan;Kim, Chang-Ryong;Lee, Ju-Hong
    • Journal of Electrical Engineering and Information Science
    • /
    • Vol. 1, No. 1
    • /
    • pp.164-172
    • /
    • 1996
  • For efficient access in object-oriented databases, related objects should be partitioned before clustering. This paper presents a horizontal partition of related objects in object-oriented databases. Because the aggregation hierarchy has more influence on the partition than the class inheritance hierarchy, all subclass nodes in a class inheritance hierarchy of a schema graph are shrunk into a single class node; the resulting graph is called a condensed schema graph. A set function and an accessibility function are defined to find a maximal subset of related objects among the set of objects in a class. A set function maps a subset of the domain class objects to a subset of the range class objects. An accessibility function maps a subset of the objects of a class into a subset of the objects of the same class through a composition of set functions. The algorithm derived in this paper finds the related objects of a condensed schema graph using accessibility functions and set functions. To show the validity of the partition algorithm, the existence of a maximal subset of the related objects in a class is proved using the accessibility function.
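
The relationship between set functions and accessibility functions can be pictured with a small sketch. It rests on assumptions that go beyond the abstract: object references are modeled as a dictionary from domain objects to the range objects they reference, the accessibility function is taken to be a forward set function followed by its inverse, and the maximal related subset is computed as a fixed point of repeated application. All names are hypothetical.

```python
def set_function(references, subset):
    """Map a subset of domain objects to the range objects they reference."""
    return {target for obj in subset for target in references.get(obj, ())}


def invert(references):
    """Build the reverse mapping: range object -> domain objects referencing it."""
    inverse = {}
    for source, targets in references.items():
        for target in targets:
            inverse.setdefault(target, set()).add(source)
    return inverse


def accessibility(forward, subset):
    """Assumed accessibility function: forward set function, then its inverse.

    Maps a subset of a class to the objects of the same class that share
    at least one referenced object with it.
    """
    return set_function(invert(forward), set_function(forward, subset))


def related_objects(forward, seed):
    """Grow the related set from one object until it stops changing."""
    related = {seed}
    while True:
        expanded = related | accessibility(forward, related)
        if expanded == related:
            return related
        related = expanded
```

Each fixed point obtained this way is one block of a horizontal partition of the class: objects in different blocks share no referenced objects, directly or transitively.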
