• Title/Summary/Keyword: 분산 그래프 (distributed graph)

A Distributed Vertex Rearrangement Algorithm for Compressing and Mining Big Graphs (대용량 그래프 압축과 마이닝을 위한 그래프 정점 재배치 분산 알고리즘)

  • Park, Namyong; Park, Chiwan; Kang, U
    • Journal of KIISE / v.43 no.10 / pp.1131-1143 / 2016
  • How can we effectively compress big graphs composed of billions of edges? By concentrating non-zeros in the adjacency matrix through vertex rearrangement, we can compress big graphs more efficiently and also boost the performance of several graph mining algorithms such as PageRank. SlashBurn is a state-of-the-art vertex rearrangement method that processes real-world graphs effectively by exploiting the power-law characteristic of real-world networks. However, the original SlashBurn algorithm is designed to run on a single machine: it slows down noticeably on large-scale graphs and cannot be used at all when a graph is too large to fit on one machine. In this paper, we propose a distributed SlashBurn algorithm to overcome these limitations. Distributed SlashBurn processes big graphs much faster than the original SlashBurn algorithm, and it scales up well by performing the large-scale vertex rearrangement process in a distributed fashion. In our experiments using real-world big graphs, the proposed distributed SlashBurn algorithm ran more than 45 times faster than its single-machine counterpart and processed graphs 16 times larger than the original method could handle.
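
The idea of vertex rearrangement can be illustrated with a small single-machine sketch. It is a simplified hub-removal ordering in the spirit of SlashBurn, not the distributed algorithm proposed in the paper: the function names, the parameter `k`, and the exact ordering of the small components are assumptions made for illustration. High-degree hubs are placed first, small disconnected components are pushed to the end, and the process repeats on the largest remaining component, so that non-zeros in the reordered adjacency matrix gather near the first rows and columns.

```python
from collections import defaultdict


def connected_components(nodes, adj):
    """Return the connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps


def hub_first_order(edges, k=1):
    """Simplified hub-removal ordering (illustrative, not the paper's method)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    remaining = set(adj)
    front, back = [], []
    while remaining:
        if len(remaining) <= k:
            front.extend(sorted(remaining))
            break
        # Degree counted only among vertices that are still unplaced.
        deg = {u: sum(1 for v in adj[u] if v in remaining) for u in remaining}
        hubs = sorted(remaining, key=lambda u: (-deg[u], u))[:k]
        front.extend(hubs)              # hubs receive the lowest positions
        remaining -= set(hubs)
        comps = sorted(connected_components(remaining, adj),
                       key=lambda c: (-len(c), c[0]))
        giant, spokes = comps[0], comps[1:]
        # Small components from earlier rounds end up closest to the end.
        back = [v for comp in spokes for v in comp] + back
        remaining = set(giant)          # repeat on the largest component
    return front + back


# Vertex 0 is a hub; 1, 2, 3 are leaves; 4-5 forms a short tail.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]
order = hub_first_order(edges, k=1)
```

Relabeling vertices according to `order` would place the hub in the first row and column of the adjacency matrix, which is the kind of concentration the abstract refers to.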

A Distributed Path-Finding Algorithm for Distributed Metabolic Pathways (분산된 대사경로네트워크에 대한 경로검색을 위한 분산알고리즘)

  • Lee, Sun-A; Lee, Keon-Myung; Lee, Seung-Joo
    • Journal of the Korean Institute of Intelligent Systems / v.15 no.4 / pp.425-430 / 2005
  • Many problems can be formulated in terms of graphs and thus solved by graph-theoretic algorithms. This paper is concerned with finding paths between nodes over distributed and overlapping graphs. The proposed method allows multiple agents to cooperate to find paths without merging the distributed graphs. For each graph there is a designated agent that is in charge of providing path-finding service for that graph and of initiating the path-finding tasks whose paths start from the graph. The proposed method first constructs an abstract graph, the so-called viewgraph, for the distributed overlapping graphs, which makes it possible to extract information about how to guide path finding over the graphs. The viewgraph is shared by all agents, which use it to determine how to coordinate other agents for the purpose of finding paths. Each agent maintains shortest path information among the nodes that are placed in different overlapping subgraphs of her graph. Once an agent is asked to find a path from a node on her graph to a node on another agent's graph, she directs other agents to provide the information necessary for finding paths.

Survey on Distributed Graph Processing Systems (분산 그래프 처리 시스템에 대한 연구 조사)

  • Ko, Seongyun; Seo, In; Shin, Hyungyu; Lee, Jinsoo; Han, Wook-Shin
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.58-59 / 2017
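  • Graph data models objects and the relationships among them, and is used to represent and store data from social network services, the Internet of Things, brain networks, and similar sources. In the era of big data, the demand for processing big graphs is growing steeply. Distributed graph processing systems make it possible to process big graphs by partitioning very large graph data across the memory of multiple machines in a cluster. This paper compares the characteristics of recent distributed graph processing systems.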

Learning Distribution Graphs Using a Neuro-Fuzzy Network for Naive Bayesian Classifier (퍼지신경망을 사용한 네이브 베이지안 분류기의 분산 그래프 학습)

  • Tian, Xue-Wei; Lim, Joon S.
    • Journal of Digital Convergence / v.11 no.11 / pp.409-414 / 2013
  • Naive Bayesian classifiers are a powerful and well-known type of classifier that can be easily induced from a dataset of sample cases. However, the strong conditional independence assumptions can sometimes lead to weak classification performance. Normally, naive Bayesian classifiers use Gaussian distributions to handle continuous attributes and to represent the likelihood of the features conditioned on the classes. The probability density of attributes, however, is not always well fitted by a Gaussian distribution. Another eminent type of classifier is the neuro-fuzzy classifier, which can learn fuzzy rules and fuzzy sets using supervised learning. Since there are specific structural similarities between a neuro-fuzzy classifier and a naive Bayesian classifier, the purpose of this study is to apply learning distribution graphs constructed by a neuro-fuzzy network to naive Bayesian classifiers. We compare the Gaussian distribution graphs with the fuzzy distribution graphs for the naive Bayesian classifier. We applied these two types of distribution graphs to classify leukemia and colon DNA microarray data sets. The results demonstrate that a naive Bayesian classifier with fuzzy distribution graphs is more reliable than one with Gaussian distribution graphs.
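
For reference, the Gaussian baseline mentioned in the abstract can be sketched as follows. This is a minimal textbook-style Gaussian naive Bayes, not the neuro-fuzzy method of the paper; the function names, the variance floor `1e-9`, and the toy data are assumptions made for illustration. Each continuous attribute is modeled per class by a mean and variance, and prediction picks the class with the largest log prior plus summed per-attribute Gaussian log likelihoods.

```python
import math


def fit_gaussian_nb(samples, labels):
    """Estimate a prior and per-attribute (mean, variance) for each class."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    model = {}
    for y, rows in by_class.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small floor keeps the variance strictly positive (assumption).
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[y] = (len(rows) / len(samples), stats)
    return model


def predict(model, x):
    """Return the class maximizing log prior + sum of Gaussian log likelihoods."""
    def log_posterior(prior, stats):
        total = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var)
            total -= (value - mean) ** 2 / (2 * var)
        return total

    return max(model, key=lambda y: log_posterior(*model[y]))


# Toy one-attribute data set (hypothetical values).
samples = [[1.0], [1.2], [0.8], [3.0], [3.2], [2.8]]
labels = ["low", "low", "low", "high", "high", "high"]
model = fit_gaussian_nb(samples, labels)
```

When an attribute is clearly non-Gaussian within a class, the `(mean, var)` summary fits it poorly, which is the limitation the abstract says the fuzzy distribution graphs are meant to address.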

A Study on Graph Partitioning for Graph Query Processing in Distributed System (분산 환경에서 그래프 질의 수행을 위한 그래프 분할 기법 조사)

  • Lee, Wonseok; Ko, Seoungyun; Seo, Myeongwon; Lee, Jeong-Hoon; Han, Wook-Shin
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.734-736 / 2019
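  • Graph partitioning is a method of distributing the vertices and edges of a graph across multiple machines in order to reduce communication cost and balance load when executing graph queries in a distributed environment. This paper summarizes background knowledge on graph query execution, and introduces and compares the representative graph partitioning techniques known as edge-cut, vertex-cut, and hybrid-cut, along with the partitioning techniques used in recent graph systems.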
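
The difference between edge-cut and vertex-cut can be made concrete with a small sketch. The placement rules here (vertex id modulo the number of machines, edge index modulo the number of machines) and the function names are assumptions chosen only to keep the example deterministic; real systems use more careful heuristics, and hybrid-cut is not covered. Edge-cut assigns each vertex to one machine and pays for edges that cross machines, while vertex-cut assigns each edge to one machine and pays for vertices that must be replicated.

```python
from collections import defaultdict


def edge_cut_cost(edges, num_parts):
    """Edge-cut: place each vertex on one part; count edges crossing parts."""
    def part_of(v):
        return v % num_parts  # illustrative placement rule (assumption)

    return sum(1 for u, v in edges if part_of(u) != part_of(v))


def vertex_cut_replication(edges, num_parts):
    """Vertex-cut: place each edge on one part; return the average number
    of parts on which a vertex is replicated (replication factor)."""
    replicas = defaultdict(set)
    for i, (u, v) in enumerate(edges):
        part = i % num_parts  # illustrative placement rule (assumption)
        replicas[u].add(part)
        replicas[v].add(part)
    return sum(len(parts) for parts in replicas.values()) / len(replicas)


edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
cut_edges = edge_cut_cost(edges, num_parts=2)
replication = vertex_cut_replication(edges, num_parts=2)
```

Cut edges and the replication factor both stand in for communication cost: the former for messages sent along edges, the latter for keeping vertex replicas in sync.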

Distributed Algorithm to search paths in distributed metabolic pathway networks (분산된 대사 네트워크에 대한 경로탐색을 위한 분산 알고리즘)

  • Lee Sun-a; Lee Keon-Myoung
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2005.04a / pp.349-352 / 2005
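  • This paper proposes a distributed algorithm that searches for paths across distributed biological metabolic networks without integrating them. Metabolic networks reside in several databases and contain overlapping data. The proposed method builds an abstract hypergraph in which the overlapping parts between networks are hypernodes and the networks themselves are hyperedges, and uses it to construct high-level paths. Paths between the overlapping regions within each network are computed in advance, and paths that exist across the distributed metabolic networks are then searched based on the high-level paths. With the abstract hypergraph, a path search is first performed with the databases as hypernodes, and the metabolic pathways within the databases are then searched along that path. Because many metabolic pathways exist at this stage, an abstract hypergraph with each metabolic pathway as a hypernode is built and searched, followed by a path search over the lower-level nodes. The advantage of this approach is that it reduces the storage space and search time that integrating the distributed networks would require.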

Experimental Evaluation of PageRank/BFS Queries on Distributed Graph Processing Systems (최신 분산 그래프 처리 시스템에서의 PageRank/BFS 질의 처리 성능 평가)

  • Lee, Kyeong-Jun; Kim, Hyeonji; Lee, Yukyoung; Lee, Juneyoung; Kim, Kangsu; Han, Wook-Shin
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.826-828 / 2017
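  • Graphs are an effective way to represent objects and the relationships between them. Graph data is used in diverse fields such as web graphs, social network services, drug discovery, and bioinformatics, and it requires efficient processing techniques to be useful in graph mining applications. Many systems for processing and analyzing graph data have been developed to date. In this paper, we run PageRank and breadth-first search (BFS), two representative graph analysis queries, on recent distributed graph processing systems and evaluate the performance of the systems.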
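
The two benchmark queries named in the abstract can be sketched on a single machine as follows. This is a minimal illustration of what each query computes, not the implementation of any of the evaluated systems; the function names, the damping factor 0.85, the fixed iteration count, and the toy graph are assumptions.

```python
from collections import deque


def bfs_levels(adj, source):
    """Breadth-first search: hop distance from `source` to each reachable vertex."""
    level = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level


def pagerank(adj, damping=0.85, iterations=50):
    """Power-iteration PageRank on a directed graph given as adjacency lists."""
    nodes = set(adj) | {v for targets in adj.values() for v in targets}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by vertices without out-links is spread over all vertices.
        dangling = sum(rank[u] for u in nodes if not adj.get(u))
        new_rank = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = adj.get(u, [])
            for v in targets:
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank


# Toy directed graph (hypothetical): a -> b, a -> c, b -> c, c -> a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
levels = bfs_levels(graph, "a")
ranks = pagerank(graph)
```

Both queries touch every edge repeatedly (PageRank) or once per traversal (BFS), which is why they are common choices for comparing graph processing systems.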

Improvement on The Complexity of Distributed Depth First Search Protocol (분산깊이 우선 탐색 프로토콜의 복잡도 개선을 위한 연구)

  • Choe, Jong-Won
    • The Transactions of the Korea Information Processing Society / v.3 no.4 / pp.926-937 / 1996
  • A graph traversal technique is a certain pattern of visiting the nodes of a graph. Many special traversal techniques have been applied to solve graph-related problems. For example, the depth first search technique has been used for finding strongly connected components of a directed graph or biconnected components of a general graph. The problem of implementing this depth first search technique as a distributed protocol on a distributed network can be divided into a fixed topology problem, where there is no topological change, and a dynamic topology problem, which has some topological changes. Therefore, in this paper, we present a more efficient distributed depth first search protocol for a fixed topology and a resilient distributed depth first search protocol for distributed networks with topological changes. We also analyze the message and time complexity of the presented protocols and show that they improve on the complexities of other distributed depth first search protocols.

Subgraph Searching Scheme Based on Path Queries in Distributed Environments (분산 환경에서 경로 질의 기반 서브 그래프 탐색 기법)

  • Kim, Minyoung; Choi, Dojin; Park, Jaeyeol; Kim, Yeondong; Lim, Jongtae; Bok, Kyoungsoo; Choi, Han Suk; Yoo, Jaesoo
    • The Journal of the Korea Contents Association / v.19 no.1 / pp.141-151 / 2019
  • Networks with a graph data structure are used in many applications to represent interactions between entities. Recently, as the development of big data technology has increased the size of the networks to be processed, it has become more difficult to handle them on one server, and the need for distributed processing has grown accordingly. In this paper, we propose a distributed processing system for efficiently performing subgraph searches. To reduce unnecessary searches, we use statistical information about the data to determine the search order through probabilistic scoring. Since the relationship between the vertices and the degrees of a graph network may show different characteristics depending on the type of data, the score is calculated with a different scoring method for graphs with different distribution characteristics. The graph is then searched sequentially on the distributed servers according to the determined order. To demonstrate the superiority of the proposed method, we compared its performance with that of the existing method. As a result, the search time improved by about 3 to 10% compared with the existing method.

Dynamic Block Reassignment for Load Balancing of Block Centric Graph Processing Systems (블록 중심 그래프 처리 시스템의 부하 분산을 위한 동적 블록 재배치 기법)

  • Kim, Yewon; Bae, Minho; Oh, Sangyoon
    • KIPS Transactions on Software and Data Engineering / v.7 no.5 / pp.177-188 / 2018
  • The scale of graph data has increased rapidly with the growth of mobile Internet applications and the proliferation of social network services. This creates an urgent need for efficient distributed and parallel graph processing approaches, since such large-scale graphs easily exceed the capacity of a single machine. Currently, there are two popular parallel graph processing approaches: vertex-centric processing and block-centric processing. While the vertex-centric approach can easily be applied to parallel processing systems, the block-centric approach was proposed to compensate for the drawbacks of the vertex-centric approach. In these systems, the initial quality of the graph partition significantly affects overall performance. However, dividing the graph optimally in the initial phase is a very difficult problem. Thus, several dynamic load balancing techniques have been studied that partition the graph progressively during processing. In this paper, we present a load balancing algorithm for the block-centric graph processing approach, whereas most dynamic load balancing techniques focus on vertex-centric systems. Our proposed algorithm improves graph partition quality by dynamically reassigning blocks at runtime, and suggests a block split strategy for escaping local optima.