RDBMS Based Efficient Method for Shortest Path Searching Over Large Graphs Using K-degree Index Table

An Efficient RDBMS-Based Shortest Path Search Technique for Large Graphs Using a k-Degree Index Table

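  • 홍지혜 (Department of Computer Engineering, Kyung Hee University) ;
  • 한용구 (Department of Computer Engineering, Kyung Hee University) ;
  • 이영구 (Department of Computer Engineering, Kyung Hee University)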
  • Received : 2014.01.27
  • Accepted : 2014.02.28
  • Published : 2014.05.31

Abstract

Modern networks such as social networks, web page link graphs, and traffic networks are big data with very large numbers of nodes and edges, and many applications, including social network services and navigation systems, rely on them. Because such networks do not fit into main memory, existing in-memory analysis techniques cannot deliver high performance on them. The Frontier-Expansion-Merge (FEM) framework was recently proposed to support graph search operations with three corresponding operators in a relational database (RDB) context. FEM exploits an index table that stores pre-computed partial paths for efficient shortest path discovery. However, FEM's index table has a low hit ratio because index entries are selected by path distance rather than by how likely they are to lie on a shortest path. In this paper, we propose a method that constructs the index table from high-degree nodes, which have a high hit ratio, for efficient shortest path discovery. We experimentally verify that our indexing technique supports efficient shortest path discovery on real-world datasets.

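Recent networks such as social networks, web page links, and traffic networks are big data with enormous numbers of nodes and edges. A growing number of applications, such as social network services and navigation services, use these networks. Because a large network cannot be loaded into memory in its entirety, existing network analysis techniques cannot be applied. Recently, RDB-based operators that provide efficient search over large graphs were proposed as a framework (Frontier-expand-merge framework, FEM). FEM builds an RDB-based index table that stores partial shortest paths for efficient shortest path search. However, because the entries of FEM's index table are determined by index distance rather than by the probability of being included in a shortest path, the index table's hit ratio is low. This paper proposes an index table construction technique that uses high-degree nodes with a high index hit ratio to support efficient shortest path search. Experiments show that the proposed index table construction technique supports efficient shortest path search on real-world datasets.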
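
The abstract does not specify the schema or the construction procedure, so the following is only a minimal sketch of the general idea of a degree-based index table, not the authors' implementation. It assumes a hypothetical `edges(src, dst, w)` table and a hypothetical `path_index(src, dst, dist)` table, ranks nodes by degree in SQL, and pre-computes shortest distances from the k highest-degree nodes. For brevity the distances are computed with an in-memory Dijkstra search; a FEM-style system would perform the expansion with relational operators instead, because the graph is assumed not to fit in memory.

```python
import heapq
import sqlite3


def dijkstra(adj, source):
    """Plain Dijkstra over an adjacency dict {node: [(neighbor, weight), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def build_k_degree_index(conn, k):
    """Fill the hypothetical path_index table from the k highest-degree nodes."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS path_index")
    cur.execute(
        "CREATE TABLE path_index ("
        "src INTEGER, dst INTEGER, dist REAL, PRIMARY KEY (src, dst))"
    )
    # Rank nodes by degree inside the database; ties are broken by node id.
    hubs = [row[0] for row in cur.execute(
        "SELECT src FROM edges GROUP BY src "
        "ORDER BY COUNT(*) DESC, src ASC LIMIT ?", (k,))]
    # Illustration only: load the adjacency lists into memory.
    adj = {}
    for src, dst, w in cur.execute("SELECT src, dst, w FROM edges"):
        adj.setdefault(src, []).append((dst, w))
    for hub in hubs:
        rows = [(hub, v, d) for v, d in dijkstra(adj, hub).items()]
        cur.executemany("INSERT INTO path_index VALUES (?, ?, ?)", rows)
    conn.commit()
    return hubs


def lookup(conn, src, dst):
    """Return the indexed distance, or None on an index miss."""
    row = conn.execute(
        "SELECT dist FROM path_index WHERE src = ? AND dst = ?", (src, dst)
    ).fetchone()
    return None if row is None else row[0]


def make_demo_graph():
    """Small undirected toy graph; each edge is stored in both directions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER, w REAL)")
    undirected = [(1, 2, 1.0), (1, 3, 4.0), (1, 4, 2.0), (2, 3, 1.0), (4, 5, 3.0)]
    for u, v, w in undirected:
        conn.execute("INSERT INTO edges VALUES (?, ?, ?)", (u, v, w))
        conn.execute("INSERT INTO edges VALUES (?, ?, ?)", (v, u, w))
    conn.commit()
    return conn
```

Calling `build_k_degree_index(make_demo_graph(), 1)` indexes only the single highest-degree node of the toy graph, so `lookup` answers queries that start at that node and returns `None` for any other source. A miss of this kind would fall back to a regular search, which is why the choice of indexed nodes determines the hit ratio.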
