Browse > Article
http://dx.doi.org/10.5392/JKCA.2019.19.01.141

Subgraph Searching Scheme Based on Path Queries in Distributed Environments  

Kim, Minyoung (충북대학교 정보통신공학과)
Choi, Dojin (충북대학교 정보통신공학과)
Park, Jaeyeol (충북대학교 정보통신공학과)
Kim, Yeondong (충북대학교 정보통신공학과)
Lim, Jongtae (충북대학교 정보통신공학과)
Bok, Kyoungsoo (충북대학교 정보통신공학과)
Choi, Han Suk (목포대학교 컴퓨터공학과)
Yoo, Jaesoo (충북대학교 정보통신공학과)
Publication Information
Abstract
A network of graph data structure is used in many applications to represent interactions between entities. Recently, as the size of the network to be processed due to the development of the big data technology is getting larger, it becomes more difficult to handle it in one server, and thus the necessity of distributed processing is also increasing. In this paper, we propose a distributed processing system for efficiently performing subgraph and stores. To reduce unnecessary searches, we use statistical information of the data to determine the search order through probabilistic scoring. Since the relationship between the vertex and the degree of the graph network may show different characteristics depending on the type of data, the search order is determined by calculating a score to reduce unnecessary search through a different scoring method for a graph having various distribution characteristics. The graph is sequentially searched in the distributed servers according to the determined order. In order to demonstrate the superiority of the proposed method, performance comparison with the existing method was performed. As a result, the search time is improved by about 3 ~ 10% compared with the existing method.
Keywords
Graph Data; Graph Search; Distributed Processing; Subgraph Matching; Bigdata;
Citations & Related Records
연도 인용수 순위
  • Reference
1 A. Cuzzocrea, F. Furfaro, G. M. Mazzeo, and D. Sacca, "A grid framework for approximate aggregate query answering on summarized sensor network readings," Proc. OTM Workshops, pp.144-153, 2004.
2 A. Fariha, C. F. Ahmed, C. K. Leung, S. M. Abdullah, and L. Cao, "Mining frequent patterns from human interactions in meetings using directed acyclic graphs," Proc. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, pp.38-49, 2013.
3 F. Jiang and C. K. Leung, "Mining interesting "following" patterns from social networks," Proc. International Conference on Data Warehousing and Knowledge Discovery, pp.308-319, 2014.
4 F. Towards, "Towards a Scalable HDFS Architecture," Proc. International Conference on Collaboration Technologies and Systems, pp.155-161, 2013.
5 J. Dorre, S. Apel, and C. Lengauer, "Modeling and optimizing MapReduce programs," Concurrency and Computation: Practice and Experience, Vol.27, No.7, pp.1734-1766, 2015.   DOI
6 A. Alam and J. Ahmed, "Hadoop Architecture and Its Issues," Proc. International Conference on Computational Science and Computational Intelligence, pp.288-291, 2014.
7 J. Cheng, Y. Ke, and W. Ng, "Efficient query processing on graph databases," ACM Transactions on Database Systems, Vol.34, No.1, pp.1-48, 2009.
8 X. Liao, Z. Gao, W. Ji, and Y. Wang, "An enforcement of real time scheduling in Spark Streaming," Proc. International Green and Sustainable Computing Conference, pp.1-6, 2015.
9 N. Talukder, and M. J. Zaki, "A distributed approach for graph mining in massive networks," Data Mining and Knowledge Discovery, Vol.30, No.5, pp.1024-1052, 2016.   DOI
10 Y, Tian, R. C. McEachin, C. Santos, D. J. States, and J. M. Patel, "SAGA: a subgraph matching tool for biological graphs," Bioinformatics, Vol.23, No.2, pp.232-239, 2007.   DOI
11 S. Khuller, B. Raghavachari, and N. E. Young, "Balancing minimum spanning trees and shortest-path trees," Algorithmica, Vol.14, No.4, pp.305-321, 1995.   DOI
12 M. Faloutsos, P. Faloutsos, and C. Faloutsos, "On power-law relationships of the internet topology," ACM SIGCOMM 1999 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, pp.251-262, 1999.
13 J. Balaji and R. Sunderraman, "Distributed Graph Path Queries Using Spark," Proc. COMPSAC Workshops, pp.326-331, 2016.
14 X. Zhang and L. Chen, "Distance-aware selective online query processing over large distributed graphs," Data Science and Engineering, Vol.2, No.1, pp.2-21, 2017.   DOI
15 N. Jing, Y. Huang, and E. A. Rundensteiner, "Hierarchical encoded path views for path query processing: An optimal model and its performance evaluation," IEEE Transactions on Knowledge and Data Engineering, Vol.10, No.3, pp.409-432, 1998.   DOI
16 M. L. Goldstein, S. A. Morris, and G. G. Yen, "Problems with fitting to the power-law distribution," The European Physical Journal B-Condensed Matter and Complex Systems, Vol.41, No.2, pp.255-258, 2004.   DOI