http://dx.doi.org/10.5392/JKCA.2020.20.08.042

Approximate Top-k Subgraph Matching Scheme Considering Data Reuse in Large Graph Stream Environments  

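Choi, Do-Jin (Department of Information and Communication Engineering, Chungbuk National University)
Bok, Kyoung-Soo (Department of SW Convergence, Wonkwang University)
Yoo, Jae-Soo (Department of Information and Communication Engineering, Chungbuk National University)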
Abstract
With the growth of social network services, graph structures are widely used to represent relationships among objects in various applications. Recently, demand for subgraph matching over real-time graph streams has increased, so an efficient approximate Top-k subgraph matching scheme with low latency is required. In this paper, we propose an approximate Top-k subgraph matching scheme that considers data reuse in graph stream environments. The proposed scheme uses the distributed stream processing platform Storm to handle large volumes of stream data, and it applies an existing data reuse scheme to reduce stream processing costs. We propose a distance-based summary indexing technique to generate Top-k subgraph matching results. The summary index is inexpensive to maintain because it stores only the distances among vertices that are selected in advance. Finally, the scheme returns k subgraph matching results to users by performing approximate Top-k matching on the summary index. To show the superiority of the proposed scheme, we conduct performance evaluations on diverse real-world datasets.
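The idea of a distance-based summary index can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how vertices are selected, how distances are measured, or how matches are scored. The sketch assumes hop-count distances computed by breadth-first search, a hand-picked list of selected vertices, and a score equal to the total absolute difference between indexed and query distances. All function names and the `missing_penalty` parameter are hypothetical.

```python
from collections import deque
import heapq


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_summary_index(adj, selected):
    """Store only the pairwise distances among the pre-selected vertices."""
    index = {}
    for s in selected:
        dist = bfs_distances(adj, s)
        for t in selected:
            if s != t and t in dist:
                index[(s, t)] = dist[t]
    return index


def score(index, query_dist, mapping, missing_penalty=10):
    """Lower is better: total deviation from the query's pairwise distances.

    missing_penalty is an assumed constant charged when a pair of mapped
    vertices has no entry in the summary index.
    """
    total = 0
    for (qa, qb), wanted in query_dist.items():
        found = index.get((mapping[qa], mapping[qb]))
        total += missing_penalty if found is None else abs(found - wanted)
    return total


def top_k_matches(index, query_dist, candidates, k):
    """Return the k candidate mappings with the smallest deviation score."""
    return heapq.nsmallest(k, candidates, key=lambda m: score(index, query_dist, m))


# Toy data: an undirected path graph 1-2-3-4-5 (hypothetical example).
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
index = build_summary_index(adj, selected=[1, 3, 5])

# Query pattern: two query vertices "a" and "b" that should be 2 hops apart.
query_dist = {("a", "b"): 2}
candidates = [{"a": 1, "b": 3}, {"a": 1, "b": 5}, {"a": 3, "b": 5}]
best = top_k_matches(index, query_dist, candidates, k=2)
print(best)
```

Because the index holds distances only among the selected vertices, its size grows with the square of the number of selected vertices rather than with the size of the whole graph. The ranking is approximate, since it compares distances and does not verify an exact structural match.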
Keywords
Stream Graph; Top-k Query Processing; Subgraph Matching; Approximate Query Processing; Distributed Stream Processing