http://dx.doi.org/10.5392/JKCA.2022.22.01.035

In-memory Compression Scheme Based on Incremental Frequent Patterns for Graph Streams  

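Lee, Hyeon-Byeong (Department of Information and Communication Engineering, Chungbuk National University)
Shin, Bo-Kyoung (Department of Information and Communication Engineering, Chungbuk National University)
Bok, Kyoung-Soo (Department of SW Convergence, Wonkwang University)
Yoo, Jae-Soo (Department of Information and Communication Engineering, Chungbuk National University)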
Abstract
With advances in network technology and the widespread use of IoT and social network service applications, large volumes of graph stream data are being generated. Existing graph compression techniques have focused mainly on compression rate and runtime. In this paper, we propose an incremental frequent-pattern-based compression scheme for graph streams, which applies graph mining to existing compression techniques so that they account for the graph stream environment. Because the proposed scheme keeps only the latest reference patterns, it increases storage utilization and reduces query processing time. To demonstrate the superiority of the proposed scheme, we evaluate its compression rate and processing time against an existing method. The proposed scheme is faster than the existing similar scheme when the amount of duplicated data is large.
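The idea of keeping only the latest reference patterns can be illustrated with a minimal sketch. It is not the scheme proposed in the paper: the class name, the `window_size` and `min_support` parameters, and the simplification of a "pattern" to a single edge are all assumptions made for illustration, and the paper's graph mining step is not reproduced.

```python
from collections import Counter, deque


class IncrementalPatternCompressor:
    """Toy sliding-window compressor (hypothetical, for illustration only).

    A "pattern" is simplified to a single edge. An edge that appears in at
    least `min_support` of the last `window_size` batches is replaced by a
    short integer reference.
    """

    def __init__(self, window_size=3, min_support=2):
        self.window_size = window_size
        self.min_support = min_support
        self.window = deque()      # most recent batches, oldest first
        self.counts = Counter()    # edge -> number of window batches containing it
        self.dictionary = {}       # frequent edge -> reference id
        self.next_id = 0

    def _refresh_dictionary(self):
        # Evict reference patterns that are no longer frequent in the window.
        for pattern in list(self.dictionary):
            if self.counts[pattern] < self.min_support:
                del self.dictionary[pattern]
        # Register patterns that have just become frequent.
        for pattern, count in self.counts.items():
            if count >= self.min_support and pattern not in self.dictionary:
                self.dictionary[pattern] = self.next_id
                self.next_id += 1

    def push(self, batch):
        """Add one batch of edges and return its compressed form."""
        edges = set(batch)
        self.window.append(edges)
        self.counts.update(edges)
        # Incremental update: only the expired batch is subtracted.
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(expired)
            for edge in expired:
                if self.counts[edge] <= 0:
                    del self.counts[edge]
        self._refresh_dictionary()
        return [
            ("ref", self.dictionary[e]) if e in self.dictionary else ("raw", e)
            for e in sorted(edges)
        ]


compressor = IncrementalPatternCompressor(window_size=2, min_support=2)
first = compressor.push([("a", "b"), ("b", "c")])
second = compressor.push([("a", "b"), ("c", "d")])
third = compressor.push([("c", "d"), ("x", "y")])
```

In this sketch the counts are updated per batch rather than recomputed over the whole window, and the dictionary shrinks as old patterns expire. A real decoder would also need to track dictionary changes over time, which is omitted here.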
Keywords
Big Data; Graph Stream; Graph Compression; Frequent Pattern; Provenance