http://dx.doi.org/10.3745/KTSDE.2014.3.8.293

Structural Change Detection Technique for RDF Data in MapReduce  

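Lee, Taewhi (Electronics and Telecommunications Research Institute, Big Data SW Platform Research Department)
Im, Dong-Hyuk (Hoseo University, Division of Computer and Information Engineering)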
Publication Information
KIPS Transactions on Software and Data Engineering / v.3, no.8, 2014, pp. 293-298
Abstract
Detecting and understanding changes in RDF data is crucial for evolution processes, synchronization systems, and versioning systems on the web of data. However, current research on change detection remains unsatisfactory in that it neither considers the large scale of RDF data nor produces accurate RDF deltas. In this paper, we propose a scalable and effective change detection technique using the MapReduce framework, which has been used in many fields to process and analyze large volumes of data. In particular, we focus on structure-based change detection that adopts a strategy for comparing blank nodes in RDF data. To achieve this, we employ a method composed of two MapReduce jobs. The first job partitions the triples with blank nodes by grouping the triples that share the same blank node ID and then computes the incoming path to each blank node. The second job partitions the triples by path and matches blank nodes using the Hungarian method. In experiments, we show that our approach is more accurate and effective than the previous approach.
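The two-job pipeline described in the abstract can be illustrated with a small single-process sketch. This is not the authors' implementation: the function names (`job1`, `job2`, `match_group`), the one-hop definition of the incoming path, and the cost function (size of the symmetric difference of outgoing edges) are all assumptions made for illustration. The shuffle phase is simulated with a dictionary, and a brute-force minimum-cost assignment stands in for the Hungarian method; it finds the same optimal cost on small groups but does not scale.

```python
from collections import defaultdict
from itertools import permutations

def is_blank(term):
    return term.startswith("_:")

def job1(triples):
    """Job 1 (assumed form): group triples by blank node ID, then derive
    each blank node's incoming path and its set of outgoing edges."""
    groups = defaultdict(list)              # simulated shuffle, key = blank node ID
    for s, p, o in triples:                 # map phase
        if is_blank(s):
            groups[s].append(("out", p, "_:" if is_blank(o) else o))
        if is_blank(o):
            groups[o].append(("in", "_:" if is_blank(s) else s, p))
    nodes = {}
    for bid, records in groups.items():     # reduce phase
        path = tuple(sorted((a, b) for kind, a, b in records if kind == "in"))
        out = frozenset((a, b) for kind, a, b in records if kind == "out")
        nodes[bid] = (path, out)
    return nodes

def match_group(old_ids, new_ids, old_nodes, new_nodes):
    """Minimum-cost assignment by brute force (stand-in for the Hungarian method)."""
    def cost(old_id, new_id):
        # Hypothetical cost: number of outgoing edges the two nodes do not share.
        return len(old_nodes[old_id][1] ^ new_nodes[new_id][1])
    if len(old_ids) <= len(new_ids):
        best = min(permutations(new_ids, len(old_ids)),
                   key=lambda perm: sum(cost(o, n) for o, n in zip(old_ids, perm)))
        return dict(zip(old_ids, best))
    best = min(permutations(old_ids, len(new_ids)),
               key=lambda perm: sum(cost(o, n) for o, n in zip(perm, new_ids)))
    return dict(zip(best, new_ids))

def job2(old_nodes, new_nodes):
    """Job 2 (assumed form): group blank nodes by incoming path, then match
    old and new blank nodes within each group."""
    groups = defaultdict(lambda: ([], []))  # simulated shuffle, key = incoming path
    for bid, (path, _) in old_nodes.items():
        groups[path][0].append(bid)
    for bid, (path, _) in new_nodes.items():
        groups[path][1].append(bid)
    matches = {}
    for old_ids, new_ids in groups.values():
        matches.update(match_group(old_ids, new_ids, old_nodes, new_nodes))
    return matches

# Made-up example data: two versions of a small graph with relabeled blank nodes.
old = [
    ("ex:alice", "ex:address", "_:a1"),
    ("_:a1", "ex:city", "Seoul"),
    ("_:a1", "ex:zip", "04524"),
    ("ex:alice", "ex:address", "_:a2"),
    ("_:a2", "ex:city", "Busan"),
]
new = [
    ("ex:alice", "ex:address", "_:b1"),
    ("_:b1", "ex:city", "Busan"),
    ("ex:alice", "ex:address", "_:b2"),
    ("_:b2", "ex:city", "Seoul"),
    ("_:b2", "ex:zip", "04525"),
]
matches = job2(job1(old), job1(new))
print(matches)
```

In this toy input all four blank nodes share the same incoming path, so they fall into one group, and the assignment pairs each old node with the new node whose outgoing edges differ least. Once the blank nodes are matched, only the changed zip code would need to appear in the delta.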
Keywords
RDF Change Detection; MapReduce Framework; Hadoop; Blank Node Matching; Large Scale Data;