A Change Detection Technique Supporting Nested Blank Nodes of RDF Documents

A Change Detection Technique for RDF Documents Containing Nested Blank Nodes

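  • 이동희 (School of Electrical Engineering and Computer Science, Seoul National University)
  • 임동혁 (School of Electrical Engineering and Computer Science, Seoul National University)
  • 김형주 (School of Electrical Engineering and Computer Science, Seoul National University)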
  • Published : 2007.12.15

Abstract

Because RDF documents change frequently, finding the differences between RDF documents is an important problem. When RDF documents contain blank nodes, change detection requires a technique for matching blank nodes. Blank nodes can be nested, and most RDF documents use them. An RDF document can be modeled as a graph that contains many subtrees, so change detection can be treated as a minimum-cost tree matching problem. In this paper, we propose a change detection technique for RDF documents that uses a labeling scheme for blank nodes. We also propose predicate grouping and partitioning to make the matching of general triples more efficient. In experiments, our approach was more accurate and faster than previous approaches.

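RDF documents are updated frequently, so identifying the changes between RDF documents is an important concern. When an RDF document contains blank nodes, detecting the changes requires a technique that supports matching between blank nodes. Blank nodes appear in nested form in RDF documents, and most RDF documents in actual use contain them. When an RDF document is modeled as a graph, a single document is divided into multiple trees. Change detection between documents can therefore be viewed as a minimum-cost matching problem between trees that share the same root. In this paper, we propose a change detection technique for RDF documents containing nested blank nodes, using a labeling scheme for blank nodes. We also propose predicate grouping and partitioning techniques that improve efficiency when comparing ordinary triples that do not involve blank nodes. Experiments show that the proposed technique is more accurate and more efficient than existing methods.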
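
The ideas summarized in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the labeling scheme, the cost function, or the partitioning step, so everything below is an assumption made for illustration. Triples are plain `(subject, predicate, object)` string tuples, blank nodes are strings starting with `_:`, the hypothetical `label` function derives a structural label for a (possibly nested) blank node from its outgoing properties, and blank nodes are paired by brute-force minimum-cost assignment, which is only practical for a handful of nodes.

```python
from collections import defaultdict
from itertools import permutations


def is_blank(term):
    """Assumed convention: blank node identifiers start with '_:'."""
    return term.startswith("_:")


def group_by_predicate(triples):
    """Bucket triples by predicate so only same-predicate triples are compared."""
    groups = defaultdict(set)
    for s, p, o in triples:
        groups[p].add((s, p, o))
    return groups


def diff_ground(old, new):
    """Diff triples without blank nodes, one predicate group at a time."""
    def ground(ts):
        return [t for t in ts if not (is_blank(t[0]) or is_blank(t[2]))]

    old_g, new_g = group_by_predicate(ground(old)), group_by_predicate(ground(new))
    deleted, inserted = set(), set()
    for p in old_g.keys() | new_g.keys():
        deleted |= old_g.get(p, set()) - new_g.get(p, set())
        inserted |= new_g.get(p, set()) - old_g.get(p, set())
    return deleted, inserted


def label(node, triples, seen=frozenset()):
    """Hypothetical structural label; nested blank nodes are labeled recursively."""
    if not is_blank(node):
        return node
    if node in seen:  # guard against cycles among blank nodes
        return "_:cycle"
    seen = seen | {node}
    parts = sorted(f"{p}={label(o, triples, seen)}" for s, p, o in triples if s == node)
    return "[" + ",".join(parts) + "]"


def properties(node, triples):
    """Outgoing (predicate, object label) pairs of a blank node."""
    return {(p, label(o, triples)) for s, p, o in triples if s == node}


def match_blank_nodes(old, new):
    """Pair blank nodes so that the summed property difference is minimal."""
    old_nodes = sorted({s for s, _, _ in old if is_blank(s)})
    new_nodes = sorted({s for s, _, _ in new if is_blank(s)})
    # Pad the shorter side with None: an unmatched node is an insert or delete.
    n = max(len(old_nodes), len(new_nodes))
    old_nodes += [None] * (n - len(old_nodes))
    new_nodes += [None] * (n - len(new_nodes))

    def cost(a, b):
        pa = properties(a, old) if a else set()
        pb = properties(b, new) if b else set()
        return len(pa ^ pb)  # size of the symmetric difference

    def total(perm):
        return sum(cost(a, b) for a, b in zip(old_nodes, perm))

    best = min(permutations(new_nodes), key=total)
    mapping = {a: b for a, b in zip(old_nodes, best) if a and b}
    return mapping, total(best)


# Made-up example data: an address with a nested geo blank node.
old_doc = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:address", "_:a1"),
    ("_:a1", "ex:city", "Seoul"),
    ("_:a1", "ex:geo", "_:g1"),
    ("_:g1", "ex:lat", "37.5"),
]
new_doc = [
    ("ex:alice", "ex:name", "Alice K."),
    ("ex:alice", "ex:address", "_:x"),
    ("_:x", "ex:city", "Seoul"),
    ("_:x", "ex:geo", "_:y"),
    ("_:y", "ex:lat", "37.6"),
]

deleted, inserted = diff_ground(old_doc, new_doc)
mapping, cost_total = match_blank_nodes(old_doc, new_doc)
```

In this sketch the blank node identifiers differ between the two versions, yet the address and geo nodes are still paired with their counterparts because their properties overlap the most. For more than a few blank nodes, the brute-force search over permutations would need to be replaced by a polynomial-time assignment algorithm such as the Hungarian method.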
