Improving Join Performance for SPARQL Query Processing in the Clouds

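  • 최규진 (Department of Computer Engineering, Chungnam National University) ;
  • 손윤희 (Department of Computer Engineering, Chungnam National University) ;
  • 이규철 (Department of Computer Engineering, Chungnam National University)
  • Received : 2014.12.29
  • Reviewed : 2016.04.21
  • Published : 2016.06.15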

Abstract

With the recent rapid growth of LOD (Linked Open Data), existing single-machine systems reach their performance limits when processing large volumes of LOD. To address this problem, recent studies use MapReduce, a distributed parallel framework. However, processing a SPARQL query with MapReduce requires multiple MapReduce jobs, which incurs additional cost. In addition, joins force the system to process unnecessary data. In this paper, we propose a method that reduces the number of MapReduce jobs generated during SPARQL query processing, and that builds a Bitmap-based join index and uses it to minimize the processing of unnecessary data.
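The abstract does not describe how the join index is laid out, so the following is only a minimal single-process sketch of the general idea of a bitmap used as a join filter, not the authors' implementation. It assumes two triple patterns that share one join variable, uses a plain Python integer as the bitmap, and invents the helper names `build_bitmap` and `filter_by_bitmap` and the sample triples purely for illustration. Rows whose join key is missing from the intersected bitmap are discarded before the join, which is the "unnecessary data" the abstract refers to.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def build_bitmap(values: Iterable[str], ids: Dict[str, int]) -> int:
    """Return an int whose bit ids[v] is set for every value v.

    Unseen values get the next free id (hypothetical dictionary encoding).
    """
    bitmap = 0
    for v in values:
        bitmap |= 1 << ids.setdefault(v, len(ids))
    return bitmap


def filter_by_bitmap(rows: List[Triple], key: Callable[[Triple], str],
                     ids: Dict[str, int], bitmap: int) -> List[Triple]:
    """Keep only rows whose join key has its bit set in the bitmap."""
    return [r for r in rows if (bitmap >> ids[key(r)]) & 1]


# Illustrative data for the query: ?s advisor ?p . ?p worksFor ?d
triples: List[Triple] = [
    ("alice", "advisor", "prof1"),
    ("bob", "advisor", "prof2"),
    ("carol", "advisor", "prof3"),
    ("prof1", "worksFor", "dept0"),
    ("prof3", "worksFor", "dept1"),
    ("prof4", "worksFor", "dept0"),
]
advisor_rows = [t for t in triples if t[1] == "advisor"]
works_rows = [t for t in triples if t[1] == "worksFor"]

ids: Dict[str, int] = {}
bm_advisor = build_bitmap((t[2] for t in advisor_rows), ids)  # ?p as object
bm_works = build_bitmap((t[0] for t in works_rows), ids)      # ?p as subject
join_bitmap = bm_advisor & bm_works  # values of ?p present on both sides

# Drop rows that cannot contribute to the join before joining.
advisor_kept = filter_by_bitmap(advisor_rows, lambda t: t[2], ids, join_bitmap)
works_kept = filter_by_bitmap(works_rows, lambda t: t[0], ids, join_bitmap)

result = [(s, p, d)
          for (s, _, p) in advisor_kept
          for (p2, _, d) in works_kept
          if p == p2]
print(result)
```

In a MapReduce setting, a filter of this kind would presumably be applied on the map side so that non-matching triples are never shuffled to reducers; that placement is an assumption here, not something the abstract states.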

Project Information

Research project host institution : National Research Foundation of Korea
