Subtree-based XML Storage and XPath Processing

  • Shin, Ki-Hoon (School of Computer Science and Engineering, Chung-Ang University)
  • Kang, Hyun-Chul (School of Computer Science and Engineering, Chung-Ang University)
  • Received : 2010.08.27
  • Accepted : 2010.09.07
  • Published : 2010.10.30

Abstract

State-of-the-art techniques for storing XML data, modeled as an XML tree, are node-based: they are centered on XML node labeling, and the storage unit is a single XML node. In this paper, we propose a generalization of these techniques in which the storage unit is an XML subtree consisting of one or more nodes. This generalization has several advantages, but it raises a major problem: XPath processing becomes inefficient if the stored subtrees must be parsed on the fly to access the nodes inside them. We solve this problem with a technique that requires no parsing of the subtrees involved in XPath processing unless they contain nodes of the final query result. We prove that our technique guarantees the correctness of XPath processing. Through implementation and experiments, we also show that the overhead of our technique is acceptable.
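The idea of deferring subtree parsing can be illustrated with a minimal sketch. It is not the technique proposed in the paper, whose details the abstract does not give. It assumes, purely for illustration, that each stored subtree carries a summary of the label paths of the nodes inside it, and that queries use only the child axis. The names `SubtreeRecord`, `store`, `query`, and `DOC` are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class SubtreeRecord:
    root_path: str     # label path from the document root to the subtree root
    inner_paths: set   # label paths of all nodes inside the subtree (assumed summary)
    payload: str       # serialized subtree, left unparsed until it is needed


def label_paths(elem, prefix):
    """Yield the label path of elem and of every descendant element."""
    path = f"{prefix}/{elem.tag}"
    yield path
    for child in elem:
        yield from label_paths(child, path)


def store(xml_text):
    """Split a document into one storage unit per child of the root (an assumption)."""
    root = ET.fromstring(xml_text)
    return [
        SubtreeRecord(
            root_path=f"/{root.tag}/{child.tag}",
            inner_paths=set(label_paths(child, f"/{root.tag}")),
            payload=ET.tostring(child, encoding="unicode"),
        )
        for child in root
    ]


def query(records, path):
    """Evaluate a child-axis-only path; return (text values, number of subtrees parsed)."""
    results, parsed = [], 0
    for rec in records:
        if path not in rec.inner_paths:
            continue  # decided from the summary alone; the payload stays unparsed
        parsed += 1
        subtree = ET.fromstring(rec.payload)
        # Steps below the subtree root: drop "", the document root, and the subtree root.
        steps = path.split("/")[3:]
        matches = subtree.findall("/".join(steps)) if steps else [subtree]
        results.extend(m.text for m in matches)
    return results, parsed


DOC = (
    "<site>"
    "<people><person><name>Ann</name></person>"
    "<person><name>Bob</name></person></people>"
    "<regions><item><name>Lamp</name></item></regions>"
    "</site>"
)

records = store(DOC)
names, parsed = query(records, "/site/people/person/name")
print(names, parsed)
```

In this sketch the `regions` subtree is never parsed, because its summary shows that it cannot contribute to the result. Only the `people` subtree, which holds result nodes, is deserialized.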
