An Efficient Multi-Query Evaluation Technique for Streaming XML Data

  • Published: 2007.06.15

Abstract

Recently, there has been growing interest in streaming XML data. Much of the work on streaming XML data has focused on efficient filtering of XML data. Such XML filtering systems deliver whole XML documents to interested users, which leaves users with the burden of extracting the fragments of interest from those documents. Several evaluation techniques have therefore been proposed that evaluate XML queries directly on streaming XML data and extract only the fragments of interest. However, existing evaluation techniques for streaming XML data support only a restricted subset of XPath queries and cannot evaluate multiple queries. In this paper, we propose XTREAM, which evaluates multiple queries simultaneously in a single pass over the XML data, in keeping with the read-once nature of streaming data. In contrast to previous work, XTREAM also supports a wide class of XPath features, including order-based predicates. Experimental results with real-life and synthetic XML data demonstrate the efficiency and scalability of XTREAM.
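
The idea of evaluating several queries in one pass over a stream, and returning only the matching fragments instead of whole documents, can be illustrated with a minimal Python sketch. This is not XTREAM's algorithm, which the abstract does not detail. The sketch assumes only absolute child-axis paths (no predicates, no descendant axis), and the function name `evaluate_queries`, the query ids, and the sample document are all hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def evaluate_queries(xml_stream, queries):
    """Evaluate several absolute child-axis paths in one pass over the stream.

    `queries` maps a (hypothetical) query id to a path such as "/a/b/c".
    Returns a dict from query id to the list of matched element texts.
    """
    # Pre-split each path into a tuple of tag names for cheap comparison.
    targets = {qid: tuple(p.strip("/").split("/")) for qid, p in queries.items()}
    results = {qid: [] for qid in queries}
    path = []  # tags of the currently open elements, root first

    # iterparse reads the input incrementally and reports start/end events.
    for event, elem in ET.iterparse(xml_stream, events=("start", "end")):
        if event == "start":
            path.append(elem.tag)
        else:
            current = tuple(path)
            # Check every registered query against the element just closed.
            for qid, target in targets.items():
                if current == target:
                    results[qid].append((elem.text or "").strip())
            path.pop()
            elem.clear()  # release the content of the finished element
    return results


# Hypothetical sample document and queries, for illustration only.
doc = (
    b"<site><people>"
    b"<person><name>Ann</name><age>30</age></person>"
    b"<person><name>Bob</name><age>41</age></person>"
    b"</people></site>"
)
results = evaluate_queries(
    io.BytesIO(doc),
    {"q1": "/site/people/person/name", "q2": "/site/people/person/age"},
)
print(results)
```

Both queries are answered from the same sequence of parser events, so the input is read only once. Supporting predicates or the descendant axis would require per-query state beyond the simple path comparison used here.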
