An Efficient Technique for Evaluating Queries with Multiple Regular Path Expressions

다중 정규 경로 질의 처리를 위한 효율적 기법

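  • 정태선 (School of Computer Science and Engineering, Seoul National University)
  • 김형주 (School of Computer Science and Engineering, Seoul National University)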
  • Published : 2001.09.01

Abstract

As XML emerges as a standard for information exchange on the World Wide Web, the database community has taken an interest in extracting information from XML viewed as a database model. XML queries are based on regular path queries, which find the objects reachable along paths that match given regular expressions. Answering many kinds of user queries requires evaluating queries that contain multiple regular path expressions. However, previous work on query rewriting and query optimization in the framework of semistructured data has dealt with queries containing a single regular expression. For queries with multiple regular expressions, we propose a two-phase optimization technique: (1) rewrite the query using views by finding mappings from the view's body to the query's body, and (2) evaluate each conjunct of the rewritten query and combine the results. We show that our rewriting algorithm is sound and that our query evaluation technique is more efficient than previous work on optimizing semistructured queries.

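With the recent emergence of XML as a standard for document exchange on the Web, query processing over data represented in XML has been attracting attention. XML queries are based on regular path queries, which find the objects in a data graph that are reachable by a given regular expression. Handling the variety of queries that users pose requires processing queries that contain multiple regular expressions, yet existing work on query rewriting using views and on query optimization under the semistructured data model has mainly dealt with queries consisting of a single regular expression. For queries with multiple regular expressions, this paper proposes an efficient query processing technique with two phases: (1) query rewriting through variable mappings from the view's body to the query's body, and (2) efficiently computing the result of each conjunct of the rewritten query and combining the results. We show that the proposed query rewriting algorithm is sound and that the query processing technique is more efficient than existing query processing methods.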
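
The evaluation side of a query with two regular path expressions can be illustrated with a minimal Python sketch. It does not reproduce the paper's algorithm and leaves out the view-based rewriting phase entirely. It only shows what it means to evaluate each conjunct as a regular path query over an edge-labeled data graph and then combine the conjunct results on their shared variable. The graph, the hand-written automata, and the names `eval_path`, `eval_conjunct`, and `join` are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical edge-labeled data graph: node -> list of (label, target).
GRAPH = {
    "root": [("book", "b1"), ("book", "b2")],
    "b1": [("author", "a1"), ("title", "t1")],
    "b2": [("section", "s1")],
    "s1": [("author", "a2")],
    "a1": [("name", "n1")],
    "a2": [("name", "n2")],
}
NODES = set(GRAPH) | {t for edges in GRAPH.values() for _, t in edges}

# Hand-written DFA for the path expression  book . _* . author
# ("_" is a wildcard used when no exact label transition exists).
AUTHOR_DFA = {
    (0, "book"): 1,
    (1, "author"): 2, (1, "_"): 1,
    (2, "author"): 2, (2, "_"): 1,
}
# Hand-written DFA for the single-label path expression  name
NAME_DFA = {(0, "name"): 1}


def eval_path(graph, dfa, accepting, source):
    """Nodes reachable from source along a path whose labels the DFA accepts.

    Breadth-first search over (node, automaton state) pairs, starting in state 0.
    """
    seen = {(source, 0)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accepting:
            answers.add(node)
        for label, target in graph.get(node, []):
            nxt = dfa.get((state, label), dfa.get((state, "_")))
            if nxt is not None and (target, nxt) not in seen:
                seen.add((target, nxt))
                queue.append((target, nxt))
    return answers


def eval_conjunct(graph, sources, dfa, accepting):
    """Evaluate one conjunct  x -R-> y  as a set of (x, y) pairs."""
    return {(x, y) for x in sources for y in eval_path(graph, dfa, accepting, x)}


def join(left, right):
    """Combine conjuncts  x -R1-> y  and  y -R2-> z  on the shared variable y."""
    return {(x, y, z) for (x, y) in left for (y2, z) in right if y == y2}


# Query:  root -(book._*.author)-> y,  y -(name)-> z
authors = eval_conjunct(GRAPH, {"root"}, AUTHOR_DFA, {2})
names = eval_conjunct(GRAPH, NODES, NAME_DFA, {1})
result = join(authors, names)
print(sorted(result))
```

On this toy graph the first conjunct binds `y` to `a1` and `a2`, and the join yields `(root, a1, n1)` and `(root, a2, n2)`. The second conjunct is evaluated from every node, which is the naive choice. A real optimizer would restrict it to the bindings that the first conjunct produces.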
