Equivalence Heuristics for Malleability-Aware Skylines

  • Lofi, Christoph (Institut für Informationssysteme, Technische Universität Braunschweig)
  • Balke, Wolf-Tilo (Institut für Informationssysteme, Technische Universität Braunschweig)
  • Güntzer, Ulrich (Institut für Informatik, Universität Tübingen)
  • Received : 2012.07.10
  • Accepted : 2012.08.20
  • Published : 2012.09.30

Abstract

In recent years, the skyline query paradigm has been established as a reliable method for database query personalization. While early efficiency problems have been solved by sophisticated algorithms and advanced indexing, new challenges in skyline retrieval effectiveness continue to arise. In particular, the rise of the Semantic Web and linked open data leads to personalization issues where skyline queries cannot be applied easily. We addressed the special challenges presented by linked open data in previous work, and now extend that work with a heuristic workflow to boost efficiency. This is necessary because the new view on dominance in linked open data has serious implications for the efficiency of the actual skyline computation: transitivity of the dominance relationships is no longer guaranteed. Our contributions in this paper can be summarized as follows: we present an intuitive skyline query paradigm to deal with linked open data; we provide an effective dominance definition and establish its theoretical properties; we develop innovative skyline algorithms to deal with the resulting challenges; and we design efficient heuristics for the case of predicate equivalences, which may occur often in linked open data. We extensively evaluate our new algorithms with respect to performance and the enriched skyline semantics.
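
To illustrate why transitivity matters for skyline computation, the following is a minimal sketch of the classical skyline over Pareto dominance. It is not the malleability-aware dominance definition the paper introduces; the function names `dominates` and `skyline`, the sample data, and the "larger is better" convention are all assumptions made for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """Classical Pareto dominance (assumed convention: larger is better).

    a dominates b if a is at least as good in every dimension
    and strictly better in at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(points: list[Point]) -> list[Point]:
    """Naive skyline: keep every point that no other point dominates.

    Each point is compared against all points, so this version does not
    rely on transitivity of the dominance relation.
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-dimensional sample data.
hotels = [(3, 9), (5, 7), (4, 6), (5, 5), (1, 2)]
print(skyline(hotels))  # [(3, 9), (5, 7)]
```

With a transitive dominance relation, an algorithm may discard a dominated object immediately, because anything it dominates is also dominated by its dominator. The exhaustive comparison above avoids that shortcut at quadratic cost, which hints at why losing transitivity affects efficiency.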
