Minimizing the MOLAP/ROLAP Divide: You Can Have Your Performance and Scale It Too

  • Eavis, Todd (Department of Computer Science and Software Engineering, Concordia University)
  • Taleb, Ahmad (College of Computer Science and Information Systems, Najran University)
  • Received: 2013.01.09
  • Accepted: 2013.02.04
  • Published: 2013.03.30

Abstract

Over the past generation, data warehousing and online analytical processing (OLAP) applications have become the cornerstone of contemporary decision support environments. Typically, OLAP servers are implemented either on top of proprietary array-based storage engines (MOLAP) or as extensions to conventional relational DBMSs (ROLAP). While MOLAP systems do provide impressive performance on common analytics queries, they tend to have limited scalability. Conversely, ROLAP's table-oriented model scales quite nicely but offers, at best, mediocre performance relative to MOLAP systems. In this paper, we describe a storage and indexing framework that aims to provide both MOLAP-like performance and ROLAP-like scalability by combining some of the best features of both. Based upon a combination of R-trees and bitmap indexes, the storage engine has been integrated with a robust OLAP query engine prototype that is able to fully exploit the efficiency of the proposed storage model. Specifically, it utilizes an OLAP algebra coupled with a domain-specific query optimizer to map user queries directly to the storage and indexing framework. Experimental results demonstrate that the design not only improves upon more naive approaches but also offers the potential to optimize both query performance and scalability.
