Abstracted Partitioned-Layer Index: A Top-k Query Processing Method Reducing the Number of Random Accesses of the Partitioned-Layer Index  

Heo, Jun-Seok (Department of Computer Science, KAIST)
Abstract
Top-k queries return the k objects in the database that users most want. The Partitioned-Layer Index (PL-index) is a representative method for processing top-k queries efficiently. The PL-index partitions the database into a number of smaller databases and then, for each partitioned database, constructs a list of sublayers over it. Here, the $i$-th sublayer of a partitioned database contains the objects that can be the top-$i$ object in that partitioned database. To retrieve the top-k results, the PL-index merges the sublayer lists according to the user's query. The PL-index has the advantage of reading only a very small number of objects from the database when processing a query. However, because merging the sublayer lists incurs many random accesses, the query performance of the PL-index is poor in environments such as disk-based databases. In this paper, we propose the Abstracted Partitioned-Layer Index (APL-index), which significantly improves the query performance of the PL-index in disk-based environments by reducing the number of random accesses. First, by abstracting each sublayer of the PL-index into a virtual (point) object, we transform the lists of sublayers into lists of virtual objects (i.e., the APL-index). Then, we virtually process the given query using the APL-index and thereby predict the sublayers that will be read when the query is actually processed. Next, we read the predicted sublayers of each sublayer list at once, which reduces the number of random accesses that occur in the PL-index. Experimental results on synthetic and real data sets show that the proposed APL-index significantly reduces the number of random accesses occurring in the PL-index.
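The predict-then-batch-read idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm, and everything beyond the abstract is an assumption made for illustration: the scoring function is linear with non-negative weights and lower scores are better; a sublayer is abstracted into the component-wise minimum of its objects (the abstract does not say how the virtual object is built); the prediction is a simple heap merge over virtual objects; and a verification loop is added so the result stays exact when the prediction falls short. The names `abstract_sublayer`, `predict_depths`, and `apl_top_k` are hypothetical.

```python
import heapq
from math import inf


def score(weights, point):
    """Linear scoring function; lower is better (assumption of this sketch)."""
    return sum(w * x for w, x in zip(weights, point))


def abstract_sublayer(sublayer):
    """Hypothetical abstraction: component-wise minimum of a non-empty sublayer.

    With non-negative weights, its score is a lower bound on the score of
    every object in the sublayer.
    """
    return tuple(min(column) for column in zip(*sublayer))


def predict_depths(virtual_lists, sizes, weights, k):
    """Virtually merge the lists; return how many leading sublayers of each
    list are predicted to be needed (a heuristic, not the paper's rule)."""
    depths = [0] * len(virtual_lists)
    heap = [(score(weights, vl[0]), i) for i, vl in enumerate(virtual_lists) if vl]
    heapq.heapify(heap)
    covered = 0
    while heap and covered < k:
        _, i = heapq.heappop(heap)
        covered += sizes[i][depths[i]]
        depths[i] += 1
        if depths[i] < len(virtual_lists[i]):
            heapq.heappush(heap, (score(weights, virtual_lists[i][depths[i]]), i))
    return depths


def apl_top_k(sublayer_lists, weights, k):
    """Return (top-k objects, number of batched reads)."""
    if k <= 0:
        return [], 0
    # Virtual objects and sublayer sizes are assumed small enough to keep in memory.
    virtual_lists = [[abstract_sublayer(s) for s in lst] for lst in sublayer_lists]
    sizes = [[len(s) for s in lst] for lst in sublayer_lists]
    depths = predict_depths(virtual_lists, sizes, weights, k)
    read_upto = [0] * len(sublayer_lists)
    candidates, batch_reads = [], 0
    while True:
        # One batched read per list that still has predicted, unread sublayers.
        for i, d in enumerate(depths):
            if d > read_upto[i]:
                batch_reads += 1
                for sublayer in sublayer_lists[i][read_upto[i]:d]:
                    candidates.extend(sublayer)
                read_upto[i] = d
        best = heapq.nsmallest(k, candidates, key=lambda p: score(weights, p))
        kth = score(weights, best[-1]) if len(best) == k else inf
        # Verification: an unread sublayer whose lower bound beats the current
        # k-th score may still hold a result, so extend the read to cover it.
        changed = False
        for i, vl in enumerate(virtual_lists):
            for j in range(read_upto[i], len(vl)):
                if score(weights, vl[j]) < kth:
                    depths[i] = j + 1
                    changed = True
        if not changed:
            return best, batch_reads
```

Calling `apl_top_k(sublayer_lists, weights, k)` with `sublayer_lists` given as a list of sublayer lists (each sublayer a list of coordinate tuples) returns the k best objects and the number of batched reads issued. In this sketch each batched read stands in for one random access followed by sequential reads, which is the cost the abstract aims to reduce.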
Keywords
Top-k query processing; Partitioned-Layer Index; Random access