LI Qiusheng1, WU Yadong1, LIN Maosong2, WANG Song2,3, WANG Haiyang1, FENG Xinmiao1
1. School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang Sichuan 621010, China;
2. School of Information Engineering, Southwest University of Science and Technology, Mianyang Sichuan 621010, China;
3. Institute of Electronic Engineering, China Academy of Engineering Physics, Mianyang Sichuan 621010, China
Top-k dominating query algorithm requires high consumption of time and space to build combined indexes on the attributes, and the query accuracy is low for the data with same attribute values. To solve these problems, a Ranking-k algorithm was given in this paper. The proposed Ranking-k algorithm is a new subspace dominating query algorithm combining the B+-trees with probability distribution model. Firstly, the ordered lists for each data attribute were constructed by the B+-trees. Secondly, the round-robin scheduling algorithm was used to scan ordered attribute lists satisfying skyline criterion. Some candidate tuples were generated and k end tuples were obtained. Thirdly, the dominated scores of end tuples were calculated by using the probability distribution model according to the generated candidate tuples and end tuples. Through iterating the above process, the optimal query results were obtained. The experimental results show that the overall query efficiency of the proposed Ranking-k algorithm is improved by 94.43% compared with the Basic-Scan Algorithm (BSA) and by 7.63% compared with the Differential Algorithm (DA), and the query results of Ranking-k algorithm are much closer to theoretical values in comparison of the Top-k Dominating with Early Pruning (TDEP) algorithm, BSA and DA.
[1] FAGIN R, LOTEM A, NAOR M. Optimal aggregation algorithms for middleware [C]// PODS'01: Proceedings of the Twentieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. New York: ACM, 2001: 102-113. [2] VAGELIS H, NICK K, PAPAKONSTANTINOU Y. PREFER: a system for the efficient execution of multi-parametric ranked queries [C]// Proceeding of the 2001 ACM SIGMOD International Conference on Management of Data. New York: ACM, 2001: 259-270. [3] BORZSONYI S, KOSSMANN D, STOCKER K. The skyline operator [C]// Proceedings of the 17th International Conference on Data Engineering. Piscataway: IEEE, 2001: 421-430. [4] PAPADIAS D, TAO Y, FU G, et al. Progressive skyline computa- tion in database systems [J]. ACM Transactions on Database Systems, 2005, 30(1): 41-82. [5] YIU M, MAMOULIS N. Efficient processing of Top-k dominating queries on multi-dimensional data [C]// VLDB'07: Proceedings of the 33rd International Conference on Very Large Data Bases. [S.l.]: VLDB Endowment, 2007: 483-494. [6] YIU M, MAMOULIS N. Multi-dimensional Top-k dominating que-ries [J]. The International Journal on Very Large Data Bases, 2009, 18(3): 695-718. [7] KONTAKI M, PAPADOPOULOS Y. Continuous Top-k dominating queries [J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 24(5): 840-853. [8] ZHANG W, LIN X, ZHANG Y, et al. Threshold-based probabilistic Top-k dominating queries [J]. The International Journal on Very Large Data Bases, 2010, 19(2): 283-305. [9] TIAKAS E, PAPADOPOULOS A N, MANOLOPOULOS Y. Pro-gressive processing of subspace dominating queries [J]. The International Journal on Very Large Data Bases, 2011, 20(6): 921-948. [10] HAN X, YANG D, LI J. An efficient Top-k dominating algorithm on massive data title [J]. Chinese Journal of Computers, 2013,33(8): 1405-1417.(韩希先,杨东华,李建中.TKEP:海量数据上一种有效的Top-k查询处理算法[J].计算机学报,2013,33(8): 1405-1417.) [11] TAO Y, XIAO X, PEI J. SUBSKY: efficient computation of skylines in subspaces [C]// Proceedings of the 22nd International Conference on Data Engineering. Piscataway: IEEE, 2006: 65-76. [12] XIA T, ZHANG D. Refreshing the sky: the compressed skycube with efficient support for frequent updates [C]// Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data. New York: ACM, 2006: 491-502. [13] YUAN Y, LIN X, LIU Q, et al. Efficient computation of the skyline cube [C]// Proceedings of the 31st International Conference on Very Large Data Bases. [S.l.]: VLDB Endowment, 2005: 241-252. [14] TAO Y, XIAO X, PEI J. Efficient skyline and Top-k retrieval in subspaces [J]. IEEE Transactions on Knowledge and Data Engineering, 2007, 19(8): 1072-1088. [15] ZHANG Z, LU H, OOI C, et al. Understanding the meaning of a shifted sky: a general framework on extending skyline query [J]. The International Journal on Very Large Data Bases, 2010, 19(2): 181-201.