Skylines assume that all attributes are equally important, as each dimension can always be traded off for another. Prioritized skylines (p-skylines) take into account non-compensatory preferences, where some dimensions are deemed more important than others, and trade-offs are constrained by the relative importance of the attributes involved. In this paper we show that querying using non-compensatory preferences is computationally efficient. We focus on preferences that are representable with p-expressions, and develop an efficient in-memory divide-and-conquer algorithm for answering p-skyline queries. Our algorithm is output-sensitive; this is very desirable in the context of preference queries, since the output is expected to be, on average, only a small fraction of the input. We prove that our method is well behaved in both the worst- and the average-case scenarios. Additionally, we develop a general framework for benchmarking p-skyline algorithms, showing how to sample prioritized preference relations uniformly, and how to highlight the effect of data correlation on performance. We conclude our study with extensive experimental results.
Meneghetti, N., Mindolin, D., Ciaccia, P., Chomicki, J. (2015). Output-sensitive evaluation of prioritized skyline queries. New York, NY : Association for Computing Machinery [10.1145/2723372.2723736].
Output-sensitive evaluation of prioritized skyline queries
CIACCIA, PAOLO;
2015
Abstract
Skylines assume that all attributes are equally important, as each dimension can always be traded off for another. Prioritized skylines (p-skylines) take into account non-compensatory preferences, where some dimensions are deemed more important than others, and trade-offs are constrained by the relative importance of the attributes involved. In this paper we show that querying using non-compensatory preferences is computationally efficient. We focus on preferences that are representable with p-expressions, and develop an efficient in-memory divide-and-conquer algorithm for answering p-skyline queries. Our algorithm is output-sensitive; this is very desirable in the context of preference queries, since the output is expected to be, on average, only a small fraction of the input. We prove that our method is well behaved in both the worst- and the average-case scenarios. Additionally, we develop a general framework for benchmarking p-skyline algorithms, showing how to sample prioritized preference relations uniformly, and how to highlight the effect of data correlation on performance. We conclude our study with extensive experimental results.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.