In a deterministic relation, tuple u dominates tuple v if u is no worse than v on all attributes, and better than v on at least one attribute. This concept is at the heart of skyline queries, that return the set of undominated tuples. In this paper we extend the notion of skyline to probabilistic relations by generalizing to this context the definition of tuple domination. Our approach is parametric in the semantics for ranking probabilistic tuples and, being it based on order-theoretic principles, preserves the three properties the skyline has in the deterministic case: it equals the union of all top1 results of monotone scoring functions, it requires no additional parameter, and it is insensitive to attribute scales. We then show how domination among probabilistic tuples can be efficiently checked by means of a set of rules. We detail rules for the cases in which tuples are ranked using either the "expected rank" or the "expected score" semantics, and explain how the approach can be applied to other semantics as well. Since computing the skyline of a probabilistic relation is a time-consuming task, we introduce algorithms for checking domination rules in an optimized way. Experiments show that these algorithms can reduce execution times with respect to a naive evaluation.
I. Bartolini, P. Ciaccia, M. Patella (2013). The Skyline of a Probabilistic Relation. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 25(7), 1656-1669 [10.1109/TKDE.2012.102].
The Skyline of a Probabilistic Relation
BARTOLINI, ILARIA;CIACCIA, PAOLO;PATELLA, MARCO
2013
Abstract
In a deterministic relation, tuple u dominates tuple v if u is no worse than v on all attributes, and better than v on at least one attribute. This concept is at the heart of skyline queries, that return the set of undominated tuples. In this paper we extend the notion of skyline to probabilistic relations by generalizing to this context the definition of tuple domination. Our approach is parametric in the semantics for ranking probabilistic tuples and, being it based on order-theoretic principles, preserves the three properties the skyline has in the deterministic case: it equals the union of all top1 results of monotone scoring functions, it requires no additional parameter, and it is insensitive to attribute scales. We then show how domination among probabilistic tuples can be efficiently checked by means of a set of rules. We detail rules for the cases in which tuples are ranked using either the "expected rank" or the "expected score" semantics, and explain how the approach can be applied to other semantics as well. Since computing the skyline of a probabilistic relation is a time-consuming task, we introduce algorithms for checking domination rules in an optimized way. Experiments show that these algorithms can reduce execution times with respect to a naive evaluation.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.