In a deterministic relation, tuple u dominates tuple v if u is no worse than v on all attributes, and better than v on at least one attribute. This concept is at the heart of skyline queries, that return the set of undominated tuples. In this paper we extend the notion of skyline to probabilistic relations by generalizing to this context the definition of tuple domination. Our approach is parametric in the semantics for ranking probabilistic tuples and, being it based on order-theoretic principles, preserves the three properties the skyline has in the deterministic case: it equals the union of all top1 results of monotone scoring functions, it requires no additional parameter, and it is insensitive to attribute scales. We then show how domination among probabilistic tuples can be efficiently checked by means of a set of rules. We detail rules for the cases in which tuples are ranked using either the "expected rank" or the "expected score" semantics, and explain how the approach can be applied to other semantics as well. Since computing the skyline of a probabilistic relation is a time-consuming task, we introduce algorithms for checking domination rules in an optimized way. Experiments show that these algorithms can reduce execution times with respect to a naive evaluation.

The Skyline of a Probabilistic Relation

BARTOLINI, ILARIA;CIACCIA, PAOLO;PATELLA, MARCO
2013

Abstract

In a deterministic relation, tuple u dominates tuple v if u is no worse than v on all attributes, and better than v on at least one attribute. This concept is at the heart of skyline queries, that return the set of undominated tuples. In this paper we extend the notion of skyline to probabilistic relations by generalizing to this context the definition of tuple domination. Our approach is parametric in the semantics for ranking probabilistic tuples and, being it based on order-theoretic principles, preserves the three properties the skyline has in the deterministic case: it equals the union of all top1 results of monotone scoring functions, it requires no additional parameter, and it is insensitive to attribute scales. We then show how domination among probabilistic tuples can be efficiently checked by means of a set of rules. We detail rules for the cases in which tuples are ranked using either the "expected rank" or the "expected score" semantics, and explain how the approach can be applied to other semantics as well. Since computing the skyline of a probabilistic relation is a time-consuming task, we introduce algorithms for checking domination rules in an optimized way. Experiments show that these algorithms can reduce execution times with respect to a naive evaluation.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
I. Bartolini; P. Ciaccia; M. Patella
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/120945
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 18
  • ???jsp.display-item.citation.isi??? 16
social impact