The problem of aggregating scores, so as to provide a ranking of objects in a dataset according to different evaluation criteria, is central to many modern data-intensive applications. Although efficient (instance optimal) algorithms exist to this purpose (such as the Threshold Algorithm TA and its variants) none of them is able to deal with scenarios in which the function used to aggregate scores is only partially specified. This is the typical case when the function is a weighted sum, and the user is unable to provide precise values for the weights. In this paper, we consider the problem of processing multi-source top-k queries, when only constraints, rather than precise values, are available for the weights. After observing that the so-called Fagin's Algorithm (FA) can be adapted to solve the problem, yet only when no constraints at all are present (a case in which our queries will return the k-skyband of the dataset), we introduce the novel FSA algorithm, which we prove to be instance optimal for any set of constraints on the weights. We also propose several optimizations to the basic FSA logic so as to improve execution times. Experimental analysis on both real and synthetic datasets shows that our optimizations are indeed highly effective and that the increased flexibility provided by FSA introduces little overhead with respect to the case of classical top-k queries.

Ciaccia, P., Martinenghi, D. (2018). FA + TA < FSA: Flexible score aggregation. 1515 BROADWAY, NEW YORK, NY 10036-9998 USA : Association for Computing Machinery [10.1145/3269206.3271753].

FA + TA < FSA: Flexible score aggregation

Ciaccia, Paolo;
2018

Abstract

The problem of aggregating scores, so as to provide a ranking of objects in a dataset according to different evaluation criteria, is central to many modern data-intensive applications. Although efficient (instance optimal) algorithms exist to this purpose (such as the Threshold Algorithm TA and its variants) none of them is able to deal with scenarios in which the function used to aggregate scores is only partially specified. This is the typical case when the function is a weighted sum, and the user is unable to provide precise values for the weights. In this paper, we consider the problem of processing multi-source top-k queries, when only constraints, rather than precise values, are available for the weights. After observing that the so-called Fagin's Algorithm (FA) can be adapted to solve the problem, yet only when no constraints at all are present (a case in which our queries will return the k-skyband of the dataset), we introduce the novel FSA algorithm, which we prove to be instance optimal for any set of constraints on the weights. We also propose several optimizations to the basic FSA logic so as to improve execution times. Experimental analysis on both real and synthetic datasets shows that our optimizations are indeed highly effective and that the increased flexibility provided by FSA introduces little overhead with respect to the case of classical top-k queries.
2018
International Conference on Information and Knowledge Management, Proceedings
57
66
Ciaccia, P., Martinenghi, D. (2018). FA + TA < FSA: Flexible score aggregation. 1515 BROADWAY, NEW YORK, NY 10036-9998 USA : Association for Computing Machinery [10.1145/3269206.3271753].
Ciaccia, Paolo; Martinenghi, Davide
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/663724
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 7
  • ???jsp.display-item.citation.isi??? 6
social impact