Information flooding may occur during an OLAP session when the user drills down a cube to a very fine-grained level: the huge number of facts returned makes them very hard to analyze in a pivot table. To overcome this problem, we propose a novel OLAP operation, called shrink, that balances data precision against data size when cubes are visualized through pivot tables. The shrink operation fuses slices of similar data and replaces them with a single representative slice, respecting the constraints imposed by dimension hierarchies, until either the size or the error of the result is below a given threshold. Computing the shrink operation optimally has exponential complexity, so we present both a greedy algorithm based on agglomerative clustering, which returns a sub-optimal solution, and a branch-and-bound algorithm, which returns an optimal one. Finally, we discuss experimental results that evaluate the shrink operation in terms of efficiency and effectiveness.
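To make the idea of fusing similar slices concrete, the following is a minimal sketch of a greedy, agglomerative variant under a size threshold. It is not the authors' algorithm: the function names (`greedy_shrink`, `mean_slice`, `sse`), the use of the mean as the representative slice, the sum of squared errors as the error measure, and the simplified hierarchy constraint (two groups may be fused only if their members share the same parent) are all assumptions made for illustration.

```python
from itertools import combinations


def mean_slice(rows):
    """Representative slice (assumption): the cell-wise mean of the fused slices."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sse(rows):
    """Error (assumption): sum of squared differences from the representative."""
    rep = mean_slice(rows)
    return sum((v - r) ** 2 for row in rows for v, r in zip(row, rep))


def greedy_shrink(slices, parent, max_size):
    """Fuse slices greedily until at most max_size slices remain.

    slices: dict mapping a member (e.g. a city) to its list of measure values.
    parent: dict mapping each member to its parent in the hierarchy (e.g. a state).
    Hypothetical simplification: only groups whose members share a parent may fuse.
    """
    clusters = [[m] for m in slices]

    def cost(cluster):
        return sse([slices[m] for m in cluster])

    while len(clusters) > max_size:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Hierarchy constraint: skip pairs that belong to different parents.
            if parent[clusters[a][0]] != parent[clusters[b][0]]:
                continue
            # Increase in total error caused by fusing the two groups.
            delta = cost(clusters[a] + clusters[b]) - cost(clusters[a]) - cost(clusters[b])
            if best is None or delta < best[0]:
                best = (delta, a, b)
        if best is None:
            break  # no admissible fusion is left
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]

    return [(sorted(c), mean_slice([slices[m] for m in c])) for c in clusters]


# Illustrative data (made up): two measures per city slice.
slices = {"Miami": [10, 12], "Orlando": [11, 12], "Tampa": [30, 31], "Austin": [10, 12]}
parent = {"Miami": "FL", "Orlando": "FL", "Tampa": "FL", "Austin": "TX"}
result = greedy_shrink(slices, parent, max_size=3)
```

With this made-up data, Miami and Orlando are fused first because their slices are the most similar among the admissible pairs. Austin is never fused with a Florida city even though its values match Miami's, because the two belong to different parents. Being greedy, the sketch offers no optimality guarantee, in line with the sub-optimal solution the abstract attributes to the clustering-based approach.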

Simone Graziani, Matteo Golfarelli, Stefano Rizzi (2014). Shrink: An OLAP Operation for Balancing Precision and Size of Pivot Tables. DATA & KNOWLEDGE ENGINEERING, 93, 19-41 [10.1016/j.datak.2014.07.004].

Shrink: An OLAP Operation for Balancing Precision and Size of Pivot Tables

Simone Graziani; Matteo Golfarelli; Stefano Rizzi
2014

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/341921