Stefano Rizzi, Matteo Golfarelli, Simone Graziani (2015). An OLAM Operator for Multi-Dimensional Shrink. INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 11(3), 68-97 [10.4018/IJDWM.2015070104].
An OLAM Operator for Multi-Dimensional Shrink
Rizzi, Stefano; Golfarelli, Matteo; Graziani, Simone
2015
Abstract
Shrink is an OLAM (On-Line Analytical Mining) operator based on hierarchical clustering, and it has been previously proposed in mono-dimensional form to balance precision with size in the visualization of cubes via pivot tables during OLAP analyses. It can be applied to the cube resulting from a query to decrease its size while controlling the approximation introduced; the idea is to fuse similar facts together and replace them with a single representative fact, respecting the bounds posed by dimension hierarchies. In this paper we propose a multi-dimensional generalization of the shrink operator, where facts are fused along multiple dimensions. Multi-dimensional shrink comes in two flavors: lazy and eager, where the bounds posed by hierarchies are respectively weaker and stricter. Greedy algorithms based on agglomerative clustering are presented for both lazy and eager shrink, and experimentally evaluated in terms of efficiency and effectiveness.
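To make the fusion idea concrete, here is a minimal Python sketch of a greedy, agglomerative shrink along a single dimension. It is not the authors' algorithm. The abstract does not specify the error measure, the stopping rule, or the exact hierarchy constraints, so the sketch assumes all three: the approximation is measured as the sum of squared errors (SSE), the representative fact is the mean of the fused facts, fusion stops at a size budget, and two groups may be fused only if their members share the same parent in the hierarchy. The function names, the `parent` mapping, and the sample data are hypothetical.

```python
from itertools import combinations


def mean_vector(rows):
    """Representative fact: component-wise mean of the fused rows (an assumption)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sse(rows):
    """Sum of squared errors between each row and the representative."""
    rep = mean_vector(rows)
    return sum((v - r) ** 2 for row in rows for v, r in zip(row, rep))


def greedy_shrink(facts, parent, max_size):
    """Fuse similar members until at most max_size groups remain.

    facts:  dict mapping a dimension member to its list of measure values
    parent: dict mapping a member to its parent in the hierarchy (hypothetical)
    Only groups whose members share a parent may be fused (assumed constraint).
    """
    clusters = [
        {"members": [m], "rows": [row], "parent": parent[m]}
        for m, row in facts.items()
    ]
    while len(clusters) > max_size:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            a, b = clusters[i], clusters[j]
            if a["parent"] != b["parent"]:
                continue  # hierarchy bound: never fuse across different parents
            # Increase in approximation error caused by fusing a and b.
            cost = sse(a["rows"] + b["rows"]) - sse(a["rows"]) - sse(b["rows"])
            if best is None or cost < best[0]:
                best = (cost, i, j)
        if best is None:
            break  # no admissible pair left; the budget cannot be met
        _, i, j = best
        a, b = clusters[i], clusters[j]
        merged = {
            "members": a["members"] + b["members"],
            "rows": a["rows"] + b["rows"],
            "parent": a["parent"],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [(sorted(c["members"]), mean_vector(c["rows"])) for c in clusters]


# Hypothetical cube slice: one row of measure values per city, cities roll up to states.
facts = {
    "Miami": [10, 12],
    "Orlando": [11, 13],
    "Tampa": [30, 31],
    "Boston": [10, 12],
}
parent = {"Miami": "FL", "Orlando": "FL", "Tampa": "FL", "Boston": "MA"}

for members, representative in greedy_shrink(facts, parent, max_size=3):
    print(members, representative)
```

With a budget of three groups, the sketch fuses Miami and Orlando, the cheapest admissible pair. Boston is left alone even though its values equal Miami's, because the two cities have different parents. A multi-dimensional variant would have to pick candidate fusions across several dimensions at once, which this sketch does not attempt.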