Numerical dependencies (NDs) are database constraints that limit the number of distinct Y -values that can appear together with any X-value, where both X and Y are sets of attributes in a relation schema. While it is known that NDs are not finitely axiomatizable, there is no study on how to efficiently derive NDs using a set of sound (yet necessarily incomplete) rules. In this paper, after proving that the entailment problem for NDs has exponential space complexity, we prove that, given a set of inference rules similar to those used for functional dependencies, the membership problem for NDs is NP-complete. We then provide a graph-based characterization of NDs, which is exploited to design an efficient branch & bound algorithm for ND derivation. Our algorithm adopts several optimization strategies that provide considerable speed-up over a naıve approach, as confirmed by the results of extensive tests we made for efficiency and effectiveness using six different datasets.
Efficient Derivation of Numerical Dependencies / P. Ciaccia; M. Golfarelli; S. Rizzi. - In: INFORMATION SYSTEMS. - ISSN 0306-4379. - STAMPA. - 38:3(2013), pp. 410-429. [10.1016/j.is.2012.07.007]
Efficient Derivation of Numerical Dependencies
CIACCIA, PAOLO;GOLFARELLI, MATTEO;RIZZI, STEFANO
2013
Abstract
Numerical dependencies (NDs) are database constraints that limit the number of distinct Y -values that can appear together with any X-value, where both X and Y are sets of attributes in a relation schema. While it is known that NDs are not finitely axiomatizable, there is no study on how to efficiently derive NDs using a set of sound (yet necessarily incomplete) rules. In this paper, after proving that the entailment problem for NDs has exponential space complexity, we prove that, given a set of inference rules similar to those used for functional dependencies, the membership problem for NDs is NP-complete. We then provide a graph-based characterization of NDs, which is exploited to design an efficient branch & bound algorithm for ND derivation. Our algorithm adopts several optimization strategies that provide considerable speed-up over a naıve approach, as confirmed by the results of extensive tests we made for efficiency and effectiveness using six different datasets.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.