Data mining techniques are commonly used to extract patterns, such as association rules and decision trees, from huge volumes of data. Comparing patterns is a fundamental problem: it can be used, among other things, to concisely measure dissimilarities between evolving or different datasets and to compare the outputs produced by different data mining algorithms on the same dataset. In this paper, we present the PANDA framework for computing the dissimilarity of both simple patterns, defined over raw data, and complex patterns, defined over other patterns. In PANDA, the problem of comparing complex patterns is decomposed into simpler sub-problems on their component (simple or complex) patterns, and the partial solutions so obtained are then aggregated into an overall dissimilarity score. This intrinsically recursive approach gives PANDA high flexibility and allows it to handle patterns with highly complex structures. PANDA is built upon a few basic concepts so as to be generic and clear to the end user. We demonstrate the generality and flexibility of PANDA by showing how it can be applied to a variety of pattern types, including sets of itemsets and clusterings.
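
The decompose-and-aggregate idea described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the function names, the use of Jaccard distance for simple patterns (itemsets), the greedy one-to-one matching of components, the penalty of 1.0 for unmatched components, and the averaging used as the aggregation step.

```python
from functools import partial
from typing import Callable, FrozenSet, List, Sequence, TypeVar

T = TypeVar("T")
Itemset = FrozenSet[str]


def itemset_dissimilarity(a: Itemset, b: Itemset) -> float:
    """Dissimilarity of two simple patterns (itemsets).

    Assumption: Jaccard distance; the source does not prescribe a measure.
    """
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def complex_dissimilarity(
    p: Sequence[T],
    q: Sequence[T],
    component_dissimilarity: Callable[[T, T], float],
) -> float:
    """Dissimilarity of two complex patterns given as sequences of components.

    Sub-problems: score every pair of components.
    Aggregation (assumed): greedy one-to-one matching by increasing score,
    a penalty of 1.0 per unmatched component, then the mean of all scores.
    """
    if not p and not q:
        return 0.0
    # Solve the sub-problems on the component patterns.
    pairs = sorted(
        (component_dissimilarity(x, y), i, j)
        for i, x in enumerate(p)
        for j, y in enumerate(q)
    )
    used_p, used_q = set(), set()
    scores: List[float] = []
    for d, i, j in pairs:
        if i not in used_p and j not in used_q:
            used_p.add(i)
            used_q.add(j)
            scores.append(d)
    # Components left without a partner contribute the maximum score.
    unmatched = (len(p) - len(used_p)) + (len(q) - len(used_q))
    scores.extend([1.0] * unmatched)
    return sum(scores) / len(scores)


# Two sets of itemsets (complex patterns built on simple ones).
A = [frozenset({"a", "b"}), frozenset({"c"})]
B = [frozenset({"a", "b"}), frozenset({"c", "d"})]
print(complex_dissimilarity(A, B, itemset_dissimilarity))  # 0.25

# Recursion: the same function compares collections of sets of itemsets
# by using itself, one level down, as the component measure.
set_level = partial(complex_dissimilarity,
                    component_dissimilarity=itemset_dissimilarity)
print(complex_dissimilarity([A, B], [A], set_level))  # 0.5
```

Because the component measure is passed in as a parameter, the same function can be nested to any depth, which is one simple way to express the recursive structure the abstract refers to. An optimal assignment could replace the greedy matching; that choice is likewise outside what the source specifies.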

The PANDA framework for comparing patterns / Ilaria Bartolini; Paolo Ciaccia; Irene Ntoutsi; Marco Patella; Yannis Theodoridis. - In: DATA & KNOWLEDGE ENGINEERING. - ISSN 0169-023X. - PRINT. - 68(2):(2009), pp. 244-260. [10.1016/j.datak.2008.10.004]

The PANDA framework for comparing patterns

BARTOLINI, ILARIA; CIACCIA, PAOLO; PATELLA, MARCO
2009


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/72417

Citations
  • PMC: not available
  • Scopus: 10
  • ISI: 8