AutoClues: Exploring Clustering Pipelines via AutoML and Diversification / Francia M.; Giovanelli J.; Golfarelli M. - Electronic. - 14645 (2024), pp. 246-258. (Paper presented at the 28th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2024, held in Taiwan in 2024) [10.1007/978-981-97-2242-6_20].

AutoClues: Exploring Clustering Pipelines via AutoML and Diversification

Francia M.; Giovanelli J.; Golfarelli M.
2024

Abstract

AutoML has been applied effectively in supervised learning, mainly in classification tasks, where the goal is to find the best machine-learning pipeline when a ground truth is available. This is not the case for unsupervised tasks, which are exploratory by nature and are performed to unveil hidden insights. Since there is no single right result, analyzing different configurations is more important than returning the best-performing one. In exploratory unsupervised tasks such as cluster analysis, different facets of the dataset may interest the data scientist; for instance, data items can be effectively grouped in different subspaces of features. This paper presents AutoClues, which explores and returns a dashboard of both relevant and diverse clusterings via AutoML and diversification. AutoML ensures that the explored cluster-analysis pipelines (including pre-processing steps) compute good clusterings. Diversification then selects, out of the explored clusterings, those conveying different clues to the data scientist.
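The diversification idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the greedy selection rule, the `trade_off` weight, the use of the Rand index as a similarity measure, and the candidate names and scores are all assumptions made for illustration. Each candidate is a clustering that an AutoML search is assumed to have already produced and scored.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of item pairs on which two label lists agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def diversify(candidates, k, trade_off=0.5):
    """Greedily pick k clusterings, balancing relevance and diversity.

    candidates: list of (name, labels, score) tuples, where score is a
    hypothetical relevance value in [0, 1] (e.g. a normalized internal
    validity index computed during the AutoML exploration).
    """
    remaining = list(candidates)
    # Start from the most relevant clustering.
    selected = [max(remaining, key=lambda c: c[2])]
    remaining.remove(selected[0])
    while remaining and len(selected) < k:
        def gain(c):
            # Novelty: dissimilarity to the closest clustering already selected.
            novelty = min(1 - rand_index(c[1], s[1]) for s in selected)
            return trade_off * c[2] + (1 - trade_off) * novelty
        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return [c[0] for c in selected]


# Hypothetical candidates over six data items.
candidates = [
    ("kmeans_k2", [0, 0, 0, 1, 1, 1], 0.90),
    ("kmeans_k2_scaled", [0, 0, 0, 1, 1, 1], 0.85),  # same partition as above
    ("subspace_k2", [0, 1, 0, 1, 0, 1], 0.70),
    ("dbscan", [0, 0, 1, 1, 2, 2], 0.60),
]
dashboard = diversify(candidates, k=2)
print(dashboard)
```

In this toy setting the second-best clustering by score is skipped because it duplicates the partition already selected, and a lower-scoring clustering that groups the items differently is returned instead.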
Year: 2024
Series: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 246-258
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/969752