The S-PIC4CHU project addresses the crucial issue of data preparation for Data Science and Machine Learning, and aims to offer new models and techniques for fighting inaccuracy, noise, uncertainty, bias, and incompleteness of data. While the project embraces a semantics-based approach at its core, the proposed data preparation pipeline includes data cleaning (also from an ethical viewpoint), transformation, and reduction, as well as deduplication, error detection, missing value imputation, and space transformations for multimedia data. This paper illustrates the advancements on all these fronts achieved during the first months of work on the project, and sets out the forthcoming actionable objectives.

Alfano, G., Bartolini, I., Calvanese, D., Ciaccia, P., Greco, S., Lanti, D., et al. (2025). S-PIC4CHU: Semantics-Enriched Techniques for Data Preparation in Data Science. CEUR-WS.

S-PIC4CHU: Semantics-Enriched Techniques for Data Preparation in Data Science

Ilaria Bartolini; Paolo Ciaccia; Marco Patella
2025

Proceedings of the 4th Italian Conference on Big Data and Data Science (ITADATA), 2025, pp. 1-9
Alfano, Gianvincenzo; Bartolini, Ilaria; Calvanese, Diego; Ciaccia, Paolo; Greco, Sergio; Lanti, Davide; Leonardo Lazzaro, Pasquale; Lenzi, Emilia; Ma...
Files in this record:
itaData2025_S_PIC4CHU-3.pdf (open access)
Type: Publisher's version (PDF) / Version Of Record
License: Open Access license, Creative Commons Attribution (CC BY)
Size: 609.38 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1048448