Towards a Process-Driven Design of Data Platforms / Francia M.; Golfarelli M.; Pasini M. - ELECTRONIC. - 3653:(2024), pp. 28-35. (Paper presented at the 26th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data, DOLAP 2024, held in Italy in 2024).

Towards a Process-Driven Design of Data Platforms

Francia M.; Golfarelli M.; Pasini M.
2024

Abstract

Data platforms are state-of-the-art solutions for implementing data-driven applications and analytics, since they facilitate the ingestion, storage, management, and exploitation of big data. They are built on top of complex ecosystems of services that address different data needs and requirements; such ecosystems are offered by different providers (e.g., Amazon AWS and Apache). However, no unifying strategy or methodology for engineering data platforms exists yet, and the design is mainly left to the expertise of practitioners in the field. In particular, service providers simply expose a long list of interoperable and alternative engines, which makes it hard to select the optimal subset without deep knowledge of the ecosystem. A more effective approach to design starts from knowledge of the data transformation and exploitation processes that the platform should support. In this paper, we sketch a computer-aided design methodology and then focus on selecting the optimal services needed to implement such processes. We believe that our approach eases the design of data platforms and enables an unbiased selection and comparison of solutions, even across different service ecosystems.
CEUR Workshop Proceedings, 2024, pp. 28-35

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/967518