Francia, M., Golfarelli, M., Pasini, M. (2025). Process-driven design of cloud data platforms. INFORMATION SYSTEMS, 131, 1-12 [10.1016/j.is.2025.102527].
Process-driven design of cloud data platforms
Francia M.; Golfarelli M.; Pasini M.
2025
Abstract
Data platforms are state-of-the-art solutions for implementing data-driven applications and analytics. They facilitate the ingestion, storage, management, and exploitation of big data. Data platforms are built on top of complex ecosystems of services answering different data needs and requirements; such ecosystems are offered by different providers (e.g., Amazon AWS and Microsoft Azure). However, when it comes to engineering data platforms, no unifying strategy or methodology is available yet, and the design is mainly left to the expertise of practitioners in the field. Service providers simply expose a long list of interoperable and alternative engines, making it hard to select the optimal subset without a deep knowledge of the ecosystem. A more effective design approach starts with knowledge of the data transformation and exploitation processes that the platform should support. In this paper, we sketch a computer-aided design methodology and then focus on the selection of the optimal services needed to implement such processes. We show that our approach lightens the design of data platforms and enables an unbiased selection and comparison of solutions even across different service ecosystems.

File | Access | Type | License | Size | Format |
---|---|---|---|---|---|
1-s2.0-S0306437925000122-main.pdf | Open access | Publisher's version (PDF) / Version of Record | Open Access License. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 1.24 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.