The INFN-CNAF data centre hosts the Italian Tier 1 site for the Worldwide LHC Computing Grid (WLCG), while also serving several other research and technology transfer programmes. The challenges posed by the upcoming runs of the LHC, together with the opportunity of moving the data centre itself to a bigger site, require a thorough redesign of its monitoring system. The large and heterogeneous volume of log data and metrics produced daily is fundamental for monitoring activities and, once harmonised, can also be used to build Predictive Maintenance models based on Big Data techniques. In this work we describe the Big Data Platform, a new monitoring infrastructure under development at CNAF. The Big Data Platform relies on a modular, highly scalable architecture based on open source technologies and is able to exploit modern frameworks such as containerisation and cloud support. It is capable of collecting data from heterogeneous data sources, cleaning and harmonising them, and storing them as JSON files on different storage solutions, based on the needs of the end user. Data can then be visualised using Kibana, or analysed through a platform based on Jupyter Notebooks.
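The collect, clean, harmonise and store-as-JSON flow described above can be illustrated with a minimal sketch. It is not the platform's actual implementation: the two input formats (a syslog-like line and a `key=value` metric line), the output field names, and the functions `parse_syslog`, `parse_metric` and `harmonise` are all assumptions made for illustration.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical syslog-like format: "<ISO timestamp> <host> <service>: <message>"
SYSLOG_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<host>\S+)\s+(?P<service>[^:\s]+):\s*(?P<message>.*)$"
)


def parse_syslog(line):
    """Map a syslog-like line onto the common record layout, or return None."""
    m = SYSLOG_RE.match(line.strip())
    if m is None:
        return None
    return {
        "timestamp": m["ts"],
        "host": m["host"],
        "source": "syslog",
        "service": m["service"],
        "message": m["message"],
    }


def parse_metric(line):
    """Map a hypothetical 'time=<epoch> node=<host> <name>=<value>' line, or return None."""
    fields = dict(pair.split("=", 1) for pair in line.split() if "=" in pair)
    if "time" not in fields or "node" not in fields:
        return None
    try:
        ts = datetime.fromtimestamp(int(fields.pop("time")), tz=timezone.utc)
        host = fields.pop("node")
        metrics = {name: float(value) for name, value in fields.items()}
    except ValueError:
        # Cleaning step: discard lines whose values are not numeric.
        return None
    return {
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "host": host,
        "source": "metric",
        "metrics": metrics,
    }


def harmonise(lines):
    """Clean raw lines from mixed sources and return records with shared keys."""
    records = []
    for line in lines:
        record = parse_syslog(line) or parse_metric(line)
        if record is not None:  # unparseable lines are dropped
            records.append(record)
    return records


if __name__ == "__main__":
    raw = [
        "2021-03-22T10:15:00Z host01 sshd: Accepted publickey for user",
        "time=1616408100 node=host02 cpu_load=0.75",
        "### corrupted line ###",
    ]
    # One JSON document per line, a layout commonly used for bulk ingestion.
    for record in harmonise(raw):
        print(json.dumps(record))
```

Every record shares the `timestamp`, `host` and `source` keys regardless of its origin, which is the property a downstream dashboard or notebook analysis would rely on.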

Rossi Tisbeni, S., Cesini, D., Martelli, B., Carbone, A., Cavallaro, C., Duma, D.C., et al. (2021). A Big Data Platform for heterogeneous data collection and analysis in large-scale data centres. DOI: 10.22323/1.378.0008.

A Big Data Platform for heterogeneous data collection and analysis in large-scale data centres

Rossi Tisbeni, Simone; Cesini, Daniele; Martelli, Barbara; Gasparetto, Jacopo; Minarini, Francesco; Ronchieri, Elisabetta
2021

International Symposium on Grids & Clouds 2021
008
022
Rossi Tisbeni, Simone; Cesini, Daniele; Martelli, Barbara; Carbone, Arianna; Cavallaro, Claudia; Duma, Doina Cristina; Falabella, Antonio; Galletti, M...
File in this record: ISGC2021_008.pdf (open access; publisher's PDF version; Adobe PDF, 845.64 kB)
Licence: Open Access licence. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/849466