As a joint effort from various communities involved in the Worldwide LHC Computing Grid, the Operational Intelligence project aims at increasing the level of automation in computing operations and reducing human interventions. The distributed computing systems currently deployed by the LHC experiments have proven to be mature and capable of meeting the experimental goals, by allowing timely delivery of scientific results. However, a substantial number of interventions from software developers, shifters, and operational teams is needed to efficiently manage such heterogenous infrastructures. Under the scope of the Operational Intelligence project, experts from several areas have gathered to propose and work on “smart” solutions. Machine learning, data mining, log analysis, and anomaly detection are only some of the tools we have evaluated for our use cases. In this community study contribution, we report on the development of a suite of operational intelligence services to cover various use cases: workload management, data management, and site operations.

Preparing Distributed Computing Operations for the HL-LHC Era With Operational Intelligence

Bonacorsi D.;Clissa L.;Decker de Sousa L.;Diotalevi T.;Giommi L.;Rinaldi L.;
2022

Abstract

As a joint effort from various communities involved in the Worldwide LHC Computing Grid, the Operational Intelligence project aims at increasing the level of automation in computing operations and reducing human interventions. The distributed computing systems currently deployed by the LHC experiments have proven to be mature and capable of meeting the experimental goals, by allowing timely delivery of scientific results. However, a substantial number of interventions from software developers, shifters, and operational teams is needed to efficiently manage such heterogenous infrastructures. Under the scope of the Operational Intelligence project, experts from several areas have gathered to propose and work on “smart” solutions. Machine learning, data mining, log analysis, and anomaly detection are only some of the tools we have evaluated for our use cases. In this community study contribution, we report on the development of a suite of operational intelligence services to cover various use cases: workload management, data management, and site operations.
Di Girolamo A.; Legger F.; Paparrigopoulos P.; Schovancova J.; Beermann T.; Boehler M.; Bonacorsi D.; Clissa L.; Decker de Sousa L.; Diotalevi T.; Giommi L.; Grigorieva M.; Giordano D.; Hohn D.; Javurek T.; Jezequel S.; Kuznetsov V.; Lassnig M.; Mageirakos V.; Olocco M.; Padolski S.; Paltenghi M.; Rinaldi L.; Sharma M.; Tisbeni S.R.; Tuckus N.
File in questo prodotto:
File Dimensione Formato  
Girolamo_et_al-2022-Frontiers_in_Big_Data.pdf

accesso aperto

Tipo: Versione (PDF) editoriale
Licenza: Licenza per Accesso Aperto. Creative Commons Attribuzione (CCBY)
Dimensione 1.39 MB
Formato Adobe PDF
1.39 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/850109
Citazioni
  • ???jsp.display-item.citation.pmc??? 0
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact