Tens of Petabytes of collision and simulated data have been collected and distributed across WLCG sites in Run-1 and Run-2 at LHC. A low latency in transfers among dozens of computing centres is crucial to make an efficient use of the computing resources. Despite on average the desired level of throughput has been successfully achieved to serve the LHC physics programs, it is not uncommon to observe transfer latencies caused by a large variety of causes, from file corruptions to site issues, most of which require operator intervention. To improve on this front, in particular, the CMS experiment equipped the PhEDEx dataset replication system with a system to collect the latency data, and a mechanism to categorise and analyse them promptly, matching them to quick and focussed operators intervention. The transfer latencies data has also been the target of Machine Learning techniques - already used in CMS to study and predict the dataset popularity - and preliminary results on the work in progress in terms of predictability potential of this approach for both applications will be presented and discussed.

Progress in Machine Learning Studies for the CMS Computing Infrastructure

Bonacorsi, Daniele;Diotalevi, Tommaso;
2017

Abstract

Tens of Petabytes of collision and simulated data have been collected and distributed across WLCG sites in Run-1 and Run-2 at LHC. A low latency in transfers among dozens of computing centres is crucial to make an efficient use of the computing resources. Despite on average the desired level of throughput has been successfully achieved to serve the LHC physics programs, it is not uncommon to observe transfer latencies caused by a large variety of causes, from file corruptions to site issues, most of which require operator intervention. To improve on this front, in particular, the CMS experiment equipped the PhEDEx dataset replication system with a system to collect the latency data, and a mechanism to categorise and analyse them promptly, matching them to quick and focussed operators intervention. The transfer latencies data has also been the target of Machine Learning techniques - already used in CMS to study and predict the dataset popularity - and preliminary results on the work in progress in terms of predictability potential of this approach for both applications will be presented and discussed.
2017
Progress in Machine Learning Studies for the CMS Computing Infrastructure
023
034
Bonacorsi, Daniele; Kuznetsov, Valentin; Magini, Nicolo; Diotalevi, Tommaso; Repečka, Aurimas; Matonis, Žygimantas; Kančis, Kipras
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/724426
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact