Progress in Machine Learning Studies for the CMS Computing Infrastructure

Bonacorsi, Daniele; Kuznetsov, Valentin; Magini, Nicolo; Diotalevi, Tommaso; Repečka, Aurimas; Matonis, Žygimantas; Kančis, Kipras

doi:10.22323/1.293.0023

Tens of petabytes of collision and simulated data have been collected and distributed across WLCG sites in Run-1 and Run-2 at the LHC. Low latency in transfers among dozens of computing centres is crucial for efficient use of the computing resources. Although the desired level of throughput has, on average, been achieved to serve the LHC physics programs, it is not uncommon to observe transfer latencies with a wide variety of causes, from file corruption to site issues, most of which require operator intervention. To improve on this front, the CMS experiment equipped the PhEDEx dataset replication system with a system to collect latency data and a mechanism to categorise and analyse them promptly, so that they can be matched to quick and focussed operator intervention. The transfer latency data have also been the target of Machine Learning techniques, which are already used in CMS to study and predict dataset popularity. Preliminary results from this work in progress, on the predictive potential of the approach for both applications, are presented and discussed.

Bonacorsi, D., Kuznetsov, V., Magini, N., Diotalevi, T., Repečka, A., Matonis, Ž., et al. (2017). Progress in Machine Learning Studies for the CMS Computing Infrastructure [10.22323/1.293.0023].

Year: 2017
Volume title: Progress in Machine Learning Studies for the CMS Computing Infrastructure
First page: 023
Last page: 034
Journal: POS PROCEEDINGS OF SCIENCE
DOI: https://dx.doi.org/10.22323/1.293.0023
Citation: Bonacorsi, D., Kuznetsov, V., Magini, N., Diotalevi, T., Repečka, A., Matonis, Ž., et al. (2017). Progress in Machine Learning Studies for the CMS Computing Infrastructure [10.22323/1.293.0023].
All authors: Bonacorsi, Daniele; Kuznetsov, Valentin; Magini, Nicolo; Diotalevi, Tommaso; Repečka, Aurimas; Matonis, Žygimantas; Kančis, Kipras
Publication type: 4.01 Contribution in conference proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/724426
