C6EnPLS: A High-Performance Computing Job Dataset for the Analysis of Linear Solvers’ Power Consumption

Artioli, Marcello;Borghesi, Andrea;Ciampolini, Anna;Loreti, Daniela
2025

Abstract

In recent decades, driven by global efforts towards sustainability, the priorities of HPC facilities have expanded to include maximising energy efficiency alongside computing performance. In this regard, a crucial open question is how to accurately predict the contribution of each parallel job to the system’s energy consumption. Accurate estimates of this kind could offer initial insight into the overall power requirements of the system and provide meaningful information for tasks such as power-aware scheduling, load balancing, and infrastructure design. While ML-based approaches that rely on large training datasets of past executions may suffer from the high variability of HPC workloads, more specific knowledge of the nature of the jobs can improve prediction accuracy. In this work, we restrict our attention to the pervasive task of solving linear systems. We propose a methodology to build a large dataset of runs (including measurements from physical sensors deployed on a large HPC cluster), and we report a statistical analysis and a preliminary evaluation of how effective the resulting dataset is when used to train well-established ML methods that aim to predict the energy footprint of specific software.
Artioli, M., Borghesi, A., Chinnici, M., Ciampolini, A., Colonna, M., De Chiara, D., et al. (2025). C6EnPLS: A High-Performance Computing Job Dataset for the Analysis of Linear Solvers’ Power Consumption. FUTURE INTERNET, 17(5), 1-18 [10.3390/fi17050203].
Artioli, Marcello; Borghesi, Andrea; Chinnici, Marta; Ciampolini, Anna; Colonna, Michele; De Chiara, Davide; Loreti, Daniela
Files in this item:
File: futureinternet-17-00203.pdf
Access: open access
Type: Publisher's version (PDF) / Version of Record
License: Open Access license. Creative Commons Attribution (CC BY)
Size: 934.88 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1017850