Thermal-aware design and online optimization of the cooling effort are becoming increasingly important in current and future high-performance computing (HPC) systems. A fundamental requirement for developing such techniques effectively is the availability of distributed and compact models of the system's thermal behavior. System identification algorithms make it possible to extract such models directly from the thermal response of the target device. This article proposes a novel thermal identification approach for real, in-production HPC systems. The approach can extract thermal models from a computing node whose temperature measurements are affected by quantization noise and which operates in free-cooling mode with variable ambient temperature. It also makes it possible to identify the physical floorplan of the CPU dies in supercomputing nodes. The effectiveness of the proposed methodology has been tested on a node of the CINECA Galileo Tier-1 supercomputer system.

Thermal Model Identification of Computing Nodes in High-Performance Computing Systems / Diversi R.; Bartolini A.; Benini L. - In: IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS. - ISSN 0278-0046. - Electronic. - 67:9(2020), pp. 7778-7788, art. no. 8863115. [10.1109/TIE.2019.2945277]

Thermal Model Identification of Computing Nodes in High-Performance Computing Systems

Diversi R. (first author); Bartolini A. (second author); Benini L. (last author)
2020

Files in this item:
File: TIE2020.pdf
Access: Open Access from 02/05/2020
Type: Postprint
License: Free open-access license
Size: 2.8 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/788553