Datacenters are at the heart of the AI, Industry 4.0 and cloud revolution. A datacenter contains a large number of computing nodes hosted in a large temperature-controlled room. Due to the increasing total power and power density of computing nodes, the overall datacenter compute capacity is often capped by peak power consumption and temperature bottlenecks. To preserve the assumption of homogeneous performance across all nodes, complex cooling solutions are required, but they might not be sufficient. In this work, we analysed and characterised the thermal properties of a Tier0 datacenter deploying advanced hybrid cooling technologies: specifically, we studied the spatial and temporal heterogeneity during production and cooling emergency hazards. This paper gives the first quantitative evidence of thermal bottlenecks under real-life production workloads, showing the presence of significant spatial thermal heterogeneity which could be exploited by thermal-aware job scheduling and datacenter-room run-time workload adaptation and distribution.
Seyedkazemi Ardebili M., Cavazzoni C., Benini L., Bartolini A. (2021). Thermal Characterization of a Tier0 Datacenter Room in Normal and Thermal Emergency Conditions. Springer Science and Business Media Deutschland GmbH [10.1007/978-3-030-67077-1_1].
Thermal Characterization of a Tier0 Datacenter Room in Normal and Thermal Emergency Conditions
Seyedkazemi Ardebili M.; Benini L.; Bartolini A.
2021