We present Thestral, a 10-core RISC-V chip for energy-proportional parallel computing manufactured in 22 nm FD-SOI technology. Thestral contains a control core and a nine-core compute cluster. Each core features a single-precision floating-point unit (FPU) and an integer processing unit (IPU) and implements custom instruction set architecture (ISA) extensions to improve utilization. The chip features 20 fine-grain power domains: one for each FPU and IPU, as well as one for the entire acceleration cluster. Such aggressive power management granularity is valuable both for extreme-edge computing, where power gating reduces sleep power, and for high-performance computing, where leakage control is required to meet thermal design power constraints and to minimize idle power. We propose a fast and fine-grain power gating architecture with much finer granularity than the state of the art for multi-core computing platforms. A sub-10 ns power-up sequence allows for fine-tuning the compute cluster configuration, powering up only the computational units required for a specific application phase. Our solution enables up to 42% measured power savings for the extreme-edge scenario during sleep mode (@350 MHz, 0.6 V, 25 °C), which is 12.7% more than what can be achieved with aggressive clock-gating. On the other extreme, in an HPC setting, a Thestral-based many-core system running memory-bound applications (@850 MHz, 0.9 V, 75 °C) can save up to 41% power.
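To put the sub-10 ns power-up figure in perspective, the short sketch below converts that bound into clock cycles at the two operating frequencies quoted in the abstract (350 MHz and 850 MHz). The function and variable names are illustrative assumptions, not taken from the paper, and the 10 ns value is only the stated upper bound, not a measured latency.

```python
# Illustrative arithmetic only: names are hypothetical, values come from the abstract.

def wakeup_cycles(latency_ns: float, freq_mhz: float) -> float:
    """Return how many clock cycles elapse during latency_ns at freq_mhz."""
    # cycles = time [s] * frequency [Hz] = (ns * 1e-9) * (MHz * 1e6) = ns * MHz / 1000
    return latency_ns * freq_mhz / 1000.0

POWER_UP_BOUND_NS = 10.0  # "sub-10 ns" power-up sequence (upper bound)
OPERATING_POINTS_MHZ = {"extreme-edge": 350.0, "HPC": 850.0}

cycle_bounds = {
    name: wakeup_cycles(POWER_UP_BOUND_NS, freq)
    for name, freq in OPERATING_POINTS_MHZ.items()
}

for name, cycles in cycle_bounds.items():
    print(f"{name}: power-up completes in fewer than {cycles:.1f} clock cycles")
```

Under these assumptions, a unit wakes in only a few cycles at either operating point, which is consistent with the abstract's claim that units can be powered up per application phase.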
Benz T., Bertaccini L., Zaruba F., Schuiki F., Gurkaynak F.K., Benini L. (2021). A 10-core SoC with 20 Fine-Grain Power Domains for Energy-Proportional Data-Parallel Processing over a Wide Voltage and Temperature Range. Institute of Electrical and Electronics Engineers Inc. [10.1109/ESSCIRC53450.2021.9567755].
A 10-core SoC with 20 Fine-Grain Power Domains for Energy-Proportional Data-Parallel Processing over a Wide Voltage and Temperature Range
Benini L.
2021