CRIS Current Research Information System

Computationally intensive algorithms such as Deep Neural Networks (DNNs) are becoming killer applications for edge devices. Porting heavily data-parallel algorithms on resource-constrained and battery-powered devices while retaining the flexibility granted by instruction processor-based architectures poses several challenges related to memory footprint, computational throughput, and energy efficiency. Low-bitwidth and mixed-precision arithmetic have been proven to be valid strategies for tackling these problems. We present Dustin, a fully programmable compute cluster integrating 16 RISC-V cores capable of 2-to 32-bit arithmetic and all possible mixed-precision combinations. In addition to a conventional Multiple-Instruction Multiple-Data (MIMD) processing paradigm, Dustin introduces a Vector Lockstep Execution Mode (VLEM) to minimize power consumption in highly data-parallel kernels. In VLEM, a single leader core fetches instructions and broadcasts them to the 15 follower cores. Clock gating Instruction Fetch (IF) stages and private caches of the follower cores leads to 38% power reduction. The cluster, implemented in 65 nm CMOS technology, achieves a peak performance of 58 GOPS and a peak efficiency of 1.15 TOPS/W.

Gianmarco Ottavi, Angelo Garofalo, Giuseppe Tagliavini, Francesco Conti, Alfio Di Mauro, Luca Benini, et al. (2023). Dustin: A 16-Cores Parallel Ultra-Low-Power Cluster With 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS. I, REGULAR PAPERS, 70(6), 2450-2463 [10.1109/tcsi.2023.3254810].

Dustin: A 16-Cores Parallel Ultra-Low-Power Cluster With 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode

Gianmarco Ottavi;Angelo Garofalo;Giuseppe Tagliavini;Francesco Conti;Alfio Di Mauro;Luca Benini;Davide Rossi

2023

Abstract

Computationally intensive algorithms such as Deep Neural Networks (DNNs) are becoming killer applications for edge devices. Porting heavily data-parallel algorithms on resource-constrained and battery-powered devices while retaining the flexibility granted by instruction processor-based architectures poses several challenges related to memory footprint, computational throughput, and energy efficiency. Low-bitwidth and mixed-precision arithmetic have been proven to be valid strategies for tackling these problems. We present Dustin, a fully programmable compute cluster integrating 16 RISC-V cores capable of 2-to 32-bit arithmetic and all possible mixed-precision combinations. In addition to a conventional Multiple-Instruction Multiple-Data (MIMD) processing paradigm, Dustin introduces a Vector Lockstep Execution Mode (VLEM) to minimize power consumption in highly data-parallel kernels. In VLEM, a single leader core fetches instructions and broadcasts them to the 15 follower cores. Clock gating Instruction Fetch (IF) stages and private caches of the follower cores leads to 38% power reduction. The cluster, implemented in 65 nm CMOS technology, achieves a peak performance of 58 GOPS and a peak efficiency of 1.15 TOPS/W.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2023
			
	Rivista
	
				IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS. I, REGULAR PAPERS
			
	Codice DOI
	
				https://dx.doi.org/10.1109/tcsi.2023.3254810
			
	Citazione
	
				Gianmarco Ottavi,  Angelo Garofalo,  Giuseppe Tagliavini,  Francesco Conti,  Alfio Di Mauro,  Luca Benini, et al. (2023). Dustin: A 16-Cores Parallel Ultra-Low-Power Cluster With 2b-to-32b Fully Flexible Bit-Precision and Vector Lockstep Execution Mode. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS. I, REGULAR PAPERS, 70(6), 2450-2463 [10.1109/tcsi.2023.3254810].
			
	Tutti gli autori
	
						Gianmarco Ottavi; Angelo Garofalo; Giuseppe Tagliavini; Francesco Conti; Alfio Di Mauro; Luca Benini; Davide Rossi
					
	Appare nelle tipologie:
	
				1.01 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
2201.08656.pdf accesso aperto Tipo: Postprint / Author's Accepted Manuscript (AAM) - versione accettata per la pubblicazione dopo la peer-review Licenza: Licenza per accesso libero gratuito Dimensione 894.54 kB Formato Adobe PDF Visualizza/Apri	894.54 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/922406

Citazioni

ND

26

20

ND

social impact