CRIS Current Research Information System

A Multi-Precision Bit-Serial Hardware Accelerator IP for Deep Learning Enabled Internet-of-Things

Capra M.; Conti F.; Martina M.

2021

Abstract

Deep Neural Networks (DNNs) are computation-hungry algorithms that demand hardware platforms capable of meeting strict power and timing requirements. We introduce the Serial-MAC-engine (SMAC-engine), a fully digital hardware accelerator for inference of quantized DNNs, suitable for integration in a heterogeneous System-on-Chip (SoC). The accelerator is fully integrated as a Hardware Processing Engine (HWPE) in the PULPissimo platform, a RISC-V-based programmable architecture that targets the computational requirements of IoT applications. The SMAC-engine supports configurable precision for both weights (8/6/4 bits) and activations (8/4 bits), with scalable performance. Results in 65 nm technology demonstrate that the serial-MAC approach enables the accelerator to achieve a maximum throughput of 14.28 GMAC/s, consuming 0.58 pJ/MAC@[…] V when operating at a precision of 4 bits for weights and 8 bits for activations.
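
The abstract does not describe the internals of the SMAC-engine, so the following is only a minimal sketch of the generic bit-serial multiply-accumulate idea, not the authors' design. It assumes signed two's-complement weights consumed one bit per step and signed integer activations; the names `to_twos_complement` and `bit_serial_mac` are hypothetical. Under those assumptions, it illustrates why the number of serial steps, and hence throughput, could scale with the configured weight precision.

```python
def to_twos_complement(value: int, bits: int) -> int:
    """Encode a signed integer as an unsigned two's-complement word of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    return value & ((1 << bits) - 1)


def bit_serial_mac(weights, activations, weight_bits=4):
    """Dot product computed by consuming one weight bit per step.

    Illustrative assumption: one serial step per weight bit.
    Returns (accumulated result, number of serial steps).
    """
    acc = 0
    steps = 0
    for w, a in zip(weights, activations):
        w_enc = to_twos_complement(w, weight_bits)
        for i in range(weight_bits):
            bit = (w_enc >> i) & 1
            # Shifted copy of the activation, gated by the current weight bit.
            partial = (a << i) if bit else 0
            # In two's complement the most significant bit has negative weight.
            acc += -partial if i == weight_bits - 1 else partial
            steps += 1
    return acc, steps


if __name__ == "__main__":
    weights = [3, -2, 7, -8]        # all representable in 4 signed bits
    activations = [10, -5, 1, 127]  # all representable in 8 signed bits
    for bits in (4, 6, 8):
        result, steps = bit_serial_mac(weights, activations, weight_bits=bits)
        print(f"weight_bits={bits}: result={result}, serial steps={steps}")
```

In this sketch the result is identical at every precision setting, while the step count grows linearly with the weight width, which is the kind of precision-versus-throughput trade-off a bit-serial datapath exposes.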


Year: 2021
Volume title: Midwest Symposium on Circuits and Systems
First page: 192
Last page: 197
Series: THE ... MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS CONFERENCE PROCEEDINGS
DOI: https://dx.doi.org/10.1109/MWSCAS47672.2021.9531722
Citation: Capra M., Conti F., Martina M. (2021). A Multi-Precision Bit-Serial Hardware Accelerator IP for Deep Learning Enabled Internet-of-Things. Institute of Electrical and Electronics Engineers Inc. [10.1109/MWSCAS47672.2021.9531722].
All authors: Capra M.; Conti F.; Martina M.
Appears in types: 4.01 Conference proceedings contribution

Files in this item:

Any attachments are not displayed.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/847025

Warning! The data displayed have not been validated by the university.
