
Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine / Andri R.; Cavigelli L.; Rossi D.; Benini L. - In: IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS. - ISSN 2156-3357. - PRINT. - 9:2 (2019), art. no. 8668446, pp. 309-322. [10.1109/JETCAS.2019.2905654]

Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine

Andri R.; Cavigelli L.; Rossi D.; Benini L.
2019

Abstract

Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute- and memory-intensive, which makes them unsuitable for mW-devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces the computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding I/O bandwidth and system-level efficiency, which are crucial for the deployment of accelerators in ultra-low-power devices. We present Hyperdrive: a BWN accelerator that dramatically reduces I/O bandwidth by exploiting a novel binary-weight streaming approach. It supports arbitrarily sized convolutional neural network architectures and input resolutions by exploiting the natural scalability of its compute units at both chip level and system level, arranging Hyperdrive chips systolically in a 2D mesh that processes the entire feature map in parallel. Hyperdrive achieves a system-level efficiency of 4.3 TOp/s/W (i.e., including I/Os), 3.1x higher than state-of-the-art BWN accelerators, even though its core uses resource-intensive FP16 arithmetic for increased robustness.
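For illustration, the following minimal Python/NumPy sketch shows the binary-weight convolution idea the abstract builds on: weights are constrained to {-1, +1}, so each multiply-accumulate reduces to a conditional sign flip plus an addition, while activations and accumulation stay in FP16 as in Hyperdrive's core. This is only a functional sketch, not the paper's implementation; the names binary_weight_conv2d, x, w_bin, and alpha, as well as the single 'same'-padded layer, are illustrative assumptions.

import numpy as np

def binary_weight_conv2d(x, w_bin, alpha):
    """Functional sketch (not the paper's design) of a BWN convolution.

    x     : FP16 input feature map, shape (C_in, H, W)
    w_bin : binary weights in {-1, +1}, shape (C_out, C_in, K, K)
    alpha : FP16 per-output-channel scale factors, shape (C_out,)
    """
    c_out, c_in, k, _ = w_bin.shape
    _, h, w = x.shape
    pad = k // 2  # 'same' zero padding
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad))).astype(np.float16)
    y = np.zeros((c_out, h, w), dtype=np.float16)
    for co in range(c_out):
        for i in range(h):
            for j in range(w):
                patch = xp[:, i:i + k, j:j + k]
                # With binary weights, each MAC degenerates to a sign
                # flip followed by an FP16 accumulation.
                acc = np.where(w_bin[co] > 0, patch, -patch).sum(dtype=np.float16)
                y[co, i, j] = alpha[co] * acc
    return y

# Example: 3x3 kernel, 2 input and 4 output channels on an 8x8 map.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8)).astype(np.float16)
w_bin = np.sign(rng.standard_normal((4, 2, 3, 3))).astype(np.float16)
alpha = np.ones(4, dtype=np.float16)
print(binary_weight_conv2d(x, w_bin, alpha).shape)  # (4, 8, 8)

Because the weights carry a single bit each, an accelerator can stream them from off-chip at a fraction of the bandwidth that full-precision weights would require, which is the property the binary-weight streaming approach exploits.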
Files in this item:

File: HYPERDRIVE.pdf (open access)
Type: Postprint
License: Free open-access license
Size: 5.93 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/703308
Citations
  • PMC: ND
  • Scopus: 15
  • Web of Science: 16