XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference

Conti, Francesco; Schiavone, Pasquale Davide; Benini, Luca

doi:10.1109/TCAD.2018.2857019

Binary neural networks (BNNs) are promising to deliver accuracy comparable to conventional deep neural networks at a fraction of the cost in terms of memory and energy. In this paper, we introduce the XNOR neural engine (XNE), a fully digital configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid SRAM/standard cell memory. The XNE is able to fully compute convolutional and dense layers in autonomy or in cooperation with the core in the MCU to realize more complex behaviors. We show post-synthesis results in 65- A nd 22-nm technology for the XNE IP and post-layout results in 22 nm for the full MCU indicating that this system can drop the energy cost per binary operation to 21.6 fJ per operation at 0.4 V, and at the same time is flexible and performant enough to execute state-of-the-art BNN topologies such as ResNet-34 in less than 2.2 mJ per frame at 8.9 frames/s.

Conti, F., Schiavone, P.D., Benini, L. (2018). XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 37(11), 2940-2951 [10.1109/TCAD.2018.2857019].

XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference

Conti, Francesco;Schiavone, Pasquale Davide;Benini, Luca

2018

Abstract

Binary neural networks (BNNs) are promising to deliver accuracy comparable to conventional deep neural networks at a fraction of the cost in terms of memory and energy. In this paper, we introduce the XNOR neural engine (XNE), a fully digital configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid SRAM/standard cell memory. The XNE is able to fully compute convolutional and dense layers in autonomy or in cooperation with the core in the MCU to realize more complex behaviors. We show post-synthesis results in 65- A nd 22-nm technology for the XNE IP and post-layout results in 22 nm for the full MCU indicating that this system can drop the energy cost per binary operation to 21.6 fJ per operation at 0.4 V, and at the same time is flexible and performant enough to execute state-of-the-art BNN topologies such as ResNet-34 in less than 2.2 mJ per frame at 8.9 frames/s.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2018
			
	Rivista
	
				IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
			
	Codice DOI
	
				https://dx.doi.org/10.1109/TCAD.2018.2857019
			
	Citazione
	
				Conti, F., Schiavone, P.D., Benini, L. (2018). XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 37(11), 2940-2951 [10.1109/TCAD.2018.2857019].
			
	Tutti gli autori
	
						Conti, Francesco; Schiavone, Pasquale Davide; Benini, Luca
					
	Appare nelle tipologie:
	
				1.01 Articolo in rivista

File in questo prodotto:

Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/652918

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

86

56

CRIS Current Research Information System