Efficient Pipelined Execution of CNNs Based on In-Memory Computing and Graph Homomorphism Verification

Dazzi, M.; Sebastian, A.; Parnell, T.; Francese, P. A.; Benini, L.; Eleftheriou, E.

doi:10.1109/TC.2021.3073255

In-memory computing is an emerging computing paradigm enabling deep-learning inference at significantly higher energy-efficiency and reduced latency. The essential idea is mapping the synaptic weights of each layer to one or more in-memory computing (IMC) cores. During inference, these cores perform the associated matrix-vector multiplications in place with O(1) time complexity, obviating the need to move the synaptic weights to additional processing units. Moreover, this architecture enables the execution of these networks in a highly pipelined fashion. However, a key challenge is designing an efficient communication fabric for the IMC cores. In this work, we present one such communication fabric based on a graph topology that is well-suited for the widely successful convolutional neural networks (CNNs). We show that this communication fabric facilitates the pipelined execution of all state-of-the-art CNNs by proving the existence of a homomorphism between the graph representations of these networks and that corresponding to the proposed communication fabric. We then present a quantitative comparison with established communication topologies and show that our proposed topology achieves the lowest bandwidth requirements per communication channel. Finally, we present one hardware implementation and show a concrete example of mapping ResNet-32 onto an IMC core array interconnected via the proposed communication fabric.
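The homomorphism argument in the abstract can be illustrated with a toy check. The sketch below is not the authors' verification procedure, and the graphs in it are hypothetical: a three-layer block with a skip connection stands in for a CNN, and two small directed graphs stand in for candidate communication fabrics. A mapping from layers to cores is a graph homomorphism when every layer-to-layer edge lands on an existing core-to-core channel. The brute-force search is exponential and is meant only for graphs of this size.

```python
from itertools import product


def is_homomorphism(src_edges, dst_edges, mapping):
    """Return True if every source edge (u, v) maps onto a destination edge."""
    dst = set(dst_edges)
    return all((mapping[u], mapping[v]) in dst for u, v in src_edges)


def find_homomorphism(src_nodes, src_edges, dst_nodes, dst_edges):
    """Brute-force search over all node assignments (toy graphs only)."""
    for image in product(dst_nodes, repeat=len(src_nodes)):
        mapping = dict(zip(src_nodes, image))
        if is_homomorphism(src_edges, dst_edges, mapping):
            return mapping
    return None


# Hypothetical CNN fragment: a -> b -> c plus a skip connection a -> c.
layers = ["a", "b", "c"]
layer_edges = [("a", "b"), ("b", "c"), ("a", "c")]

# Hypothetical fabrics: a plain chain, and a chain with one extra channel.
cores = [0, 1, 2]
chain_fabric = [(0, 1), (1, 2)]
skip_fabric = [(0, 1), (1, 2), (0, 2)]

print(find_homomorphism(layers, layer_edges, cores, chain_fabric))
print(find_homomorphism(layers, layer_edges, cores, skip_fabric))
```

In this toy case the plain chain admits no homomorphism because nothing can carry the skip connection, while the fabric with the extra channel does. A practical placement would likely also require the mapping to be injective (one layer group per core); that constraint is an assumption here and is not enforced by the sketch.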

Dazzi M., Sebastian A., Parnell T., Francese P.A., Benini L., Eleftheriou E. (2021). Efficient Pipelined Execution of CNNs Based on In-Memory Computing and Graph Homomorphism Verification. IEEE TRANSACTIONS ON COMPUTERS, 70(6), 922-935 [10.1109/TC.2021.3073255].

Year: 2021
Journal: IEEE TRANSACTIONS ON COMPUTERS
DOI: https://dx.doi.org/10.1109/TC.2021.3073255
Publication type: 1.01 Journal article

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/860012

Note: the data shown here have not been validated by the university.
