A Microcontroller is All You Need: Enabling Transformer Execution on Low-Power IoT Endnodes / Burrello A.; Scherer M.; Zanghieri M.; Conti F.; Benini L.. - ELECTRONIC. - (2021), pp. 1-6. (Paper presented at the 2021 IEEE International Conference on Omni-Layer Intelligent Systems, COINS 2021, held in esp in 2021) [10.1109/COINS51742.2021.9524173].

A Microcontroller is All You Need: Enabling Transformer Execution on Low-Power IoT Endnodes

Burrello A. (first author); Zanghieri M.; Conti F. (penultimate author); Benini L. (last author)
2021

Abstract

Transformer networks have become state-of-the-art for many tasks such as NLP and are closing the gap on other tasks like image recognition. Similarly, Transformers and Attention methods are starting to attract attention on smaller-scale tasks that fit the typical memory envelope of MCUs. In this work, we propose a new set of execution kernels tuned for efficient execution on MCU-class RISC-V and ARM Cortex-M cores. We focus on minimizing memory movements while maximizing data reuse in the Attention layers. With our library, we obtain 3.4×, 1.8×, and 2.1× lower latency and energy on 8-bit Attention layers, compared to previous state-of-the-art (SoA) linear and matrix multiplication kernels in the CMSIS-NN and PULP-NN libraries on the STM32H7 (Cortex-M7), STM32L4 (Cortex-M4), and GAP8 (RISC-V IMC-Xpulp) platforms, respectively. As a use case for our TinyTransformer library, we also demonstrate that we can fit a 263 kB Transformer on the GAP8 platform, outperforming the previous SoA convolutional architecture on the TinyRadarNN dataset, with a latency of 9.24 ms, an energy consumption of 0.47 mJ, and an accuracy improvement of 3.5%.
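To illustrate the kind of kernel the abstract refers to, the C sketch below shows an 8-bit attention-score matrix multiply S = Q·K^T with int32 accumulation, 2×2 output blocking for register-level data reuse, and a fixed-point requantization back to int8. This is not the paper's TinyTransformer implementation: the requantization parameters (mult, shift) and the function names are illustrative assumptions. On the platforms mentioned above, the inner loop would normally be accelerated with DSP/SIMD multiply-accumulate instructions (e.g., SMLAD on Cortex-M4/M7, or the Xpulp SIMD dot-product extensions used by PULP-NN), which this plain-C sketch omits for clarity.

/* Minimal sketch (not the paper's kernels): 8-bit attention scores S = Q * K^T
 * with int32 accumulation and 2x2 output blocking for data reuse.
 * Assumption: seq_len is even; mult/shift are illustrative requantization params. */
#include <stdint.h>

static inline int8_t requant(int32_t acc, int32_t mult, int32_t shift) {
    /* Scale the int32 accumulator back into the int8 range (illustrative). */
    int32_t v = (int32_t)(((int64_t)acc * mult) >> shift);
    if (v > 127) v = 127;
    if (v < -128) v = -128;
    return (int8_t)v;
}

/* Q: [seq_len x d_head], K: [seq_len x d_head], row-major int8.
 * S: [seq_len x seq_len] attention scores, requantized to int8.
 * The 2x2 blocking keeps two Q rows and two K rows live per inner iteration,
 * so every loaded element is reused twice, cutting memory traffic. */
void attention_scores_q8(const int8_t *Q, const int8_t *K, int8_t *S,
                         int seq_len, int d_head,
                         int32_t mult, int32_t shift) {
    for (int i = 0; i < seq_len; i += 2) {
        for (int j = 0; j < seq_len; j += 2) {
            int32_t a00 = 0, a01 = 0, a10 = 0, a11 = 0;
            for (int k = 0; k < d_head; k++) {
                int32_t q0 = Q[i * d_head + k];
                int32_t q1 = Q[(i + 1) * d_head + k];
                int32_t k0 = K[j * d_head + k];
                int32_t k1 = K[(j + 1) * d_head + k];
                a00 += q0 * k0;  a01 += q0 * k1;
                a10 += q1 * k0;  a11 += q1 * k1;
            }
            S[i * seq_len + j]           = requant(a00, mult, shift);
            S[i * seq_len + j + 1]       = requant(a01, mult, shift);
            S[(i + 1) * seq_len + j]     = requant(a10, mult, shift);
            S[(i + 1) * seq_len + j + 1] = requant(a11, mult, shift);
        }
    }
}

The same blocking idea generalizes to larger tiles and to the score-times-value product of the Attention layer; the paper's contribution lies in tuning such kernels to the memory hierarchies and ISAs of the specific MCU targets.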
2021
2021 IEEE International Conference on Omni-Layer Intelligent Systems, COINS 2021
pp. 1-6
Burrello A.; Scherer M.; Zanghieri M.; Conti F.; Benini L.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/847028

Citations
  • Scopus: 11