Work-in-Progress: Quantized NNs as the Definitive solution for inference on low-power ARM MCUs?

Rusci, Manuele; Capotondi, Alessandro; Conti, Francesco; Benini, Luca

2018

Abstract

High energy efficiency and low memory footprint are the key requirements for the deployment of deep-learning-based analytics on low-power microcontrollers. Here we present work-in-progress results with Q-bit Quantized Neural Networks (QNNs) deployed on a commercial Cortex-M7 class microcontroller by means of an extension to the ARM CMSIS-NN library. We show that i) for Q=4 and Q=2, low-memory-footprint QNNs can be deployed with an energy overhead of 30% and 36%, respectively, compared with the 8-bit CMSIS-NN, due to the lack of quantization support in the ISA; ii) for Q=1, native instructions can be used, yielding an energy and latency reduction of ∼3.8× with respect to CMSIS-NN. Our initial results suggest that a small set of QNN-related specialized instructions could improve performance by as much as 7.5× for Q=4, 13.6× for Q=2 and 6.5× for binary NNs.
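
The abstract does not describe the kernels themselves, so the following is only an illustrative sketch of why sub-byte weights cost extra work on a core without quantization support, and why the binary case can be cheaper. Everything in it is an assumption made for illustration: the function names (`dot_q4`, `dot_q1`, `sext4`, `popcount32`), the packing layout (two 4-bit weights per byte, low nibble first; one bit per binary value, with 1 meaning +1 and 0 meaning -1), and the portable software popcount. None of it is taken from the paper or from the CMSIS-NN API.

```c
#include <stddef.h>
#include <stdint.h>

/* Sign-extend a 4-bit two's-complement value (0..15) to int8_t. */
static inline int8_t sext4(uint8_t nibble) {
    return (int8_t)(((nibble & 0x0F) ^ 0x08) - 8);
}

/* Q=4 sketch (hypothetical layout): 8-bit activations times 4-bit weights,
 * two weights packed per byte, low nibble first. Each weight must be
 * extracted and sign-extended in software before the multiply, which is
 * the kind of extra work a missing sub-byte instruction implies. */
int32_t dot_q4(const int8_t *act, const uint8_t *packed_w, size_t n) {
    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        uint8_t byte = packed_w[i / 2];
        uint8_t nib = (i % 2 == 0) ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);
        acc += (int32_t)act[i] * sext4(nib);
    }
    return acc;
}

/* Portable bit count of a 32-bit word (no hardware popcount assumed). */
static inline uint32_t popcount32(uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Q=1 sketch (hypothetical encoding): values in {-1, +1} stored one per
 * bit. XOR marks positions where the signs differ; the dot product is
 * (matches - mismatches) = total_bits - 2 * mismatches. */
int32_t dot_q1(const uint32_t *a, const uint32_t *w, size_t words) {
    int32_t mismatches = 0;
    for (size_t i = 0; i < words; ++i) {
        mismatches += (int32_t)popcount32(a[i] ^ w[i]);
    }
    return (int32_t)(32 * words) - 2 * mismatches;
}
```

In this sketch the Q=4 loop spends a shift, a mask and a sign extension per weight on top of the multiply-accumulate, whereas the Q=1 loop processes 32 values per word with plain bitwise operations. How the authors' extension actually packs data and which instructions it uses is not stated in this record.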

Year: 2018
Volume title: 2018 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
First page: 1
Last page: 2
DOI: https://dx.doi.org/10.1109/CODESISSS.2018.8525915
Citation: Rusci, M., Capotondi, A., Conti, F., Benini, L. (2018). Work-in-Progress: Quantized NNs as the Definitive solution for inference on low-power ARM MCUs? Institute of Electrical and Electronics Engineers Inc. [10.1109/CODESISSS.2018.8525915].
All authors: Rusci, Manuele; Capotondi, Alessandro; Conti, Francesco; Benini, Luca
Appears in types: 4.01 Conference proceedings contribution

Files in this item:

File	Size	Format
Binder2.pdf (Open Access from 10/05/2019; Type: Postprint; License: free open access license)	418.99 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/652922
