Low bit-width Quantized Neural Networks (QNNs) enable deployment of complex machine learning models on constrained devices such as microcontrollers (MCUs) by reducing their memory footprint. Fine-grained asymmetric quantization (i.e., different bit-widths assigned to weights and activations on a tensor-by-tensor basis) is a particularly interesting scheme to maximize accuracy under a tight memory constraint. However, the lack of sub-byte instruction set architecture (ISA) support in state-of-the-art microprocessors makes it hard to fully exploit this extreme quantization paradigm in embedded MCUs. Supporting sub-byte and asymmetric QNNs would require many precision formats and an exorbitant amount of opcode space. In this work, we attack this problem with status-based SIMD instructions: rather than encoding precision explicitly in the instruction, each operand's precision is set dynamically in a core status register. We propose MPIC (Mixed Precision Inference Core), a novel RISC-V core based on the open-source RI5CY core. Our approach enables full support for mixed-precision QNN inference with 292 different combinations of operands at 16-, 8-, 4-, and 2-bit precision, without adding any extra opcode or increasing the complexity of the decode stage. Our results show that MPIC improves both performance and energy efficiency by a factor of 1.1-4.9x when compared to software-based mixed-precision on RI5CY; with respect to commercially available Cortex-M4 and M7 microcontrollers, it delivers 3.6-11.7x better performance and 41-155x higher efficiency.
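
The status-based idea in the abstract can be illustrated with a small software emulation. This is a minimal sketch, not the MPIC instruction set: the names `StatusRegister`, `pack`, `unpack`, `sdotp`, and the `chunk` argument are hypothetical, and the way the narrower operand's word is sliced is an assumption made only to keep the example short. The point it shows is that a single dot-product operation can serve every precision pair, because the lane widths are read from a status object rather than passed with each call.

```python
from dataclasses import dataclass

WORD_BITS = 32  # operands are packed into 32-bit words


@dataclass
class StatusRegister:
    """Hypothetical status register holding the current operand precisions."""
    act_bits: int = 8  # activation precision: 16, 8, 4 or 2
    wgt_bits: int = 8  # weight precision: 16, 8, 4 or 2


def pack(values, bits):
    """Pack signed integers into one 32-bit word, element 0 in the low bits."""
    assert len(values) * bits <= WORD_BITS
    mask = (1 << bits) - 1
    word = 0
    for i, v in enumerate(values):
        assert -(1 << (bits - 1)) <= v < (1 << (bits - 1))
        word |= (v & mask) << (i * bits)
    return word


def unpack(word, bits):
    """Unpack a 32-bit word into signed elements of the given width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    out = []
    for i in range(WORD_BITS // bits):
        raw = (word >> (i * bits)) & mask
        out.append(raw - (1 << bits) if raw & sign else raw)
    return out


def sdotp(csr, act_word, wgt_word, acc=0, chunk=0):
    """Sum-of-dot-products whose lane widths come from csr, not from the call.

    The wider operand fixes the lane count; `chunk` selects which slice of the
    narrower operand's word is consumed (an assumption of this sketch).
    """
    acts = unpack(act_word, csr.act_bits)
    wgts = unpack(wgt_word, csr.wgt_bits)
    lanes = min(len(acts), len(wgts))
    if len(acts) > lanes:
        acts = acts[chunk * lanes:(chunk + 1) * lanes]
    if len(wgts) > lanes:
        wgts = wgts[chunk * lanes:(chunk + 1) * lanes]
    return acc + sum(a * w for a, w in zip(acts, wgts))


# Mixed precision: 8-bit activations (4 per word), 4-bit weights (8 per word).
csr = StatusRegister(act_bits=8, wgt_bits=4)
a0 = pack([1, -2, 3, -4], 8)
a1 = pack([5, 6, -7, 8], 8)
w = pack([1, 2, 3, 4, -1, -2, -3, -4], 4)
acc = sdotp(csr, a0, w, chunk=0)       # uses weights 0..3
acc = sdotp(csr, a1, w, acc, chunk=1)  # uses weights 4..7

# Switching format only rewrites the status register; the call is unchanged.
csr2 = StatusRegister(act_bits=2, wgt_bits=2)
ones = pack([1] * 16, 2)
acc2 = sdotp(csr2, ones, ones)
```

In this sketch, changing from the 8-bit/4-bit case to the 2-bit/2-bit case touches only the status object, which mirrors the abstract's claim that no extra opcodes are needed per precision combination.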

Ottavi G., Garofalo A., Tagliavini G., Conti F., Benini L., Rossi D. (2020). A mixed-precision RISC-V processor for extreme-edge DNN inference. IEEE Computer Society [10.1109/ISVLSI49217.2020.000-5].

A mixed-precision RISC-V processor for extreme-edge DNN inference

Ottavi G.; Garofalo A.; Tagliavini G.; Conti F.; Benini L.; Rossi D.
2020

Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI, 2020, pp. 512-517
Files in this record:
ISVLSI_STATUS_BASED.pdf

Open Access from 05/02/2021

Type: Postprint
License: Free open-access license
Size: 776.08 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/776845