Garofalo, A., Tagliavini, G., Conti, F., Rossi, D., Benini, L. (2020). XpulpNN: Accelerating Quantized Neural Networks on RISC-V Processors Through ISA Extensions. Institute of Electrical and Electronics Engineers Inc. (IEEE) [10.23919/DATE48585.2020.9116529].
XpulpNN: Accelerating Quantized Neural Networks on RISC-V Processors Through ISA Extensions
Garofalo, Angelo; Tagliavini, Giuseppe; Conti, Francesco; Rossi, Davide; Benini, Luca
2020
Abstract
Strongly quantized fixed-point arithmetic is considered the key direction to enable the inference of CNNs on low-power, resource-constrained edge devices. However, the deployment of highly quantized neural networks (QNNs) at the extreme edge of IoT, on fully programmable MCUs, is currently limited by the lack of support, at the Instruction Set Architecture (ISA) level, for sub-byte fixed-point data types. Low-bitwidth (i.e., 2- and 4-bit) QNN kernels therefore need numerous additional instructions for packing and unpacking data, which creates a bottleneck for the performance and energy efficiency of QNN inference. In this work we present a set of extensions to the RISC-V ISA, aimed at boosting the energy efficiency of low-bitwidth QNNs on low-power microcontroller-class cores. The microarchitecture supporting the new extensions is built on top of a RISC-V core featuring instruction set extensions targeting energy-efficient digital signal processing. To evaluate the extensions, we integrated the core into a full microcontroller system, synthesized and placed and routed in 22 nm FDX technology. QNN convolution kernels implemented on the new core run 5.3× and 8.9× faster with 4- and 2-bit data operands, respectively, compared to the baseline processor, which supports only 8-bit SIMD instructions. With a peak of 279 GMAC/s/W, the proposed solution achieves 9× better energy efficiency than the baseline and two orders of magnitude better energy efficiency than state-of-the-art microcontrollers.

File | Type | License | Size | Format |
---|---|---|---|---|
XpulpNN_DATE_2020 (3).pdf (open access) | Postprint | Free open-access license | 675.71 kB | Adobe PDF |
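The packing and unpacking overhead mentioned in the abstract can be illustrated with a minimal C sketch of a 4-bit dot product written for a core without sub-byte support. This is not code from the paper: the function names `unpack_int4`, `pack_int4x8`, and `dot_int4_sw` are hypothetical, and the layout (eight signed 4-bit elements per 32-bit word, element 0 in the least significant nibble) is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extract the i-th signed 4-bit element (i = 0..7) from a 32-bit word.
 * Assumption: element 0 occupies the least significant nibble. */
static inline int32_t unpack_int4(uint32_t word, unsigned i)
{
    int32_t nibble = (int32_t)((word >> (4u * i)) & 0xFu);
    /* Sign-extend the two's-complement nibble from 4 bits to 32 bits. */
    return (nibble ^ 0x8) - 0x8;
}

/* Pack eight signed 4-bit values (each in [-8, 7]) into one 32-bit word. */
static inline uint32_t pack_int4x8(const int8_t v[8])
{
    uint32_t word = 0;
    for (unsigned i = 0; i < 8; ++i) {
        word |= ((uint32_t)v[i] & 0xFu) << (4u * i);
    }
    return word;
}

/* Software-only dot product over packed 4-bit operands.
 * Every multiply-accumulate is preceded by a shift, a mask, and a sign
 * extension per operand: this is the kind of extra work that a core
 * without sub-byte data types has to spend on unpacking. */
int32_t dot_int4_sw(const uint32_t *a, const uint32_t *b, size_t n_words)
{
    assert(a != NULL && b != NULL);
    int32_t acc = 0;
    for (size_t w = 0; w < n_words; ++w) {
        for (unsigned i = 0; i < 8; ++i) {
            acc += unpack_int4(a[w], i) * unpack_int4(b[w], i);
        }
    }
    return acc;
}
```

Calling `dot_int4_sw` on words produced by `pack_int4x8` gives the same result as a plain integer dot product over the unpacked values; the point of the sketch is only to show how many auxiliary operations surround each multiply-accumulate when the ISA offers no sub-byte data types.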
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.