Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference

Benini, Luca;
2023

Abstract

In this paper, we present Quark, an integer RISC-V vector processor specifically tailored for sub-byte DNN inference. Quark is implemented in GlobalFoundries' 22FDX FD-SOI technology and is built on top of Ara, an open-source 64-bit RISC-V vector processor. To accommodate sub-byte DNN inference, Quark extends Ara with specialized vector instructions for sub-byte quantized operations. We also remove the floating-point unit from Quark's lanes and use the CVA6 RISC-V scalar core for the re-scaling operations required in quantized neural network inference. As a result, each Quark lane is 2 times smaller and 1.9 times more power efficient than an Ara lane. We show that Quark can run quantized models at sub-byte precision. Notably, for 1-bit and 2-bit quantized models, Quark accelerates Conv2d computation across a range of input and kernel sizes.
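
As a rough illustration of why sub-byte integer arithmetic maps well onto bitwise hardware, the sketch below computes a dot product of unsigned low-bit values using only AND, population count, and shifts, then applies a floating-point re-scaling step separately. This is a generic bit-plane technique written for illustration; the abstract does not specify the semantics of Quark's instructions, so the function names (`to_bit_planes`, `bitserial_dot`, `rescale`) and the unsigned encoding are assumptions, not the authors' implementation.

```python
def to_bit_planes(values, bits):
    """Pack unsigned `bits`-bit integers into one Python int per bit plane.

    Bit i of plane b holds bit b of values[i]. (Hypothetical helper.)
    """
    planes = []
    for b in range(bits):
        plane = 0
        for i, v in enumerate(values):
            plane |= ((v >> b) & 1) << i
        planes.append(plane)
    return planes


def bitserial_dot(weights, activations, w_bits, a_bits):
    """Integer dot product built from AND + popcount + shift.

    Assumes unsigned operands; signed or {-1, +1} encodings would need
    an extra correction term that is not shown here.
    """
    w_planes = to_bit_planes(weights, w_bits)
    a_planes = to_bit_planes(activations, a_bits)
    acc = 0
    for i, wp in enumerate(w_planes):
        for j, ap in enumerate(a_planes):
            # Each matching pair of set bits contributes 2**(i + j).
            acc += bin(wp & ap).count("1") << (i + j)
    return acc


def rescale(acc, w_scale, a_scale):
    """Map the integer accumulator back to real values.

    In a design without a vector FPU, a step like this would run on the
    scalar side; the scale factors here are illustrative.
    """
    return acc * w_scale * a_scale


weights = [3, 1, 0, 2]      # 2-bit unsigned weights
activations = [1, 2, 3, 1]  # 2-bit unsigned activations
acc = bitserial_dot(weights, activations, w_bits=2, a_bits=2)
print(acc, rescale(acc, 0.5, 0.25))
```

The integer accumulator matches the ordinary dot product `sum(w * a for w, a in zip(weights, activations))`, so the bitwise path can be checked against a plain reference for any unsigned inputs.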
2023 IEEE International Symposium on Circuits and Systems (ISCAS)
AskariHemmat, M., Dupuis, T., Fournier, Y., El Zarif, N., Cavalcante, M., Perotti, M., et al. (2023). Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference. In 2023 IEEE International Symposium on Circuits and Systems (ISCAS). New York, NY, USA: IEEE. doi:10.1109/ISCAS46773.2023.10181985.
AskariHemmat, MohammadHossein; Dupuis, Théo; Fournier, Yoan; El Zarif, Nizar; Cavalcante, Matheus; Perotti, Matteo; Gürkaynak, Frank; Benini, Luca; Le...
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/958804