
Andri, Renzo; Cavigelli, Lukas; Rossi, Davide; Benini, Luca. "Hyperdrive: A Systolically Scalable Binary-Weight CNN Inference Engine for mW IoT End-Nodes." In Proceedings of the 17th IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2018), Hong Kong, 2018, pp. 509-515. [10.1109/ISVLSI.2018.00099]

Hyperdrive: A systolically scalable binary-weight CNN Inference Engine for mW IoT End-Nodes

Rossi, Davide; Benini, Luca
2018

Abstract

Deep neural networks have achieved impressive results in computer vision and machine learning. Unfortunately, state-of-the-art networks are extremely compute- and memory-intensive, which makes them unsuitable for mW devices such as IoT end-nodes. Aggressive quantization of these networks dramatically reduces their computation and memory footprint. Binary-weight neural networks (BWNs) follow this trend, pushing weight quantization to the limit. Hardware accelerators for BWNs presented up to now have focused on core efficiency, disregarding the I/O bandwidth and system-level efficiency that are crucial for deploying accelerators in ultra-low-power devices. We present Hyperdrive: a BWN accelerator that dramatically reduces I/O bandwidth by exploiting a novel binary-weight streaming approach, and that is capable of handling high-resolution images by virtue of its systolically scalable architecture. We achieve a system-level efficiency of 5.9 TOp/s/W (i.e., including I/Os), 2.2x higher than state-of-the-art BNN accelerators, even though our core uses resource-intensive FP16 arithmetic for increased robustness.
Year: 2018
Published in: Proceedings of IEEE Computer Society Annual Symposium on VLSI, ISVLSI
Pages: 509-515
Andri, Renzo; Cavigelli, Lukas; Rossi, Davide; Benini, Luca
Files in this item:
Attachments, if any, are not shown

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/653394
Warning: the data shown have not been validated by the university.

Citations
  • Scopus: 16
  • Web of Science: 10