Pullini, A., Conti, F., Rossi, D., Loi, I., Gautschi, M., Benini, L. (2016). A heterogeneous multi-core system-on-chip for energy efficient brain inspired vision. Institute of Electrical and Electronics Engineers Inc. [10.1109/ISCAS.2016.7539213].
A heterogeneous multi-core system-on-chip for energy efficient brain inspired vision
Conti, Francesco; Rossi, Davide; Loi, Igor; Benini, Luca
2016
Abstract
Computer vision (CV) based on Convolutional Neural Networks (CNNs) is a rapidly developing field thanks to CNNs' flexibility, strong generalization capability, and classification accuracy (matching and sometimes exceeding human performance). CNN-based classifiers are typically deployed on servers or high-end embedded platforms. However, their ability to compress low-information-density data such as images into highly informative classification tags makes them extremely interesting for wearable and IoT scenarios, provided their computational requirements can be fit within deeply embedded devices such as visual sensor nodes. We propose a 65 nm system-on-chip implementing a hybrid HW/SW CNN accelerator while meeting the energy efficiency target of such devices. The SoC integrates a near-threshold parallel processor cluster [1] and a hardware accelerator for convolution-accumulation operations [2], which constitute the basic kernel of CNNs. It achieves a peak performance of 11.2 GMAC/s at 1.2 V and a peak energy efficiency of 261 GMAC/s/W at 0.65 V.
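To make the "convolution-accumulation" kernel and the GMAC/s/W figure more concrete, the following is a minimal software sketch, not the accelerator's actual datapath. The function names `conv_accumulate` and `energy_per_mac_pj` are hypothetical, and the kernel size, data types, and loop order are assumptions chosen only for illustration. It adds a valid 2D convolution (in the cross-correlation form commonly used in CNNs) into an accumulator, counts the multiply-accumulate (MAC) operations, and converts an efficiency in GMAC/s/W into energy per MAC.

```python
def conv_accumulate(acc, image, kernel):
    """Add the valid 2D convolution of `image` with `kernel` into `acc`.

    `acc` is updated in place, mirroring how partial sums from several
    input channels are accumulated into one output feature map.
    Returns the number of multiply-accumulate (MAC) operations performed.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    macs = 0
    for y in range(out_h):
        for x in range(out_w):
            s = acc[y][x]  # start from the previously accumulated value
            for i in range(kh):
                for j in range(kw):
                    s += image[y + i][x + j] * kernel[i][j]
                    macs += 1
            acc[y][x] = s
    return macs


def energy_per_mac_pj(gmacs_per_watt):
    """Convert an efficiency in GMAC/s/W into energy per MAC in picojoules.

    GMAC/s/W equals GMAC/J, so J/MAC = 1 / (g * 1e9) and pJ/MAC = 1000 / g.
    """
    return 1000.0 / gmacs_per_watt


if __name__ == "__main__":
    image = [[1] * 4 for _ in range(4)]   # illustrative 4x4 input
    kernel = [[1] * 3 for _ in range(3)]  # illustrative 3x3 filter
    acc = [[0] * 2 for _ in range(2)]     # 2x2 output accumulator
    print(conv_accumulate(acc, image, kernel), acc)
    print(round(energy_per_mac_pj(261), 2), "pJ/MAC")
```

Under these assumptions, the stated peak efficiency of 261 GMAC/s/W corresponds to roughly 3.8 pJ per MAC at 0.65 V.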