Many-core architectures structured as fabrics of tightly-coupled clusters have shown promising results on embedded computer vision benchmarks, providing state-of-the-art performance with a reduced power budget. We propose PULP (Parallel processing Ultra-Low Power platform), an architecture built on clusters of tightly-coupled OpenRISC ISA cores, with advanced techniques for fast performance and energy scalability that exploit the capabilities of the STMicroelectronics UTB FD-SOI 28 nm technology. As a use case for PULP, we show that a computationally demanding vision kernel based on Convolutional Neural Networks can be quickly and efficiently switched from a low-power, low-frame-rate operating point to a high-frame-rate one when a detection is performed. Our results show that PULP performance can be scaled over a 1x-354x range, with a peak performance/power efficiency of 211 GOPS/W.
Energy-efficient vision on the PULP platform for ultra-low power parallel computing / Conti, Francesco; Rossi, Davide; Pullini, Antonio; Loi, Igor; Benini, Luca. - Print. - (2014), pp. 6986099.1-6986099.6. (Paper presented at the 2014 IEEE Workshop on Signal Processing Systems, SiPS 2014, held at Queen's University Belfast, GBR, in 2014) [10.1109/SiPS.2014.6986099].
Energy-efficient vision on the PULP platform for ultra-low power parallel computing
CONTI, FRANCESCO;ROSSI, DAVIDE;LOI, IGOR;BENINI, LUCA
2014