Benini, L. (2017). Plenty of room at the bottom? Micropower deep learning for cognitive cyber physical systems. In 2017 IEEE International Workshop on Advances in Sensors and Interfaces (IWASI). https://doi.org/10.1109/IWASI.2017.7974239
Plenty of room at the bottom? Micropower deep learning for cognitive cyber physical systems
Benini, Luca
2017
Abstract
Summary form only given. Deep convolutional neural networks (CNNs) are today regarded as an extremely effective and flexible approach for extracting actionable, high-level information from the wealth of raw data produced by a wide variety of sensory data sources. CNNs are, however, computationally demanding: today they typically run on GPU-accelerated compute servers or high-end embedded platforms. Industry and academia are racing to bring CNN inference (first) and training (next) within ever tighter power envelopes while meeting real-time requirements. Recent results, including our PULP and ORIGAMI chips, demonstrate that there is plenty of room at the bottom: the pJ/OP (GOPS/mW) computational efficiency needed to deploy CNNs in the mobile/wearable scenario is within reach. However, this is not enough: a further 1000x improvement in energy efficiency, within a mW power envelope and with low-cost CMOS processes, is required to deploy CNNs in the most demanding CPS scenarios. Reaching the fJ/OP milestone will require heterogeneous (3D) integration with ultra-efficient die-to-die communication, mixed-signal pre-processing, and event-based approximate computing, while still meeting real-time requirements.
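For reference, the efficiency milestones quoted in the abstract reduce to a simple unit conversion between energy per operation and throughput per unit power:

\[
1~\text{GOPS/mW} = \frac{10^{9}~\text{op/s}}{10^{-3}~\text{W}} = 10^{12}~\text{op/J} \;\Longleftrightarrow\; 1~\text{pJ/op}
\]
\[
1000\times~\text{improvement} \;\Rightarrow\; 1~\text{fJ/op}, \quad \text{i.e.}\; \frac{10^{-3}~\text{W}}{10^{-15}~\text{J/op}} = 10^{12}~\text{op/s} = 1~\text{TOPS within a}~1~\text{mW envelope}.
\]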