A 1036 TOp/s/W, 12.2 mW, 2.72 µJ/Inference All-Digital TNN Accelerator in 22 nm FDX Technology for TinyML Applications / Scherer, M.; Di Mauro, A.; Rutishauser, G.; Fischer, T.; Benini, L. - ELECTRONIC. - (2022), pp. 1-3. (Paper presented at the Symposium in Low-Power and High-Speed Chips (COOL CHIPS), held in Tokyo, Japan, 20-22 April 2022) [10.1109/COOLCHIPS54332.2022.9772668].
A 1036 TOp/s/W, 12.2 mW, 2.72 µJ/Inference All-Digital TNN Accelerator in 22 nm FDX Technology for TinyML Applications
Scherer, M.; Di Mauro, A.; Rutishauser, G.; Fischer, T.; Benini, L.
2022
Abstract
Tiny Machine Learning (TinyML) applications impose µJ/inference energy constraints, with a maximum power consumption of a few tens of mW. Meeting these requirements at a reasonable accuracy level is extremely challenging. In this work, we address this challenge with a flexible, fully digital Ternary Neural Network (TNN) accelerator integrated into a RISC-V-based SoC. The design achieves 2.72 µJ/inference, 12.2 mW, and 3200 inferences/s at 0.5 V for a non-trivial 9-layer, 96-channels-per-layer network with 86% CIFAR-10 accuracy. The peak energy efficiency is 1036 TOp/s/W, outperforming the state of the art in silicon-proven TinyML accelerators by 1.67×.