Random forests (RFs) use a collection of decision trees (DTs) to perform classification or regression. RFs are adopted in a wide variety of machine learning (ML) applications, and they are increasingly used in scenarios at the extreme edge of the Internet of Things (TinyML), where memory constraints are particularly tight. This article addresses the optimization of the computational and storage costs of running DTs on the microcontroller units (MCUs) typically deployed in TinyML scenarios. We introduce three alternative DT kernels optimized for memory- and compute-limited MCUs, providing insight into the key memory-latency tradeoffs on an open-source RISC-V platform. We identify key bottlenecks and demonstrate that software (SW) optimizations enable significant reductions in memory footprint and latency. Experimental results show that the optimized kernels achieve up to 4.5 µs latency, 4.8× speedup, and 45% storage reduction against the widely adopted naive DT design. We carry out a detailed performance and energy cost analysis of various optimized DT variants: the best approach requires just 8 instructions and 0.155 pJ per decision.
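The abstract does not spell out the naive design or the three optimized kernels, so the following is only a generic sketch of array-based DT inference with a majority vote across trees, the basic operation the article sets out to optimize. The node layout (`dt_node_t`), the `DT_LEAF` marker, the `RF_MAX_CLASSES` bound, the function names, and the toy forest are all hypothetical and are not taken from the paper.

```c
#include <assert.h>
#include <stdint.h>

#define DT_LEAF 0xFFu      /* hypothetical marker: node has no children */
#define RF_MAX_CLASSES 8   /* hypothetical upper bound for the vote buffer */

/* Hypothetical node layout (an assumption, not the paper's format). */
typedef struct {
    int16_t value;    /* split threshold, or class label at a leaf */
    uint8_t feature;  /* index of the feature compared at this node */
    uint8_t left;     /* child index if x[feature] <= value, else DT_LEAF */
    uint8_t right;    /* child index if x[feature] >  value */
} dt_node_t;

/* Walk one tree from the root (index 0) down to a leaf. */
static int dt_predict(const dt_node_t *nodes, const int16_t *x)
{
    uint8_t i = 0;
    while (nodes[i].left != DT_LEAF) {
        const dt_node_t *n = &nodes[i];
        i = (x[n->feature] <= n->value) ? n->left : n->right;
    }
    return nodes[i].value;
}

/* Classify with a forest: each tree votes, the most voted class wins
 * (ties go to the lowest class index). */
static int rf_predict(const dt_node_t *const *trees, int n_trees,
                      int n_classes, const int16_t *x)
{
    int votes[RF_MAX_CLASSES] = {0};
    assert(n_classes > 0 && n_classes <= RF_MAX_CLASSES);
    for (int t = 0; t < n_trees; ++t) {
        int c = dt_predict(trees[t], x);
        assert(c >= 0 && c < n_classes);
        votes[c]++;
    }
    int best = 0;
    for (int c = 1; c < n_classes; ++c) {
        if (votes[c] > votes[best]) {
            best = c;
        }
    }
    return best;
}

/* Toy forest: three depth-1 trees over two integer features. */
static const dt_node_t tree_a[] = {
    { 10, 0, 1, 2 },                 /* root: x[0] <= 10 ? */
    { 0, 0, DT_LEAF, DT_LEAF },      /* leaf: class 0 */
    { 1, 0, DT_LEAF, DT_LEAF },      /* leaf: class 1 */
};
static const dt_node_t tree_b[] = {
    { 5, 1, 1, 2 },                  /* root: x[1] <= 5 ? */
    { 0, 0, DT_LEAF, DT_LEAF },
    { 1, 0, DT_LEAF, DT_LEAF },
};
static const dt_node_t tree_c[] = {
    { 20, 0, 1, 2 },                 /* root: x[0] <= 20 ? */
    { 0, 0, DT_LEAF, DT_LEAF },
    { 1, 0, DT_LEAF, DT_LEAF },
};
static const dt_node_t *const toy_forest[] = { tree_a, tree_b, tree_c };
```

In a sketch like this, each decision costs a few loads, a compare, and a branch, and the per-node storage is set by the field widths chosen for thresholds and indices. These are the kinds of latency and memory costs whose tradeoff the abstract refers to.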

Tabanelli E., Tagliavini G., Benini L. (2022). Optimizing Random Forest Based Inference on RISC-V MCUs at the Extreme Edge. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 41(11), 4516-4526 [10.1109/TCAD.2022.3199903].

Optimizing Random Forest Based Inference on RISC-V MCUs at the Extreme Edge

Tabanelli E.; Tagliavini G.; Benini L.
2022

Files in this record:
File: TCAD_RF_postprint.pdf
Access: open access
Type: Postprint
License: Free open-access license
Size: 3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/899719