HTVM: Efficient Neural Network Deployment On Heterogeneous TinyML Platforms / Van Delm, J.; Vandersteegen, M.; Burrello, A.; Sarda, G.M.; Conti, F.; Pagliari, D.J.; Benini, L.; Verhelst, M. - ELECTRONIC. - (2023), pp. 1-6. (Paper presented at the 2023 60th ACM/IEEE Design Automation Conference (DAC), held in San Francisco, USA, in 2023) [10.1109/DAC56929.2023.10247664].
HTVM: Efficient Neural Network Deployment On Heterogeneous TinyML Platforms
Burrello, A.; Conti, F.; Benini, L.
2023
Abstract
Optimal deployment of deep neural networks (DNNs) on state-of-the-art Systems-on-Chips (SoCs) is crucial for tiny machine learning (TinyML) at the edge. The complexity of these SoCs makes deployment non-trivial, as they typically contain multiple heterogeneous compute cores with limited, programmer-managed memory that must be exploited to optimize latency and energy efficiency. We propose HTVM, a compiler that merges TVM with DORY to maximize the utilization of heterogeneous accelerators and minimize data movements. HTVM allows deploying the MLPerf Tiny suite on DIANA, an SoC with a RISC-V CPU and digital and analog compute-in-memory AI accelerators, at 120x improved performance over plain TVM deployment.