Potocnik, V., Di Mauro, A., Lamberti, L., Kartsch, V., Scherer, M., Conti, F., et al. (2024). Circuits and Systems for Embodied AI: Exploring uJ Multi-Modal Perception for Nano-UAVs on the Kraken Shield. IEEE Computer Society. doi: 10.1109/ESSERC62670.2024.10719476.
Circuits and Systems for Embodied AI: Exploring uJ Multi-Modal Perception for Nano-UAVs on the Kraken Shield
Di Mauro, A.; Scherer, M.; Benini, L.
2024
Abstract
Embodied AI requires pushing complex multi-modal models to the extreme edge for time-constrained tasks such as the autonomous navigation of robots and vehicles. On small form-factor devices, e.g., nano-UAVs, these challenges are exacerbated by stringent constraints on energy efficiency and weight. In this paper, we explore embodied multi-modal AI-based perception for nano-UAVs with the Kraken shield, a 7 g multi-sensor board (with frame-based and event-based imagers) built around Kraken, a 22 nm SoC featuring multiple acceleration engines for multi-modal event-based and frame-based inference using spiking (SNN) and ternary (TNN) neural networks, respectively. Kraken can execute real-time SNN inference for depth estimation at 1.02 k inf/s and 18 μJ/inf, real-time TNN inference for object classification at 10 k inf/s and 6 μJ/inf, and real-time inference for obstacle avoidance at 221 frame/s and 750 μJ/inf.
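As a back-of-the-envelope reading of the abstract, the quoted throughput and energy-per-inference figures can be combined into sustained power and per-inference latency (power = energy per inference × inference rate). A minimal sketch; the numbers are taken directly from the abstract, while the workload labels and the calculation itself are our illustration, not part of the paper's reported methodology:

```python
# Back-of-the-envelope: sustained power and latency implied by the
# throughput/energy figures quoted in the abstract.
workloads = {
    "SNN depth estimation":      (18e-6,  1020),    # 18 uJ/inf at 1.02 k inf/s
    "TNN object classification": (6e-6,   10_000),  # 6 uJ/inf at 10 k inf/s
    "obstacle avoidance":        (750e-6, 221),     # 750 uJ/inf at 221 frame/s
}

for name, (energy_j, rate_hz) in workloads.items():
    power_mw = energy_j * rate_hz * 1e3   # W -> mW
    latency_ms = 1e3 / rate_hz            # s -> ms per inference
    print(f"{name}: ~{power_mw:.1f} mW sustained, ~{latency_ms:.2f} ms/inf")
```

Under these figures, all three workloads stay in the tens-to-hundreds of milliwatts, which is consistent with the nano-UAV energy constraints the abstract emphasizes.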