Rüegg, T., Giordano, M., Polonelli, T., Benini, L., Magno, M. (2025). Nano VS: a Neural Perception Layer for Fully Onboard Visual Semantic Mapping on Tiny Robots. IEEE. https://doi.org/10.1109/ijcnn64981.2025.11228606
Nano VS: a Neural Perception Layer for Fully Onboard Visual Semantic Mapping on Tiny Robots
Polonelli, Tommaso; Benini, Luca; Magno, Michele
2025
Abstract
Achieving Simultaneous Localization and Mapping (SLAM) in an unfamiliar environment is a crucial challenge, especially for robots that rely on efficient on-device processing. While accurate mapping is achievable on high-end robotic systems, it still faces substantial challenges due to hardware and latency constraints, especially on smaller robots with limited power budgets. Although machine learning is proving highly effective for robot perception, there is a growing need for solutions that are lightweight in both computation and sensing. This paper presents Nano VS, a lightweight monocular perception layer supporting semantic mapping with fewer than 1 M parameters. We propose a family of quantized and efficient models integrating emerging attention layers and weight sharing in a multi-task neural network. Experimental results demonstrate multiple tasks within a single model, including Semantic Segmentation (SS), Feature Detection and Description (FDD), and Visual Place Recognition (VPR). Our findings indicate that multi-tasking effectively reduces computational overhead by eliminating the need for multiple networks. Nano VS achieves 70% classwise mIoU on the Cityscapes benchmark and 66% Recall@1 on the Pitts30k benchmark with tiny images (120×160 pixels). Finally, this paper implements and evaluates Nano VS on a novel milliwatt multi-core RISC-V Microcontroller (MCU), running the full semantic front-end in as little as 52 ms while consuming only 9 mJ per inference. This work represents a significant step towards making advanced SLAM capabilities accessible to tiny robots, and towards faster, more energy-efficient SLAM on high-end processors.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
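The reported latency (52 ms) and energy per inference (9 mJ) imply a sub-watt average power draw, consistent with the "milliwatt MCU" claim. A quick sanity check of that arithmetic (the derived average power is our own calculation, not a figure reported in the abstract):

```python
# Back-of-the-envelope check of the reported efficiency figures.
# Latency (52 ms) and energy per inference (9 mJ) are taken from the
# abstract; the average power and inferences-per-joule are derived.

latency_s = 52e-3   # full semantic front-end latency, in seconds
energy_j = 9e-3     # energy per inference, in joules

avg_power_w = energy_j / latency_s     # P = E / t
inferences_per_joule = 1.0 / energy_j  # how many inferences 1 J buys

print(f"average power: {avg_power_w * 1e3:.0f} mW")       # ~173 mW
print(f"inferences per joule: {inferences_per_joule:.0f}")
```

At roughly 173 mW during inference, the front-end sits comfortably in the power envelope of small battery-powered robots, which is the regime the paper targets.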



