Private Inference at the Extreme Edge: Joint Mixed Precision Quantization and Model Splitting in Multi-Hop IoT Networks

Trotta, Angelo; Esposito, Alfonso; Sciullo, Luca; Bononi, Luciano; Di Felice, Marco

2026

Abstract

Nowadays, many Internet of Things (IoT) systems rely on sensing units that offload data to remote cloud servers for analytics. While this approach provides the computational power required to execute complex Deep Learning (DL) tasks, it introduces privacy vulnerabilities and becomes unfeasible in scenarios with constrained network bandwidth. In this paper, we investigate the possibility of completely offloading DL inference tasks to the Extreme Edge (EE) of an IoT system, consisting of a multi-hop network of microcontrollers or low-power PCs. To this end, we explore the splitting of DL models across the physical topology, taking into account the heterogeneity of EE devices and the characteristics of wireless links. To balance the trade-off between model accuracy and resource limitations, we focus on mixed-precision quantization strategies that adjust the precision of each sub-model based on the hardware capabilities of the target devices. Beyond the optimization problem formulation, we propose a Genetic Algorithm (GA) that determines the best model allocation and in-network inference path within the multi-hop IoT network by jointly optimizing energy efficiency and latency. Experimental results on three widely adopted DNN architectures (MobileNetV2, ResNet50, and VGG16) demonstrate that the proposed GA achieves up to a 66% reduction in the fitness function compared to the baseline greedy algorithm.
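The abstract does not spell out the chromosome encoding or the fitness function, so the following is only a minimal sketch of how a genetic algorithm could jointly choose split points and per-device bit widths along a fixed multi-hop path. Every constant, the cost model (compute time scaled by bit width, a flat penalty for exceeding device memory, a per-segment accuracy penalty) and all function names are hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy data: none of these numbers come from the paper.
LAYER_MACS = [4.0, 8.0, 8.0, 6.0, 2.0, 1.0]    # compute cost per layer
LAYER_PARAMS = [2.0, 6.0, 6.0, 4.0, 2.0, 1.0]  # parameter count per layer
LAYER_OUT = [8.0, 4.0, 4.0, 2.0, 1.0, 0.5]     # activation size after each layer
DEVICE_SPEED = [1.0, 2.0, 4.0]                 # compute rate of each device on the path
DEVICE_MEM = [40.0, 80.0, 160.0]               # memory budget (params * bits)
DEVICE_POWER = [0.5, 1.0, 2.0]                 # energy per unit of compute time
LINK_RATE = [2.0, 2.0]                         # link i connects device i to device i+1
TX_ENERGY = 0.2                                # energy per transmitted unit
BITS = [4, 8, 16]                              # candidate precisions
ACC_PENALTY = {4: 3.0, 8: 0.5, 16: 0.0}        # assumed accuracy cost per segment
INPUT_SIZE, INPUT_BITS = 16.0, 8               # raw sample sensed at device 0
N_LAYERS, N_DEVICES = len(LAYER_MACS), len(DEVICE_SPEED)


def fitness(ind, w_latency=0.5, w_energy=0.5):
    """Weighted latency + energy plus penalties; lower is better."""
    cuts, bits = ind
    bounds = [0] + sorted(cuts) + [N_LAYERS]
    latency = energy = penalty = 0.0
    act, act_bits = INPUT_SIZE, INPUT_BITS
    for d in range(N_DEVICES):
        lo, hi = bounds[d], bounds[d + 1]
        if hi > lo:  # device d runs layers lo..hi-1 at precision bits[d]
            t = sum(LAYER_MACS[lo:hi]) * bits[d] / 8 / DEVICE_SPEED[d]
            latency += t
            energy += t * DEVICE_POWER[d]
            if sum(LAYER_PARAMS[lo:hi]) * bits[d] > DEVICE_MEM[d]:
                penalty += 100.0  # sub-model does not fit on the device
            penalty += ACC_PENALTY[bits[d]]
            act, act_bits = LAYER_OUT[hi - 1], bits[d]
        if hi < N_LAYERS:  # layers remain downstream: forward over link d
            payload = act * act_bits
            latency += payload / LINK_RATE[d]
            energy += payload * TX_ENERGY
    return w_latency * latency + w_energy * energy + penalty


def random_individual(rng):
    cuts = sorted(rng.randint(0, N_LAYERS) for _ in range(N_DEVICES - 1))
    bits = [rng.choice(BITS) for _ in range(N_DEVICES)]
    return cuts, bits


def crossover(a, b, rng):
    """Uniform crossover on both the cut points and the bit widths."""
    cuts = sorted(rng.choice(pair) for pair in zip(a[0], b[0]))
    bits = [rng.choice(pair) for pair in zip(a[1], b[1])]
    return cuts, bits


def mutate(ind, rng, rate=0.2):
    cuts = sorted(rng.randint(0, N_LAYERS) if rng.random() < rate else c
                  for c in ind[0])
    bits = [rng.choice(BITS) if rng.random() < rate else b for b in ind[1]]
    return cuts, bits


def run_ga(generations=40, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[:2]  # elitism: the best fitness never gets worse
        children = []
        while len(children) < pop_size - len(elite):
            p1 = min(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = min(rng.sample(pop, 3), key=fitness)
            children.append(mutate(crossover(p1, p2, rng), rng))
        pop = elite + children
    return min(pop, key=fitness)
```

Calling `run_ga()` returns a `(cuts, bits)` pair, where `cuts` marks the layer indices at which the model is handed to the next device and `bits` gives the precision used on each device. The weights `w_latency` and `w_energy` stand in for whatever balance between latency and energy a real deployment would require.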

Year	2026
Volume title	2026 IEEE 23rd Consumer Communications & Networking Conference (CCNC)
First page	1
Last page	6
Series	IEEE CONSUMER COMMUNICATIONS AND NETWORKING CONFERENCE
DOI	https://dx.doi.org/10.1109/ccnc65079.2026.11366617
Citation	Trotta, A., Esposito, A., Sciullo, L., Bononi, L., Di Felice, M. (2026). Private Inference at the Extreme Edge: Joint Mixed Precision Quantization and Model Splitting in Multi-Hop IoT Networks [10.1109/ccnc65079.2026.11366617].
All authors	Trotta, Angelo; Esposito, Alfonso; Sciullo, Luca; Bononi, Luciano; Di Felice, Marco
Appears in types	4.01 Contribution in conference proceedings

Files in this item:

File	Size	Format
CCNC_2026___Split-12.pdf (under embargo until 03/08/2027; Type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review; License: free open access license)	1.44 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1051471
