Mirsalari, S.A., Fariselli, M., Bijar, L., Paci, F., Benini, L., Tagliavini, G. (2025). Enabling Real-Time Streaming Temporal Convolution Network Inference on Ultra-Low-Power Microcontrollers. New York, USA: IEEE Computer Society. doi: 10.1109/isvlsi65124.2025.11130291
Enabling Real-Time Streaming Temporal Convolution Network Inference on Ultra-Low-Power Microcontrollers
Mirsalari, Seyed Ahmad; Benini, Luca; Tagliavini, Giuseppe
2025
Abstract
Real-time streaming applications play a pivotal role across diverse domains, including autonomous systems, speech processing, and bio-signal monitoring. Temporal Convolutional Networks (TCNs) effectively model sequences by capturing long-term dependencies, but real-time inference on ultra-low-power microcontrollers (MCUs) remains challenging due to high computational and memory requirements. This work presents a framework to optimize TCN inference for real-time streaming applications by introducing a multi-timestep approach combined with advanced quantization techniques. This solution enables dynamic adaptation of the streaming application by trading off latency against computational efficiency. Deploying a speech enhancement model (Conv-TasNet) on the GAP9 ultra-low-power MCU, we achieve a 2 ms inference time (33% of the real-time constraint of 6.25 ms), along with a 108.9× reduction in MAC operations and a 27.7× cycle reduction. Using four timesteps increases the MAC/cycle ratio to 3.3 while maintaining a 4.3 ms inference time, less than 18% of the extended real-time budget (25 ms). Combining INT8-BFP16 mixed-precision quantization with multi-timestep processing delivers a 4× memory saving at the same performance.
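To illustrate the kind of streaming TCN computation the abstract refers to, the sketch below shows a causal dilated 1-D convolution that processes a block of T new timesteps per call (the "multi-timestep" idea) while carrying a small buffer of past inputs between calls. This is a minimal NumPy illustration under assumed shapes and names (`streaming_dilated_conv`, `state`, etc. are hypothetical), not the paper's actual kernel or quantization scheme:

```python
import numpy as np

def streaming_dilated_conv(x_new, state, weights, dilation):
    """One streaming step of a causal dilated 1-D convolution.

    x_new:   (C_in, T) block of T new timesteps (multi-timestep processing)
    state:   (C_in, (K-1)*dilation) buffer of past inputs kept between calls
    weights: (C_out, C_in, K) convolution kernel
    Returns a (C_out, T) output block and the updated state buffer.
    """
    C_out, C_in, K = weights.shape
    T = x_new.shape[1]
    # Prepend the buffered history so the convolution stays causal across calls.
    ctx = np.concatenate([state, x_new], axis=1)
    y = np.zeros((C_out, T))
    for t in range(T):
        # Gather the K dilated taps ending at the current timestep.
        start = state.shape[1] + t - dilation * (K - 1)
        taps = ctx[:, start : state.shape[1] + t + 1 : dilation]  # (C_in, K)
        y[:, t] = np.einsum('oik,ik->o', weights, taps)
    # Keep only the most recent (K-1)*dilation samples for the next call.
    new_state = ctx[:, -(K - 1) * dilation:]
    return y, new_state
```

Processing several timesteps per call amortizes weight loads across a larger block, which is why the block size trades latency (larger blocks wait longer for input) against compute efficiency (higher MAC/cycle), mirroring the trade-off described above.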


