
Rutishauser, G., Scherer, M., Fischer, T., Benini, L. (2022). Ternarized TCN for µJ/Inference Gesture Recognition from DVS Event Frames. New York: IEEE [10.23919/DATE54114.2022.9774592].

Ternarized TCN for µJ/Inference Gesture Recognition from DVS Event Frames

Rutishauser, G; Scherer, M; Fischer, T; Benini, L
2022

Abstract

Dynamic Vision Sensors (DVS) offer the opportunity to scale the energy consumption in image acquisition proportionally to the activity in the captured scene by only transmitting data when the captured image changes. Their potential for energy-proportional sensing makes them highly attractive for severely energy-constrained sensing nodes at the edge. Most approaches to the processing of DVS data employ Spiking Neural Networks to classify the input from the sensor. In this paper, we propose an alternative, event frame-based approach to the classification of DVS video data. We assemble ternary video frames from the event stream and process them with a fully ternarized Temporal Convolutional Network which can be mapped to CUTIE, a highly energy-efficient Ternary Neural Network accelerator. The network mapped to the accelerator achieves a classification accuracy of 94.5%, matching the state of the art for embedded implementations. We implement the processing pipeline in a modern 22 nm FDX technology and perform post-synthesis power simulation of the network running on the system, achieving an inference energy of 1.7 µJ, which is 647× lower than previously reported results based on Spiking Neural Networks.
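As a rough illustration of the event-frame assembly described in the abstract, below is a minimal Python sketch, not the authors' implementation, of how a DVS event stream can be accumulated into ternary frames with values in {-1, 0, +1}. The event field layout, the 128x128 resolution, and the 10 ms accumulation window are illustrative assumptions; stacking such frames along time would form the input sequence for a downstream Temporal Convolutional Network.

import numpy as np

def events_to_ternary_frames(events, height=128, width=128, window_us=10_000):
    # Accumulate DVS events into ternary frames in {-1, 0, +1}.
    # `events` is an (N, 4) array of (timestamp_us, x, y, polarity) rows,
    # with polarity in {0, 1}; these field conventions are assumptions.
    events = np.asarray(events)
    if events.size == 0:
        return np.zeros((0, height, width), dtype=np.int8)

    t0 = events[:, 0].min()
    frame_idx = ((events[:, 0] - t0) // window_us).astype(int)
    frames = np.zeros((frame_idx.max() + 1, height, width), dtype=np.int8)

    for (t, x, y, p), f in zip(events, frame_idx):
        # The latest event at a pixel wins: +1 for ON events, -1 for OFF events.
        frames[f, int(y), int(x)] = 1 if p > 0 else -1
    return frames

# Usage with three synthetic events spanning two 10 ms windows.
demo_events = np.array([
    [    0, 10, 20, 1],   # ON event  -> +1 in frame 0
    [ 5000, 11, 20, 0],   # OFF event -> -1 in frame 0
    [12000, 10, 20, 1],   # ON event  -> +1 in frame 1
])
frames = events_to_ternary_frames(demo_events)
print(frames.shape)                          # (2, 128, 128)
print(frames[0, 20, 10], frames[0, 20, 11])  # 1 -1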
2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 736-741
Files in this record:
857_file_Paper.pdf (open access)
Type: Postprint
License: Creative Commons Attribution (CC BY)
Size: 338 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/905406
Citations
  • PMC: not available
  • Scopus: 3
  • Web of Science: 3