Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems

Bompani, L.; Rusci, M.; Palossi, D.; Conti, F.; Benini, L.

doi:10.1109/CVPRW63382.2024.00223

This paper introduces Multi-Resolution Rescored ByteTrack (MR2-ByteTrack), a novel video object detection framework for ultra-low-power embedded processors. This method reduces the average compute load of an off-the-shelf Deep Neural Network (DNN) based object detector by up to 2.25× by alternating the processing of high-resolution images (320 × 320 pixels) with multiple down-sized frames (192×192 pixels). To tackle the accuracy degradation due to the reduced image input size, MR2-ByteTrack correlates the output detections over time using the ByteTrack tracker and corrects potential misclassification using a novel probabilistic Rescore algorithm. By interleaving two down-sized images for every high-resolution one as the input of different state-of-the-art DNN object detectors with our MR2-ByteTrack, we demonstrate an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller compared to a baseline frame-by-frame inference scheme using exclusively full-resolution images. Code available at: https://github.com/Bomps4/Multi-Resolution-Rescored-ByteTrack
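The interleaving scheme described in the abstract can be illustrated with a short sketch. It is not taken from the authors' repository: the names `resolution_schedule` and `relative_cost` are hypothetical, and the cost model assumes that detector compute scales linearly with the number of input pixels, which is only an approximation for a real DNN on GAP9.

```python
from itertools import islice

HIGH_RES = 320  # side of the full-resolution input, in pixels (from the abstract)
LOW_RES = 192   # side of the down-sized input, in pixels (from the abstract)


def resolution_schedule(low_per_high):
    """Yield input sizes forever: one high-res frame, then `low_per_high` low-res frames."""
    while True:
        yield HIGH_RES
        for _ in range(low_per_high):
            yield LOW_RES


def relative_cost(low_per_high):
    """Average per-frame cost relative to an all-high-res baseline.

    Assumption: cost is proportional to pixel count, so one low-res frame
    costs (192 / 320) ** 2 = 0.36 of a high-res frame.
    """
    low_cost = (LOW_RES / HIGH_RES) ** 2
    return (1 + low_per_high * low_cost) / (1 + low_per_high)


# Two down-sized frames for every high-resolution one, as in the abstract.
first_six = list(islice(resolution_schedule(2), 6))
saving = 1 - relative_cost(2)

print(first_six)
print(f"estimated compute saving: {saving:.1%}")
```

Under this pixel-count assumption, the two-to-one schedule costs about 57% of the baseline per frame, a saving of roughly 43%, which is in line with the latency reduction reported in the abstract. The measured figures in the paper come from the actual detectors on GAP9 and need not follow this simple model exactly.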

Bompani, L., Rusci, M., Palossi, D., Conti, F., Benini, L. (2024). Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems. IEEE Computer Society [10.1109/CVPRW63382.2024.00223].



Year: 2024
Volume title: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
First page: 2182
Last page: 2190
Series: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
DOI: https://dx.doi.org/10.1109/CVPRW63382.2024.00223


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1004780
