This paper introduces Multi-Resolution Rescored ByteTrack (MR2-ByteTrack), a novel video object detection framework for ultra-low-power embedded processors. The method reduces the average compute load of an off-the-shelf Deep Neural Network (DNN) based object detector by up to 2.25× by alternating the processing of high-resolution images (320 × 320 pixels) with multiple down-sized frames (192 × 192 pixels). To counter the accuracy degradation caused by the reduced input size, MR2-ByteTrack correlates the output detections over time using the ByteTrack tracker and corrects potential misclassifications using a novel probabilistic Rescore algorithm. By interleaving two down-sized images for every high-resolution one as input to different state-of-the-art DNN object detectors, MR2-ByteTrack achieves an average accuracy increase of 2.16% and a latency reduction of 43% on the GAP9 microcontroller compared to a baseline frame-by-frame inference scheme that uses exclusively full-resolution images. Code available at: https://github.com/Bomps4/Multi-Resolution-Rescored-ByteTrack
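
The interleaving scheme described in the abstract can be illustrated with a rough sketch. It is not taken from the authors' repository: the function names `resolution_schedule` and `relative_compute` are hypothetical, and the cost model assumes that detector compute scales with the number of input pixels, which the abstract does not state.

```python
from itertools import cycle, islice

HIGH_RES = 320  # side length in pixels, from the abstract
LOW_RES = 192   # side length in pixels, from the abstract


def resolution_schedule(low_per_high):
    """Repeat one high-resolution frame followed by `low_per_high` down-sized frames."""
    return cycle([HIGH_RES] + [LOW_RES] * low_per_high)


def relative_compute(low_per_high):
    """Average per-frame cost relative to running every frame at high resolution.

    Assumption (not from the paper): cost is proportional to input pixel count.
    """
    low_cost = (LOW_RES / HIGH_RES) ** 2  # cost of one down-sized frame
    return (1 + low_per_high * low_cost) / (1 + low_per_high)


# Two down-sized frames per high-resolution frame, as in the reported experiment.
schedule = list(islice(resolution_schedule(2), 6))
saving = 1 - relative_compute(2)
print(schedule)
print(f"estimated compute saving: {saving:.1%}")
```

Under the pixel-count assumption, the 1:2 schedule gives an estimated saving of about 42.7%, which is in line with the 43% latency reduction reported for GAP9. Measured latency depends on the detector and the hardware, so this estimate is only indicative.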

Bompani, L., Rusci, M., Palossi, D., Conti, F., Benini, L. (2024). Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems. IEEE Computer Society [10.1109/CVPRW63382.2024.00223].

Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems

Bompani, L.; Rusci, M.; Palossi, D.; Conti, F.; Benini, L.
2024

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 2182-2190

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1004780