Training deep networks on light computational devices is nowadays very challenging. Continual learning techniques, where complex models are incrementally trained on small batches of new data, can make the learning problem tractable even for CPU-only edge devices. However, a number of practical problems need to be solved: catastrophic forgetting before anything else. In this paper we introduce an original technique named ``Latent Replay'' where, instead of storing a portion of past data in the input space, we store activations volumes at some intermediate layer. This can significantly reduce the computation and storage required by native rehearsal. To keep the representation stable and the stored activations valid we propose to slow-down learning at all the layers below the latent replay one, leaving the layers above free to learn at full pace. In our experiments we show that Latent Replay, combined with existing continual learning techniques, achieves state-of-the-art accuracy on a difficult benchmark such as CORe50 NICv2 with nearly 400 small and highly non-i.i.d. batches. Finally, we demonstrate the feasibility of nearly real-time continual learning on the edge through the porting of the proposed technique on a smartphone device.

Latent Replay for Real-Time Continual Learning

Lorenzo Pellegrini;Gabriele Graffieti;Vincenzo Lomonaco
;
Davide Maltoni
2019

Abstract

Training deep networks on light computational devices is nowadays very challenging. Continual learning techniques, where complex models are incrementally trained on small batches of new data, can make the learning problem tractable even for CPU-only edge devices. However, a number of practical problems need to be solved: catastrophic forgetting before anything else. In this paper we introduce an original technique named ``Latent Replay'' where, instead of storing a portion of past data in the input space, we store activations volumes at some intermediate layer. This can significantly reduce the computation and storage required by native rehearsal. To keep the representation stable and the stored activations valid we propose to slow-down learning at all the layers below the latent replay one, leaving the layers above free to learn at full pace. In our experiments we show that Latent Replay, combined with existing continual learning techniques, achieves state-of-the-art accuracy on a difficult benchmark such as CORe50 NICv2 with nearly 400 small and highly non-i.i.d. batches. Finally, we demonstrate the feasibility of nearly real-time continual learning on the edge through the porting of the proposed technique on a smartphone device.
Lorenzo Pellegrini; Gabriele Graffieti; Vincenzo Lomonaco; Davide Maltoni
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/713387
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact