Convolutional Neural Networks (CNNs) are extensively used in a wide range of applications, commonly including computer vision tasks like image and video classification, recognition and segmentation. Recent research results demonstrate that multi-layer (deep) network involving mono-dimensional convolutions and dilation can be effectively used in time series and sequences classification and segmentation, as well as in tasks involving sequence modeling. These structures, commonly referred to as Temporal Convolutional Networks (TCNs), represent an extremely promising alternative to recurrent architectures, commonly used across a broad range of sequence modeling tasks. While FPGA based inference accelerators for classic CNNs are widespread, literature is lacking in a quantitative evaluation of their usability on inference for TCN models. In this paper we present such an evaluation, considering a CNN accelerator with specific features supporting TCN kernels as a reference and a set of state-of-the-art TCNs as a benchmark. Experimental results show that, during TCN execution, operational intensity can be critical for the overall performance. We propose a convolution scheduling based on batch processing that can boost efficiency up to 96% of theoretical peak performance. Overall we can achieve up to 111,8 GOPS/s and a power efficiency of 33,8 GOPS/s/W on an Ultrascale+ ZU3EG (up to 10× speedup and 3× power efficiency improvement with respect to pure software implementation).

Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators / Carreras M.; Deriu G.; Raffo L.; Benini L.; Meloni P.. - In: IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS. - ISSN 2156-3357. - STAMPA. - 10:3(2020), pp. 9159637.348-9159637.361. [10.1109/JETCAS.2020.3014503]

Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators

Benini L.;
2020

Abstract

Convolutional Neural Networks (CNNs) are extensively used in a wide range of applications, commonly including computer vision tasks like image and video classification, recognition and segmentation. Recent research results demonstrate that multi-layer (deep) network involving mono-dimensional convolutions and dilation can be effectively used in time series and sequences classification and segmentation, as well as in tasks involving sequence modeling. These structures, commonly referred to as Temporal Convolutional Networks (TCNs), represent an extremely promising alternative to recurrent architectures, commonly used across a broad range of sequence modeling tasks. While FPGA based inference accelerators for classic CNNs are widespread, literature is lacking in a quantitative evaluation of their usability on inference for TCN models. In this paper we present such an evaluation, considering a CNN accelerator with specific features supporting TCN kernels as a reference and a set of state-of-the-art TCNs as a benchmark. Experimental results show that, during TCN execution, operational intensity can be critical for the overall performance. We propose a convolution scheduling based on batch processing that can boost efficiency up to 96% of theoretical peak performance. Overall we can achieve up to 111,8 GOPS/s and a power efficiency of 33,8 GOPS/s/W on an Ultrascale+ ZU3EG (up to 10× speedup and 3× power efficiency improvement with respect to pure software implementation).
2020
Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators / Carreras M.; Deriu G.; Raffo L.; Benini L.; Meloni P.. - In: IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS. - ISSN 2156-3357. - STAMPA. - 10:3(2020), pp. 9159637.348-9159637.361. [10.1109/JETCAS.2020.3014503]
Carreras M.; Deriu G.; Raffo L.; Benini L.; Meloni P.
File in questo prodotto:
File Dimensione Formato  
optimizing temporal convolutional post print.pdf

accesso aperto

Tipo: Postprint
Licenza: Licenza per accesso libero gratuito
Dimensione 1.31 MB
Formato Adobe PDF
1.31 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/791974
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 18
  • ???jsp.display-item.citation.isi??? 17
social impact