CRIS Current Research Information System

Convolutional Neural Networks (CNNs) are extensively used in a wide range of applications, commonly including computer vision tasks like image and video classification, recognition and segmentation. Recent research results demonstrate that multi-layer (deep) network involving mono-dimensional convolutions and dilation can be effectively used in time series and sequences classification and segmentation, as well as in tasks involving sequence modeling. These structures, commonly referred to as Temporal Convolutional Networks (TCNs), represent an extremely promising alternative to recurrent architectures, commonly used across a broad range of sequence modeling tasks. While FPGA based inference accelerators for classic CNNs are widespread, literature is lacking in a quantitative evaluation of their usability on inference for TCN models. In this paper we present such an evaluation, considering a CNN accelerator with specific features supporting TCN kernels as a reference and a set of state-of-the-art TCNs as a benchmark. Experimental results show that, during TCN execution, operational intensity can be critical for the overall performance. We propose a convolution scheduling based on batch processing that can boost efficiency up to 96% of theoretical peak performance. Overall we can achieve up to 111,8 GOPS/s and a power efficiency of 33,8 GOPS/s/W on an Ultrascale+ ZU3EG (up to 10× speedup and 3× power efficiency improvement with respect to pure software implementation).

Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators / Carreras M.; Deriu G.; Raffo L.; Benini L.; Meloni P.. - In: IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS. - ISSN 2156-3357. - STAMPA. - 10:3(2020), pp. 9159637.348-9159637.361. [10.1109/JETCAS.2020.3014503]

Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators

Carreras M.;Deriu G.;Raffo L.;Benini L.;Meloni P.

2020

Abstract

Convolutional Neural Networks (CNNs) are extensively used in a wide range of applications, commonly including computer vision tasks like image and video classification, recognition and segmentation. Recent research results demonstrate that multi-layer (deep) network involving mono-dimensional convolutions and dilation can be effectively used in time series and sequences classification and segmentation, as well as in tasks involving sequence modeling. These structures, commonly referred to as Temporal Convolutional Networks (TCNs), represent an extremely promising alternative to recurrent architectures, commonly used across a broad range of sequence modeling tasks. While FPGA based inference accelerators for classic CNNs are widespread, literature is lacking in a quantitative evaluation of their usability on inference for TCN models. In this paper we present such an evaluation, considering a CNN accelerator with specific features supporting TCN kernels as a reference and a set of state-of-the-art TCNs as a benchmark. Experimental results show that, during TCN execution, operational intensity can be critical for the overall performance. We propose a convolution scheduling based on batch processing that can boost efficiency up to 96% of theoretical peak performance. Overall we can achieve up to 111,8 GOPS/s and a power efficiency of 33,8 GOPS/s/W on an Ultrascale+ ZU3EG (up to 10× speedup and 3× power efficiency improvement with respect to pure software implementation).

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
			2020
		
	Rivista
	
			IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS
		
	Codice DOI
	
			https://dx.doi.org/10.1109/JETCAS.2020.3014503
		
	Citazione
	
			Optimizing Temporal Convolutional Network Inference on FPGA-Based Accelerators / Carreras M.; Deriu G.; Raffo L.; Benini L.; Meloni P.. - In: IEEE JOURNAL OF EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS. - ISSN 2156-3357. - STAMPA. - 10:3(2020), pp. 9159637.348-9159637.361. [10.1109/JETCAS.2020.3014503]
		
	Tutti gli autori
	
			Carreras M.; Deriu G.; Raffo L.; Benini L.; Meloni P.
		
	Appare nelle tipologie:
	
			1.01 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
optimizing temporal convolutional post print.pdf accesso aperto Tipo: Postprint Licenza: Licenza per accesso libero gratuito Dimensione 1.31 MB Formato Adobe PDF Visualizza/Apri	1.31 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/791974

Citazioni

ND

18

17

social impact