Modern embedded systems for computer vision and image processing exploit hardware acceleration inside scalable parallel architectures, such as tightly-coupled clusters, to meet stringent performance and energy-efficiency targets. Architectural heterogeneity makes software development cumbersome, so shared-memory processor-to-accelerator communication is typically preferred, because it simplifies offloading critical computational kernels to HW IPs. However, tightly coupling a large number of accelerators and processors in a shared-memory cluster is challenging, since the complexity of the resulting system quickly becomes too large. We tackle these issues by proposing a template for a heterogeneous shared-memory cluster that scales to a large number of accelerators, achieving up to 40% better performance/area/watt than simply designing larger main interconnects to accommodate several HW IPs. In addition, following a trend towards standardizing the acceleration capabilities of future embedded systems, we develop a programming model that simplifies application development for heterogeneous clusters.

Architecture and programming model support for efficient heterogeneous computing on tightly-coupled shared-memory clusters / Burgio P.; Marongiu A.; Danilo R.; Coussy P.; Benini L. - Print. - (2013), pp. 22-29. (Paper presented at the conference Design and Architectures for Signal and Image Processing (DASIP), held in Cagliari, Italy, 8-10 Oct. 2013).

Architecture and programming model support for efficient heterogeneous computing on tightly-coupled shared-memory clusters

BURGIO, PAOLO;MARONGIU, ANDREA;BENINI, LUCA
2013

Design and Architectures for Signal and Image Processing (DASIP), 2013 Conference on, pp. 22-29
Burgio P.; Marongiu A.; Danilo R.; Coussy P.; Benini L.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/388145
