The ever-increasing need for computational power in embedded devices has led to the adoption heterogeneous SoCs combining a general purpose CPU with a data parallel accelerator. These systems rely on a shared main memory (DRAM), which makes them highly susceptible to memory interference. A promising software technique to counter such effects is the Predictable Execution Model (PREM). PREM ensures robustness to interference by separating programs into a sequence of memory and compute phases, and by enforcing a platform-level schedule where only a single processing subsystem is permitted to execute a memory phase at a time. This article demonstrates for the first time how PREM can be applied to heterogeneous SoCs, based on a synchronization technique for memory isolation between CPU and GPU plus a compiler to transform GPU kernels into PREM-compliant codes. For compute bound GPU workloads sharing the DRAM bandwidth 50/50 with the CPU we guarantee near-zero timing varibility at a performance loss of just 59 percent, which is one to two orders of magnitude smaller than the worst case we see for unmodified programs under memory interference.

HePREM: A Predictable Execution Model for GPU-based Heterogeneous SoCs / Forsberg B.; Benini L.; Marongiu A.. - In: IEEE TRANSACTIONS ON COMPUTERS. - ISSN 0018-9340. - ELETTRONICO. - 70:1(2021), pp. 9035630.17-9035630.29. [10.1109/TC.2020.2980520]

HePREM: A Predictable Execution Model for GPU-based Heterogeneous SoCs

Benini L.;
2021

Abstract

The ever-increasing need for computational power in embedded devices has led to the adoption heterogeneous SoCs combining a general purpose CPU with a data parallel accelerator. These systems rely on a shared main memory (DRAM), which makes them highly susceptible to memory interference. A promising software technique to counter such effects is the Predictable Execution Model (PREM). PREM ensures robustness to interference by separating programs into a sequence of memory and compute phases, and by enforcing a platform-level schedule where only a single processing subsystem is permitted to execute a memory phase at a time. This article demonstrates for the first time how PREM can be applied to heterogeneous SoCs, based on a synchronization technique for memory isolation between CPU and GPU plus a compiler to transform GPU kernels into PREM-compliant codes. For compute bound GPU workloads sharing the DRAM bandwidth 50/50 with the CPU we guarantee near-zero timing varibility at a performance loss of just 59 percent, which is one to two orders of magnitude smaller than the worst case we see for unmodified programs under memory interference.
2021
HePREM: A Predictable Execution Model for GPU-based Heterogeneous SoCs / Forsberg B.; Benini L.; Marongiu A.. - In: IEEE TRANSACTIONS ON COMPUTERS. - ISSN 0018-9340. - ELETTRONICO. - 70:1(2021), pp. 9035630.17-9035630.29. [10.1109/TC.2020.2980520]
Forsberg B.; Benini L.; Marongiu A.
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/792001
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 7
  • ???jsp.display-item.citation.isi??? 8
social impact