Manufacturing and environmental variations cause timing errors that are typically avoided by conservative design guardbands or corrected by circuit level error detection and correction. These measures incur energy and performance penalties. This paper considers methods to reduce this cost by expanding the scope of variability mitigation through the software stack. In particular, we propose workload deployment methods that reduce the likelihood of timing errors in shared memory clusters of processor cores. This and other methods are incorporated in a runtime layer in the OpenMP framework that enables parsimonious countermeasures against timing errors induced by hardware variability. The runtime system 'introspectively' monitors the costs of tasks execution on various cores and transparently associates descrIPtive metadata with the tasks. By utilizing the characterized metadata, we propose several policies that enhance the cluster choices for scheduling tasks to cores according to measured hardware variability and system workload. We devise efficient task scheduling strategies for simultaneous management of variability and workload by exploiting centralized and distributed approaches to workload distribution. Both schedulers surpass current state-of-the-art approaches; the distributed (or the centralized) achieves on average 30% (or 17%) energy, and 17% (4%) performance improvement.

Task scheduling strategies to mitigate hardware variability in embedded shared memory clusters

CESARINI, DANIELE;MARONGIU, ANDREA;BENINI, LUCA
2015

Abstract

Manufacturing and environmental variations cause timing errors that are typically avoided by conservative design guardbands or corrected by circuit level error detection and correction. These measures incur energy and performance penalties. This paper considers methods to reduce this cost by expanding the scope of variability mitigation through the software stack. In particular, we propose workload deployment methods that reduce the likelihood of timing errors in shared memory clusters of processor cores. This and other methods are incorporated in a runtime layer in the OpenMP framework that enables parsimonious countermeasures against timing errors induced by hardware variability. The runtime system 'introspectively' monitors the costs of tasks execution on various cores and transparently associates descrIPtive metadata with the tasks. By utilizing the characterized metadata, we propose several policies that enhance the cluster choices for scheduling tasks to cores according to measured hardware variability and system workload. We devise efficient task scheduling strategies for simultaneous management of variability and workload by exploiting centralized and distributed approaches to workload distribution. Both schedulers surpass current state-of-the-art approaches; the distributed (or the centralized) achieves on average 30% (or 17%) energy, and 17% (4%) performance improvement.
Proceedings - Design Automation Conference
1
6
Rahimi, Abbas; Cesarini, Daniele; Marongiu, Andrea; Gupta, Rajesh K.; Benini, Luca
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/545251
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 9
  • ???jsp.display-item.citation.isi??? 5
social impact