Several many-core designs tackle scalability issues by leveraging tightly-coupled clusters as building blocks, where low-latency, high-bandwidth interconnection between a small/medium number of cores and L1 memory achieves high performance/watt. Tight coupling of hardware accelerators into these multicore clusters constitutes a promising approach to further improve performance/area/watt. However, accelerators are often clocked at a lower frequency than processor clusters for energy efficiency reasons. In this paper, we propose a technique to integrate shared-memory accelerators within the tightly-coupled clusters of the STMicroelectronics STHORM architecture. Our methodology significantly relaxes timing constraints for tightly-coupled accelerators, while optimizing data bandwidth. In addition, our technique allows to operate the accelerator at an integer submultiple of the cluster frequency. Experimental results show that the proposed approach allows to recover up to 84% of the slow-down implied by reduced accelerator speed.

Francesco Conti, Andrea Marongiu, Luca Benini (2013). Synthesis-friendly techniques for tightly-coupled integration of hardware accelerators into shared-memory multi-core clusters. 2013 IEEE [10.1109/CODES-ISSS.2013.6658992].

Synthesis-friendly techniques for tightly-coupled integration of hardware accelerators into shared-memory multi-core clusters

CONTI, FRANCESCO;MARONGIU, ANDREA;BENINI, LUCA
2013

Abstract

Several many-core designs tackle scalability issues by leveraging tightly-coupled clusters as building blocks, where low-latency, high-bandwidth interconnection between a small/medium number of cores and L1 memory achieves high performance/watt. Tight coupling of hardware accelerators into these multicore clusters constitutes a promising approach to further improve performance/area/watt. However, accelerators are often clocked at a lower frequency than processor clusters for energy efficiency reasons. In this paper, we propose a technique to integrate shared-memory accelerators within the tightly-coupled clusters of the STMicroelectronics STHORM architecture. Our methodology significantly relaxes timing constraints for tightly-coupled accelerators, while optimizing data bandwidth. In addition, our technique allows to operate the accelerator at an integer submultiple of the cluster frequency. Experimental results show that the proposed approach allows to recover up to 84% of the slow-down implied by reduced accelerator speed.
2013
2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
1
10
Francesco Conti, Andrea Marongiu, Luca Benini (2013). Synthesis-friendly techniques for tightly-coupled integration of hardware accelerators into shared-memory multi-core clusters. 2013 IEEE [10.1109/CODES-ISSS.2013.6658992].
Francesco Conti;Andrea Marongiu;Luca Benini
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/307320
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 12
  • ???jsp.display-item.citation.isi??? ND
social impact