The end of Dennard scaling in advanced technologies brought about new architectural templates to overcome the so-called utilization wall and provide Moore’s Law-like performance and energy scaling in embedded SoCs. One of the most promising templates, architectural heterogeneity, is hindered by high cost due to the design space explosion and the lack of effective exploration tools. Our work provides three contributions towards a scalable and effective methodology for design space exploration in embedded MC-SoCs. First, we present the He-P2012 architecture, which augments the state-of-the-art STMicroelectronics P2012 platform with heterogeneous shared-L1 coprocessors called HW processing elements (HWPEs). Second, we propose a novel methodology for the semi-automatic definition and instantiation of shared-memory HWPEs from a C source, supporting both simple and structured data types. Third, we demonstrate that the integration of HWPEs can provide significant performance and energy efficiency benefits on a set of benchmarks originally developed for the homogeneous P2012, achieving up to 123x speedup on the accelerated code region (∼98% of Amdahl’s law limit) while saving two-thirds of the energy.
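As a reading aid for the Amdahl's law figure quoted in the abstract, the following minimal Python sketch evaluates the textbook formula: if a fraction p of the runtime is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s), and it can never exceed 1 / (1 - p). The function names and the value p = 0.9 are illustrative assumptions, not taken from the paper; only the 123x region speedup comes from the abstract, and the sketch does not attempt to reproduce the paper's ∼98% figure.

```python
import math


def amdahl_speedup(fraction: float, region_speedup: float) -> float:
    """Overall speedup when `fraction` of the runtime is sped up by `region_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if region_speedup <= 0.0:
        raise ValueError("region_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / region_speedup)


def amdahl_limit(fraction: float) -> float:
    """Upper bound on the overall speedup as the region speedup grows without bound."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if fraction == 1.0:
        return math.inf
    return 1.0 / (1.0 - fraction)


if __name__ == "__main__":
    p = 0.9    # hypothetical accelerated fraction of runtime (NOT from the paper)
    s = 123.0  # region speedup quoted in the abstract
    overall = amdahl_speedup(p, s)
    limit = amdahl_limit(p)
    print(f"overall speedup:   {overall:.2f}x")
    print(f"Amdahl limit:      {limit:.2f}x")
    print(f"fraction of limit: {overall / limit:.1%}")
```

The share of the limit that is reached depends strongly on the assumed p: the larger the accelerated fraction, the more the unaccelerated remainder dominates and the harder it is to approach the bound.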

He-P2012: Performance and Energy Exploration of Architecturally Heterogeneous Many-Cores / Conti, Francesco; Marongiu, Andrea; Pilkington, Chuck; Benini, Luca. - In: JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL, IMAGE, AND VIDEO TECHNOLOGY. - ISSN 1939-8018. - ELECTRONIC. - 85:3(2016), pp. 325-340. [10.1007/s11265-015-1056-7]

He-P2012: Performance and Energy Exploration of Architecturally Heterogeneous Many-Cores

CONTI, FRANCESCO;MARONGIU, ANDREA;BENINI, LUCA
2016


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/534032