

On the effectiveness of OpenMP teams for cluster-based many-core accelerators

Capotondi, Alessandro; Marongiu, Andrea

2016

Abstract

With the introduction of more powerful and massively parallel embedded processors, embedded systems are becoming HPC-capable. Heterogeneous systems-on-chip (SoCs) that couple a general-purpose host processor to a many-core accelerator are becoming increasingly widespread and provide tremendous peak performance per watt, making them well suited to executing HPC-class programs. This increased computational potential, however, comes at the cost of ease of programming. Application developers must manually outline the code parts suitable for acceleration, parallelize them efficiently over the many available cores, and orchestrate data transfers to and from the accelerator. In addition, since most many-cores are organized as a collection of clusters, featuring fast local communication but slow remote communication (i.e., to another cluster's local memory), the programmer must also map the parallel computation carefully to avoid poor data locality. OpenMP v4.0 introduces new constructs for computation offloading, as well as directives to deploy parallel computation in a cluster-aware manner. In this paper we assess the effectiveness of OpenMP v4.0 at exploiting the massive parallelism available in embedded heterogeneous SoCs, comparing it to standard parallel loops over several computation-intensive applications from the linear algebra and image processing domains.
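To make the contrast in the abstract concrete, the following is a minimal sketch, not code from the paper. It shows the same loop offloaded twice: once as a standard parallel loop, and once with the OpenMP v4.0 `teams` and `distribute` constructs. The kernel and the function names `scale_add_flat` and `scale_add_teams` are hypothetical, chosen only for illustration. Whether each team ends up on one cluster of the accelerator is decided by the OpenMP implementation and is assumed here, not guaranteed.

```c
/* Baseline: offload the loop and run it as one flat parallel loop.
 * All threads belong to a single team, so the runtime has no hint
 * about how the iterations relate to the accelerator's clusters. */
void scale_add_flat(int n, float a, const float *x, float *y)
{
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Cluster-aware variant: "teams" creates a league of thread teams,
 * "distribute" splits the iterations across the teams, and
 * "parallel for" splits each team's chunk among its threads.
 * Assumption: the implementation places each team on one cluster. */
void scale_add_teams(int n, float a, const float *x, float *y)
{
    #pragma omp target teams distribute parallel for \
            map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result. Without OpenMP support the pragmas are ignored and the loops run sequentially on the host, and with OpenMP but no offload device the target regions typically fall back to host execution.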


Year	2016
Volume title	2016 International Conference on High Performance Computing and Simulation, HPCS 2016
First page	667
Last page	674
DOI	https://dx.doi.org/10.1109/HPCSim.2016.7568399
Citation	Capotondi, A., Marongiu, A. (2016). On the effectiveness of OpenMP teams for cluster-based many-core accelerators. Institute of Electrical and Electronics Engineers Inc. [10.1109/HPCSim.2016.7568399].
All authors	Capotondi, Alessandro; Marongiu, Andrea
Appears in types	4.01 Contribution in conference proceedings

Files in this item:

File	Size	Format
capotondi_HPCS16.pdf (open access; Description: Postprint paper; Type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review; License: free open access license)	341.2 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/575144
