Hardware-Accelerated Energy-Efficient Synchronization and Communication for Ultra-Low-Power Tightly Coupled Clusters

Glaser, F.; Haugou, G.; Rossi, D.; Huang, Q.; Benini, L.

doi:10.23919/DATE.2019.8715266

Parallel ultra low power computing is emerging as an enabler to meet the growing performance and energy efficiency demands in deeply embedded systems such as the end-nodes of the internet-of-things (IoT). The parallel nature of these systems however adds a significant degree of complexity as processing elements (PEs) need to communicate in various ways to organize and synchronize execution. Naive implementations of these central and non-trivial mechanisms can quickly jeopardize overall system performance and limit the achievable speedup and energy efficiency. To avoid this bottleneck, we present an event-based solution centered around a technology-independent, light-weight and scalable (up to 16 cores) synchronization and communication unit (SCU) and its integration into a shared-memory multicore cluster. Careful design and tight coupling of the SCU to the data interfaces of the cores allows to execute common synchronization procedures with a single instruction. Furthermore, we present hardware support for the common barrier and lock synchronization primitives with a barrier latency of only eleven cycles, independent of the number of involved cores. We demonstrate the efficiency of the solution based on experiments with a post-layout implementation of the multicore cluster in a 22 nm CMOS process where the SCU constitutes less than 2 % of area overhead. Our solution supports parallel sections as small as 100 or 72 cycles with a synchronization overhead of just 10 %, an improvement of up to 14× or 30× with respect to cycle count or energy, respectively, compared to a test-and-set based implementation.

Glaser F., Haugou G., Rossi D., Huang Q., Benini L. (2019). Hardware-Accelerated Energy-Efficient Synchronization and Communication for Ultra-Low-Power Tightly Coupled Clusters. Institute of Electrical and Electronics Engineers Inc. [10.23919/DATE.2019.8715266].

Hardware-Accelerated Energy-Efficient Synchronization and Communication for Ultra-Low-Power Tightly Coupled Clusters

Glaser F.;Haugou G.;Rossi D.;Huang Q.;Benini L.

2019

Abstract

Parallel ultra low power computing is emerging as an enabler to meet the growing performance and energy efficiency demands in deeply embedded systems such as the end-nodes of the internet-of-things (IoT). The parallel nature of these systems however adds a significant degree of complexity as processing elements (PEs) need to communicate in various ways to organize and synchronize execution. Naive implementations of these central and non-trivial mechanisms can quickly jeopardize overall system performance and limit the achievable speedup and energy efficiency. To avoid this bottleneck, we present an event-based solution centered around a technology-independent, light-weight and scalable (up to 16 cores) synchronization and communication unit (SCU) and its integration into a shared-memory multicore cluster. Careful design and tight coupling of the SCU to the data interfaces of the cores allows to execute common synchronization procedures with a single instruction. Furthermore, we present hardware support for the common barrier and lock synchronization primitives with a barrier latency of only eleven cycles, independent of the number of involved cores. We demonstrate the efficiency of the solution based on experiments with a post-layout implementation of the multicore cluster in a 22 nm CMOS process where the SCU constitutes less than 2 % of area overhead. Our solution supports parallel sections as small as 100 or 72 cycles with a synchronization overhead of just 10 %, an improvement of up to 14× or 30× with respect to cycle count or energy, respectively, compared to a test-and-set based implementation.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2019
			
	Titolo del volume
	
				Proceedings of the 2019 Design, Automation and Test in Europe Conference and Exhibition, DATE 2019
			
	Pagina iniziale
	
				552
			
	Pagina finale
	
				557
			
	Codice DOI
	
				https://dx.doi.org/10.23919/DATE.2019.8715266
			
	Citazione
	
				Glaser F.,  Haugou G.,  Rossi D.,  Huang Q.,  Benini L. (2019). Hardware-Accelerated Energy-Efficient Synchronization and Communication for Ultra-Low-Power Tightly Coupled Clusters. Institute of Electrical and Electronics Engineers Inc. [10.23919/DATE.2019.8715266].
			
	Tutti gli autori
	
						Glaser F.; Haugou G.; Rossi D.; Huang Q.; Benini L.
					
	Appare nelle tipologie:
	
				4.01 Contributo in Atti di convegno

File in questo prodotto:

Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/729710

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

5

5

ND

CRIS Current Research Information System