


On-Demand Redundancy Grouping: Selectable Soft-Error Tolerance for a Multicore Cluster

Rogenmoser, M.; Wistoff, N.; Vogel, P.; Gurkaynak, F.; Benini, L.

2022

Abstract

With the shrinking of technology nodes and the use of parallel processor clusters in hostile and critical environments, such as space, run-time faults caused by radiation are a serious cross-cutting concern, also impacting architectural design. This paper introduces an architectural approach to run-time configurable soft-error tolerance at the core level, augmenting a six-core open-source RISC-V cluster with a novel On-Demand Redundancy Grouping (ODRG) scheme. ODRG allows the cluster to operate either as two fault-tolerant cores, or as six individual cores for high performance, with limited overhead to switch between these modes at run time. The ODRG unit adds less than 11% of a core's area for a three-core group, or a total of 1% of the cluster area, and shows negligible timing increase, which compares favorably to a commercial state-of-the-art implementation, and is 2.5× faster in fault recovery re-synchronization. Furthermore, when redundancy is not necessary, the ODRG approach allows the redundant cores to be used for independent computation, allowing up to 2.96× increase in performance for selected applications.
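
The two operating modes described in the abstract can be illustrated with a small software sketch. It is not the authors' hardware design. It assumes that each three-core group is protected by a bitwise 2-of-3 majority vote, which the abstract does not spell out, and that groups are formed from consecutive core IDs. The names `logical_cores`, `majority_vote` and `faulty_members` are hypothetical.

```python
from typing import List, Tuple

NUM_CORES = 6   # cluster size stated in the abstract
GROUP_SIZE = 3  # three-core redundancy group stated in the abstract


def logical_cores(redundant: bool) -> List[Tuple[int, ...]]:
    """Return the physical core IDs backing each logical core.

    Assumption: groups are built from consecutive core IDs.
    """
    if redundant:
        # Fault-tolerant mode: six physical cores act as two logical cores.
        return [tuple(range(start, start + GROUP_SIZE))
                for start in range(0, NUM_CORES, GROUP_SIZE)]
    # Independent mode: every physical core is its own logical core.
    return [(core,) for core in range(NUM_CORES)]


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three core outputs (assumed voting scheme)."""
    return (a & b) | (a & c) | (b & c)


def faulty_members(outputs: Tuple[int, int, int]) -> List[int]:
    """Return positions in the group whose output differs from the voted value."""
    voted = majority_vote(*outputs)
    return [pos for pos, out in enumerate(outputs) if out != voted]
```

In this sketch, a group whose members report `(0b1010, 0b1010, 0b1110)` still yields the voted value `0b1010`, and position 2 is flagged as the member that would need re-synchronization.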


Year: 2022
Volume title: 2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)
First page: 398
Last page: 401
DOI: https://dx.doi.org/10.1109/ISVLSI54635.2022.00089
Citation: Rogenmoser, M., Wistoff, N., Vogel, P., Gurkaynak, F., Benini, L. (2022). On-Demand Redundancy Grouping: Selectable Soft-Error Tolerance for a Multicore Cluster [10.1109/ISVLSI54635.2022.00089].
All authors: Rogenmoser, M.; Wistoff, N.; Vogel, P.; Gurkaynak, F.; Benini, L.
Appears in types: 4.01 Contribution in conference proceedings

Files in this item:

File: ISVLSI_On_Demand_Redundancy_Grouping_Final.pdf
Access: open access
Type: Postprint / Author's Accepted Manuscript (AAM) - version accepted for publication after peer review
License: Open Access License. Creative Commons Attribution (CC BY)
Size: 180.48 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/907557
