A 0.80pJ/flop, 1.24Tflop/sW 8-to-64 bit Transprecision Floating-Point Unit for a 64 bit RISC-V Processor in 22nm FD-SOI

Mach S.; Schuiki F.; Zaruba F.; Benini L.

2019

Abstract

The crisis of Moore's law and the new dominance of Machine Learning workloads require a paradigm shift towards finely tunable-precision (a.k.a. transprecision) computing. More specifically, we need floating-point circuits that are capable of operating on many formats with high flexibility. We present the first silicon implementation of a 64-bit transprecision floating-point unit. It fully supports the standard double, single, and half precision formats, alongside custom bfloat and 8-bit formats. Operations occur on scalars or on 2-, 4-, or 8-way SIMD vectors. We have integrated the 247 kGE unit into a 64-bit application-class RISC-V processor core, where the added transprecision support accounts for an energy and area overhead of merely 11% and 9%, respectively, while achieving speedups and per-datum energy gains of 7.3x and 7.94x. We implemented the design in a 22 nm FD-SOI technology. Across formats, the unit achieves energy efficiencies between 75 Gflop/sW and 1.24 Tflop/sW and a performance between 1.85 Gflop/s and 14.83 Gflop/s.
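The figures quoted in the abstract can be cross-checked with some simple arithmetic. The sketch below is illustrative only and is not taken from the paper: the function and variable names are hypothetical, and the 16-bit width for bfloat is an assumption (the abstract names the format but not its width). It derives how many elements of each format fit into a 64-bit word and converts the quoted efficiencies (flop/s per watt, i.e. flop per joule) into energy per flop.

```python
# Illustrative arithmetic only; names here are hypothetical, not from the paper.
DATAPATH_BITS = 64

# Format widths in bits. ASSUMPTION: bfloat is 16 bits wide (bfloat16);
# the abstract does not state its width.
FORMAT_BITS = {"double": 64, "single": 32, "half": 16, "bfloat": 16, "8-bit": 8}


def simd_lanes(width_bits: int, datapath_bits: int = DATAPATH_BITS) -> int:
    """Number of elements of the given width that fit in one datapath word."""
    return datapath_bits // width_bits


def picojoules_per_flop(flops_per_second_per_watt: float) -> float:
    """Convert an efficiency in flop/s/W (flop per joule) to pJ per flop."""
    return 1e12 / flops_per_second_per_watt


lanes = {name: simd_lanes(bits) for name, bits in FORMAT_BITS.items()}
best_pj = picojoules_per_flop(1.24e12)  # 1.24 Tflop/sW from the abstract
worst_pj = picojoules_per_flop(75e9)    # 75 Gflop/sW from the abstract
perf_ratio = 14.83 / 1.85               # highest vs. lowest quoted throughput

if __name__ == "__main__":
    print(lanes)
    print(f"{best_pj:.2f} pJ/flop, {worst_pj:.2f} pJ/flop, ratio {perf_ratio:.2f}")
```

Under these assumptions the lane counts come out as 1, 2, 4, and 8, matching the scalar and 2-, 4-, or 8-way SIMD modes in the abstract. The best-case efficiency corresponds to roughly 0.8 pJ per flop, close to the figure in the title, and the throughput range spans a factor of about 8, which is consistent with the widest SIMD mode processing eight elements per operation.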

Year: 2019
Volume title: IEEE/IFIP International Conference on VLSI and System-on-Chip, VLSI-SoC
First page: 95
Last page: 98
DOI: https://dx.doi.org/10.1109/VLSI-SoC.2019.8920307
Citation: Mach S., Schuiki F., Zaruba F., Benini L. (2019). A 0.80pJ/flop, 1.24Tflop/sW 8-to-64 bit Transprecision Floating-Point Unit for a 64 bit RISC-V Processor in 22nm FD-SOI. IEEE Computer Society [10.1109/VLSI-SoC.2019.8920307].
All authors: Mach S.; Schuiki F.; Zaruba F.; Benini L.
Appears in types: 4.01 Contribution in conference proceedings

Files in this item:

File	Size	Format
A 0.80 pJ-flop, 1.24 Tflop-sW 8-to-64 bit_preprint.pdf (open access; type: publisher's version (PDF) / Version of Record; license: free open-access license)	1 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/730316
