Belano, A., Tortorella, Y., Garofalo, A., Benini, L., Rossi, D., & Conti, F. (2025). SoftEx: A Low Power and Flexible Softmax Accelerator with Fast Approximate Exponentiation. Institute of Electrical and Electronics Engineers Inc. doi:10.23919/date64628.2025.10993043
SoftEx: A Low Power and Flexible Softmax Accelerator with Fast Approximate Exponentiation
Belano, Andrea; Tortorella, Yvan; Garofalo, Angelo; Benini, Luca; Rossi, Davide; Conti, Francesco
2025
Abstract
Transformer-based models excel in NLP, vision, and audio processing, but the softmax operator can be a performance bottleneck, especially alongside optimized matrix-multiplication hardware. We introduce SoftEx, a parametric accelerator for BF16 softmax that uses approximate exponentiation (<0.14% relative error) to speed up softmax computation. Integrated into a 12 nm octa-core RISC-V cluster together with a matrix-multiplication systolic array, SoftEx reduces the time and energy of attention probability computation by up to 10.8x and 26.8x, boosting MobileBERT throughput by 2.17x to 324 GOPS, or 1.30 TOPS/W.
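The abstract does not detail SoftEx's exponentiation algorithm, but one well-known family of fast, hardware-friendly exp approximations works by writing a scaled input directly into the exponent field of a floating-point number (Schraudolph's method). The sketch below is an illustrative stand-in only, not the paper's algorithm: the names fast_exp and softmax_approx are hypothetical, and this basic variant has a larger relative error than SoftEx's reported 0.14%. It shows where an approximate exponential slots into a numerically stable softmax.

import numpy as np

def fast_exp(x: np.ndarray) -> np.ndarray:
    # Schraudolph-style approximation: write a*x + b into the int32 view
    # of a float32, so the exponent field encodes x / ln(2).
    a = (1 << 23) / np.log(2.0)        # scales x into the exponent field
    b = 127.0 * (1 << 23)              # float32 exponent bias, pre-shifted
    i = (a * x + b).astype(np.int32)   # raw float32 bit pattern
    return i.view(np.float32)

def softmax_approx(scores: np.ndarray) -> np.ndarray:
    # Numerically stable softmax built on the approximate exponential.
    x = scores.astype(np.float32)
    x = x - x.max(axis=-1, keepdims=True)   # max subtraction: all x <= 0
    x = np.maximum(x, -87.0)                # keep a*x + b in float32 range
    e = fast_exp(x)
    return e / e.sum(axis=-1, keepdims=True)

row = np.array([1.0, 2.0, 3.0])
print(softmax_approx(row))   # close to the exact softmax [0.090, 0.245, 0.665]

The appeal of this style of approximation for an accelerator is that it replaces a transcendental function with a multiply, an add, and a bit-level reinterpretation, all cheap in fixed hardware; higher accuracy (as in SoftEx) requires additional correction of the mantissa.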


