SoftEx: A Low Power and Flexible Softmax Accelerator with Fast Approximate Exponentiation

Belano, Andrea; Tortorella, Yvan; Garofalo, Angelo; Benini, Luca; Rossi, Davide; Conti, Francesco
2025

Abstract

Transformer-based models excel in NLP, vision, and audio processing, but the softmax operator can be a performance bottleneck, especially with optimized matrix-multiplication hardware. We introduce SoftEx, a parametric accelerator for BF16 softmax, using approximate exponentiation (<0.14% relative error) to speed up softmax calculation. Integrated into a 12nm octa-core RISC-V cluster together with a matrix-multiplication systolic array, SoftEx reduces the time and energy needed to compute attention probabilities by up to 10.8x and 26.8x, respectively, boosting MobileBERT throughput by 2.17x to 324 GOPS or 1.30 TOPS/W.
Proceedings - Design, Automation and Test in Europe, DATE
1
2
Belano, A., Tortorella, Y., Garofalo, A., Benini, L., Rossi, D., Conti, F. (2025). SoftEx: A Low Power and Flexible Softmax Accelerator with Fast Approximate Exponentiation. Institute of Electrical and Electronics Engineers Inc. [10.23919/date64628.2025.10993043].
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1040755