Compared to traditional floating-point (FP) number representation, logarithmic number systems (LNS) have superior performance when evaluating complex functions, since multiplications and divisions can be calculated with ease in the logarithmic domain. However, additions and subtractions become costly nonlinear operations. Efficient LNS units (LNUs) implementing ADD/SUB operations in hardware rely on interpolation techniques to save area. Even the most advanced LNUs are still larger than standard single-precision FPUs, which renders them impractical for most general-purpose processors. In this paper, we show that in a multi-core setting, when shared among several processor cores, LNUs become a very attractive solution. We present a methodology to generate LNUs with various error bounds and perform a design space exploration with different parameterizations. We show that even small precision relaxations on the order of a few units in the last place (ulp) reduce the LNU area significantly. Using examples from several signal processing domains, we demonstrate that shared approximate LNUs can outperform their standard FP counterparts on average by 2.14x in speed and 1.92x in energy efficiency, with insignificant degradation of the output quality.
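The asymmetry the abstract describes (cheap multiply/divide, costly add/subtract) can be illustrated in software. The following is a minimal sketch and not the hardware design from the paper: it assumes positive values stored as base-2 logarithms in ordinary Python floats, omits sign and zero handling, and evaluates the nonlinear term exactly instead of by interpolation. All function names are hypothetical.

```python
import math

def to_lns(x):
    """Encode a positive real as its base-2 logarithm (sign/zero handling omitted)."""
    return math.log2(x)

def from_lns(e):
    """Decode an LNS value back to an ordinary real number."""
    return 2.0 ** e

def lns_mul(a, b):
    """Multiplication is a plain addition of the logarithms."""
    return a + b

def lns_div(a, b):
    """Division is a plain subtraction of the logarithms."""
    return a - b

def lns_add(a, b):
    """Addition needs the nonlinear function log2(1 + 2^d), with d <= 0.
    A hardware unit would approximate this term; here it is computed exactly."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))

def lns_sub(a, b):
    """Subtraction needs log2(1 - 2^d); this sketch requires a > b."""
    if a <= b:
        raise ValueError("sketch only handles a > b (positive results)")
    return a + math.log2(1.0 - 2.0 ** (b - a))
```

For instance, `from_lns(lns_mul(to_lns(3.0), to_lns(4.0)))` recovers approximately 12, using only one addition in the log domain, while `lns_add` has to evaluate a transcendental function for every sum.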

Accuracy and Performance Trade-Offs of Logarithmic Number Units in Multi-Core Clusters / Schaffner, Michael; Gautschi, Michael; Gürkaynak, Frank K.; Benini, Luca. - Print. - (2016), art. no. 7563277, pp. 95-103. (Paper presented at the 23rd IEEE Symposium on Computer Arithmetic, ARITH 2016, held in the USA in 2016) [10.1109/ARITH.2016.10].

Accuracy and Performance Trade-Offs of Logarithmic Number Units in Multi-Core Clusters

BENINI, LUCA
2016

Proceedings - Symposium on Computer Arithmetic, 2016, pp. 95-103
Schaffner, Michael; Gautschi, Michael; Gürkaynak, Frank K.; Benini, Luca

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/588294