Bortolotti, D., Bartolini, A., Benini, L. (2014). An ultra-low power resilient multi-core architecture with static and dynamic tolerance to ambient temperature-induced variability. Microprocessors and Microsystems, 38(8), 776-787. doi:10.1016/j.micpro.2014.06.004
An ultra-low power resilient multi-core architecture with static and dynamic tolerance to ambient temperature-induced variability
Daniele Bortolotti; Andrea Bartolini; Luca Benini
2014
Abstract
Near-threshold operation is today a key research area in Ultra-Low Power (ULP) computing, as it promises a major boost in energy efficiency compared to super-threshold computing and it mitigates thermal bottlenecks. Unfortunately, near-threshold operation is plagued by greatly increased sensitivity to threshold voltage variations, such as those caused by ambient temperature fluctuation. In this paper we focus on a tightly-coupled ULP processor cluster architecture where a low-latency, high-bandwidth processor-to-L1-memory interconnection network plays a key role. We propose an architectural scheme to tolerate ambient temperature-induced variations, capable of statically (off-line) and dynamically (on-line) adapting the processor-to-L1-memory latency without compromising execution correctness. We extensively tested our solution in different scenarios and evaluated the different design trade-offs, showing the cost, performance and reliability gains compared to state-of-the-art static solutions. The dynamic solution, thanks to its lightweight runtime overhead, outperforms the static solution and reaches a performance gain of up to 25% in a typical use case scenario with a very low (<4%) area overhead.