Ultra-low-latency lightweight DMA for tightly coupled multi-core clusters

Rossi, Davide; Loi, Igor; Benini, Luca
2014

Abstract

The evolution of multi- and many-core platforms is rapidly increasing the on-chip computational capabilities of embedded computing devices, while memory access remains dominated by on-chip and off-chip interconnect delays, which do not scale well. For this reason, the bottleneck of many applications is rapidly moving from computation to communication. More precisely, performance is often bound by the high latency of direct memory accesses. In this scenario, the challenge is to provide embedded multi- and many-core systems with a powerful, low-latency, energy-efficient, and flexible way to move data across the levels of the memory hierarchy. This paper presents a DMA engine optimized for clustered, tightly coupled many-core systems. The IP features a simple micro-coded programming interface and lock-free per-core command queues, which improve flexibility while reducing programming latency. Moreover, by exploiting the cluster shared memory as the local repository for data buffers, it dramatically reduces area and improves energy efficiency with respect to conventional DMAs. Compared with a conventional DMA, the proposed engine improves access and programming latency by one order of magnitude and reduces IP area by 4x and power by 5x, while providing full bandwidth to 16 independent logical channels.
Year: 2014
Published in: Proceedings of the 11th ACM Conference on Computing Frontiers, CF 2014
Pages: 1-10
Citation: Ultra-low-latency lightweight DMA for tightly coupled multi-core clusters / Rossi, Davide; Loi, Igor; Haugou, Germain; Benini, Luca. - Print. - (2014), pp. 15.1-15.10. (Paper presented at the 11th ACM International Conference on Computing Frontiers, CF 2014, held in Cagliari, Italy, in 2014) [10.1145/2597917.2597922].
Authors: Rossi, Davide; Loi, Igor; Haugou, Germain; Benini, Luca

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/525147