Historically, processor performance has increased at a much faster rate than that of main memory, and upcoming NoC-based many-core architectures are further tightening the memory bottleneck. 3D integration based on TSV technology may provide a solution, as it enables stacking of multiple memory layers, with orders-of-magnitude increases in memory interface bandwidth, speed and energy efficiency. To fully exploit this potential, the architectural interface to vertically stacked memory must be streamlined. In this paper we present an efficient and flexible distributed memory interface for 3D-stacked DRAM. Our interface ensures ultra-low-latency access to the memory modules on top of each processing element (vertically local memory neighborhoods). Communication with these local modules does not travel through the NoC and takes full advantage of the lower latency of the vertical interconnect, thus significantly speeding up the common case. The interface still supports a convenient global address space abstraction, in which remote accesses incur higher latency because they traverse the slower horizontal interconnect. Experimental results demonstrate significant bandwidth improvement, ranging from 1.44× to 7.40× compared to the JEDEC standard, with peaks of 4.53 GB/s for direct memory access and 850 MB/s for remote access through the NoC.
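The split between vertically local and remote accesses under one global address space can be illustrated with a minimal sketch. It is not the authors' implementation. The tile count, the per-tile memory size, the contiguous address mapping and the names `Route` and `route_access` are all hypothetical, chosen only to show how an address decoder might pick the vertical path or the NoC path.

```python
from dataclasses import dataclass

# Hypothetical parameters, not taken from the paper.
NUM_TILES = 16                # processing elements, each with stacked DRAM above it
BYTES_PER_TILE = 64 * 2**20   # size of each vertically local memory neighborhood


@dataclass(frozen=True)
class Route:
    home_tile: int   # tile whose stacked memory holds the address
    offset: int      # byte offset inside that tile's memory
    via_noc: bool    # True when the request must cross the horizontal NoC


def route_access(requesting_tile: int, global_address: int) -> Route:
    """Decode a global address and choose the vertical or the NoC path.

    Assumes a contiguous mapping: tile k owns addresses
    [k * BYTES_PER_TILE, (k + 1) * BYTES_PER_TILE).
    """
    if not 0 <= requesting_tile < NUM_TILES:
        raise ValueError("unknown tile")
    if not 0 <= global_address < NUM_TILES * BYTES_PER_TILE:
        raise ValueError("address outside the global address space")
    home_tile, offset = divmod(global_address, BYTES_PER_TILE)
    return Route(home_tile, offset, via_noc=(home_tile != requesting_tile))


# Tile 3 reading its own stacked memory stays on the vertical interconnect;
# tile 3 reading tile 5's memory is a remote access through the NoC.
local = route_access(3, 3 * BYTES_PER_TILE + 0x100)
remote = route_access(3, 5 * BYTES_PER_TILE)
```

In this sketch the common case (a tile touching its own neighborhood) is decided by a single comparison, which mirrors the abstract's point that local traffic bypasses the NoC entirely.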

An efficient distributed memory interface for many-core platform with 3D stacked DRAM / Loi I.; Benini L. - Print. - (2010), pp. 99-104. (Paper presented at the Design, Automation & Test in Europe Conference & Exhibition (DATE), 2010, held in Dresden, 8-12 March 2010) [10.1109/DATE.2010.5457230].

An efficient distributed memory interface for many-core platform with 3D stacked DRAM

LOI, IGOR;BENINI, LUCA
2010


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/95320