The performance of most digital systems today is limited by the interconnect latency between logic and memory, rather than by the performance of logic or memory itself. Three-dimensional (3-D) integration using through-silicon vias (TSVs) may overcome these scaling limitations by stacking multiple memory dies on top of a many-core die. In this paper, we propose a Mesh-of-Trees (MoT) network to support high-throughput and low-latency communication between processing cores and 3-D stacked multi-banked shared L2 data memory. Compared to a conventional MoT network [5] straightforwardly adapted to 3-D integration, experimental results show that the proposed network significantly improves the number of operations per second. We also investigate the architecture parameters of 3-D memory stacking (e.g., the number of tiers to be stacked and TSV sharing) that affect the interconnection network as well as system performance and fabrication cost, which makes it possible to explore trade-offs among different 3-D memory stacking architectures.
K. Kang, L. Benini, G. De Micheli (2012). A High-throughput and Low-Latency Interconnection Network for Multi-Core Clusters with 3-D Stacked L2 Tightly-Coupled Data Memory. New York: IEEE Press [10.1109/VLSI-SoC.2012.6379047].
A High-throughput and Low-Latency Interconnection Network for Multi-Core Clusters with 3-D Stacked L2 Tightly-Coupled Data Memory
BENINI, LUCA;
2012