In-memory computing (IMC) hardware accelerators for deep neural networks (DNNs) require storing a massive number of coefficients within a single computing macro to avoid performance degradation in multicore clusters. This aspect, often overlooked by common figures of merit (FoMs), can be effectively addressed by phase-change memory (PCM) technology, thanks to its high density, scalability, and analog non-volatile storage capability. This article presents a PCM-based (Ge-rich GST) analog IMC (AIMC) macro designed for multilayer, drift- and temperature-resilient computation. Fabricated in a 28-nm FD-SOI CMOS process and integrating a 4M-cell array, the accelerator achieves a matrix–vector multiplication (MVM) error lower than 2.14% across a wide temperature range (from −40°C to +125°C), yielding a 3.5× improvement over state-of-the-art solutions according to an FoM defined as No. of Weights×TOPS/W/mm², a metric that reflects the achievable storage-energy efficiency per area during computation.
Pasotti, M., Zurla, R., Bertolini Agnoletto, J., Calvetti, E., Antolini, A., Lico, A., et al. (2026). A 28-nm FD-SOI CMOS Analog-IMC Core Based on PCM Featuring 8 512×512-Weight Layers and 28M Weights×TOPs/W/mm². IEEE JOURNAL OF SOLID-STATE CIRCUITS, 7, 1-15 [10.1109/JSSC.2026.3693894].
A 28-nm FD-SOI CMOS Analog-IMC Core Based on PCM Featuring 8 512×512-Weight Layers and 28M Weights×TOPs/W/mm²
Alessio Antolini;Andrea Lico;Francesco Zavalloni;Eleonora Franchi Scarselli;
2026
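The figure of merit quoted in the abstract, No. of Weights×TOPS/W/mm², can be illustrated with a minimal sketch. It assumes the literal reading of that expression (weight count times energy efficiency, divided by area). The function names are hypothetical, and the efficiency and area passed in the example are placeholders, not measurements from the article; only the 8 layers of 512×512 weights come from the title.

```python
def weight_count(layers: int, rows: int, cols: int) -> int:
    """Total number of stored weights for `layers` matrices of size rows x cols."""
    return layers * rows * cols


def storage_efficiency_fom(n_weights: float, tops_per_watt: float, area_mm2: float) -> float:
    """Literal reading of the FoM: No. of Weights x TOPS/W / mm^2.

    Assumption: the expression is evaluated left to right, i.e.
    (n_weights * tops_per_watt) / area_mm2.
    """
    if area_mm2 <= 0:
        raise ValueError("area_mm2 must be positive")
    return n_weights * tops_per_watt / area_mm2


# Weight count implied by the title: 8 layers of 512 x 512 weights.
n_weights = weight_count(layers=8, rows=512, cols=512)
print(n_weights)  # 2097152

# Placeholder efficiency and area, chosen only to show the arithmetic.
example_fom = storage_efficiency_fom(n_weights, tops_per_watt=1.0, area_mm2=1.0)
print(example_fom)
```

Because the weight count multiplies the usual TOPS/W/mm² term, a macro that stores more coefficients scores higher at equal energy efficiency and area, which is the property the abstract says common FoMs overlook.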



