MinPool: A 16-core NUMA-L1 Memory RISC-V Processor Cluster for Always-on Image Processing in 65nm CMOS

Benini, Luca
2023

Abstract

Always-on image processing is crucial for many applications, such as face and attention detection, and it is usually offloaded to dedicated, energy-efficient image processors. These processors need to be flexible and scalable to follow the rapid evolution of image sensors and always-on image processing workloads. A flexible architecture is the shared memory cluster, where multiple cores are tightly coupled with L1 memory. However, current clusters are not latency tolerant and follow a uniform memory access approach, which limits their frequency and scalability. The MemPool architecture [1] lifts those constraints by combining latency-tolerant cores, pipelined functional processing units, and a non-uniform memory access interconnect. This paper presents MinPool, a low-power image processor for always-on functions implemented in TSMC’s 65 nm technology and based on a tailored MemPool architecture. Thanks to an instruction set architecture extension tuned for image processing and the low-leakage process, it achieves excellent utilization results with IPCs of up to 0.98 and an energy efficiency of 65 GOPS/W for key image processing kernels.
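To put the two headline figures in more familiar units, here is a minimal sketch that converts the energy efficiency into energy per operation and the IPC into a share of non-retiring cycles. The function names are hypothetical, and the peak IPC of 1.0 (a single-issue core) is an assumption not stated in the abstract.

```python
def picojoules_per_op(gops_per_watt: float) -> float:
    """Convert an energy efficiency in GOPS/W to energy per operation in pJ.

    1 GOPS/W equals 1e9 operations per joule, so one operation costs
    1 / (gops_per_watt * 1e9) J, which is 1000 / gops_per_watt pJ.
    """
    return 1000.0 / gops_per_watt


def non_retiring_fraction(ipc: float, peak_ipc: float = 1.0) -> float:
    """Fraction of issue slots in which no instruction retires.

    peak_ipc = 1.0 is an ASSUMPTION (single-issue core), not a figure
    taken from the abstract.
    """
    return 1.0 - ipc / peak_ipc


if __name__ == "__main__":
    # Figures quoted in the abstract: 65 GOPS/W and IPC of up to 0.98.
    print(f"{picojoules_per_op(65.0):.1f} pJ per operation")
    print(f"{non_retiring_fraction(0.98) * 100:.0f}% of cycles without a retired instruction")
```

Under these assumptions, 65 GOPS/W corresponds to roughly 15.4 pJ per operation, and an IPC of 0.98 leaves about 2% of cycles without a retired instruction.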
Published in: 2023 30th IEEE International Conference on Electronics, Circuits and Systems (ICECS)
Pages: 1-4
Riedel, S., Cavalcante, M., Frouzakis, M., Wüthrich, D., Mustafa, E., Billa, A., et al. (2023). MinPool: A 16-core NUMA-L1 Memory RISC-V Processor Cluster for Always-on Image Processing in 65nm CMOS [10.1109/ICECS58634.2023.10382925].
Riedel, Samuel; Cavalcante, Matheus; Frouzakis, Manos; Wüthrich, Domenic; Mustafa, Enis; Billa, Arlind; Benini, Luca
Files in this item:
File: CameraReady_IEEE.pdf
Access: open access
Type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review
License: free open-access license
Size: 955.7 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/958741