Bellocchi, G., Capotondi, A., Benini, L., Marongiu, A. (2025). Enabling Fast System-Level Integration and Prototyping of Accelerator-Rich Platforms [10.1145/3706594.3726975].
Enabling Fast System-Level Integration and Prototyping of Accelerator-Rich Platforms
Capotondi, Alessandro; Benini, Luca; Marongiu, Andrea
2025
Abstract
Accelerator-rich Systems-on-Chip (SoCs) combine general-purpose processors with Domain-Specific Accelerators (DSAs). The former provide flexibility and full support for legacy SW, while DSAs achieve high performance and energy efficiency through application-specific specialization. As DSAs are reused across various SoCs, they are evaluated in isolation, under assumptions about the effects of their system-level integration. Our contributions include an open-source System-Level Design (SLD) methodology for fast integration and prototyping of accelerator-rich SoCs, which readily captures behaviours that emerge from imprudent use of platform resources. We evaluate three integration scenarios, in which memory- and compute-bound DSAs are integrated into a RISC-V-based cluster and interact with a SW-managed, multi-banked shared-memory subsystem. Results show that system-level-conscious use of memory resources can attain the nominal system bandwidth, demonstrating the effectiveness of our approach for the integration of accelerator-rich workloads.



