This paper addresses the deployment of LLMs on RISC-V-based CPU systems by optimizing LLM inference on the Sophon SG2042. We evaluate the inference performance of two state-of-the-art LLMs optimized for reasoning: DeepSeek R1 Distill Llama 8B and DeepSeek R1 Distill QWEN 14B. With our optimizations on top of the llama.cpp inference library, we achieve token generation speeds of 4.32/2.29 tokens per second and prompt processing speeds of 6.54/3.68 tokens per second, a speedup of up to 2.9×/3.0× over a direct port of the same library.
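To put the reported throughput in more familiar terms, the following minimal sketch converts tokens per second into per-token latency and a rough end-to-end time. It assumes that each pair of figures in the abstract follows the order in which the models are named (8B first, 14B second). The helper names and the 128-token prompt / 256-token response workload are hypothetical and chosen only for illustration. This is arithmetic on the quoted numbers, not a new measurement.

```python
# Throughput figures quoted in the abstract, in tokens per second.
# Assumption: each "a/b" pair is ordered as the models are named (8B, then 14B).
REPORTED = {
    "DeepSeek R1 Distill Llama 8B": {"generation": 4.32, "prompt": 6.54},
    "DeepSeek R1 Distill QWEN 14B": {"generation": 2.29, "prompt": 3.68},
}


def ms_per_token(tokens_per_second: float) -> float:
    """Convert a throughput in tokens/s into a latency in milliseconds/token."""
    return 1000.0 / tokens_per_second


def estimate_seconds(prompt_tokens: int, generated_tokens: int, rates: dict) -> float:
    """Rough total time: prompt processing time plus token generation time."""
    return prompt_tokens / rates["prompt"] + generated_tokens / rates["generation"]


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    prompt_tokens, generated_tokens = 128, 256
    for model, rates in REPORTED.items():
        latency = ms_per_token(rates["generation"])
        total = estimate_seconds(prompt_tokens, generated_tokens, rates)
        print(f"{model}: {latency:.0f} ms per generated token, "
              f"about {total:.0f} s for the example workload")
```

Under these assumptions, a generation speed of 4.32 tokens per second corresponds to roughly 231 ms per generated token.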

Poveda Rodrigo, J.J., Hamdi, M.A., Koenig, C., Burrello, A., Jahier Pagliari, D., Benini, L. (2025). POSTER: V-Seek: Optimizing LLM Reasoning on A Server-Class General-Purpose RISC-V Platform [10.1145/3719276.3727954].

POSTER: V-Seek: Optimizing LLM Reasoning on A Server-Class General-Purpose RISC-V Platform

Burrello, Alessio; Jahier Pagliari, Daniele; Benini, Luca
2025

CF '25: Proceedings of the 22nd ACM International Conference on Computing Frontiers
Pages 224-225
Poveda Rodrigo, Javier Jesus; Hamdi, Mohamed Amine; Koenig, Cyril; Burrello, Alessio; Jahier Pagliari, Daniele; Benini, Luca

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1040753