Poveda Rodrigo, J.J., Hamdi, M.A., Koenig, C., Burrello, A., Jahier Pagliari, D., Benini, L. (2025). POSTER: V-Seek: Optimizing LLM Reasoning on A Server-Class General-Purpose RISC-V Platform. doi:10.1145/3719276.3727954
POSTER: V-Seek: Optimizing LLM Reasoning on A Server-Class General-Purpose RISC-V Platform
Burrello, Alessio; Jahier Pagliari, Daniele; Benini, Luca
2025
Abstract
This paper addresses the deployment of LLMs on RISC-V-based CPU systems by optimizing LLM inference on the Sophon SG2042. We evaluate the inference performance of two state-of-the-art LLMs optimized for reasoning: DeepSeek R1 Distill Llama 8B and DeepSeek R1 Distill Qwen 14B. Thanks to our optimizations on top of the llama.cpp inference library, we achieve token generation speeds of 4.32/2.29 tokens per second and prompt processing speeds of 6.54/3.68 tokens per second, a speedup of up to 2.9×/3.0× compared to a direct port of the same library.
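The abstract reports only the optimized throughput and the speedup ratios; the implied throughput of the unoptimized llama.cpp port can be back-computed from those figures. A minimal sketch of that arithmetic check (assuming, hypothetically, that the quoted speedup factors apply to token generation):

```python
# Back-compute the unoptimized port's token-generation throughput from the
# reported optimized speeds and speedup factors. This is an illustrative
# consistency check, not data from the paper: the paper states only the
# optimized tokens/s and the speedup ratios.
optimized_tps = {"Llama-8B": 4.32, "Qwen-14B": 2.29}  # tokens/s, optimized
speedup = {"Llama-8B": 2.9, "Qwen-14B": 3.0}          # reported speedup

for model, tps in optimized_tps.items():
    baseline = tps / speedup[model]
    print(f"{model}: implied baseline ~ {baseline:.2f} tokens/s")
```

Under these assumptions, the direct port would generate roughly 1.5 and 0.8 tokens per second for the 8B and 14B models, respectively.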


