Tamburini, F. (2025). BABILong-ITA: a new benchmark for testing Large Language Models effective context length and a Context Extension Method. Aachen: CEUR Workshop Proceedings (CEUR-WS.org).
BABILong-ITA: a new benchmark for testing Large Language Models effective context length and a Context Extension Method
Tamburini Fabio
2025
Abstract
This paper introduces a new benchmark designed to evaluate the effective context length handled by Large Language Models (LLMs) in Italian. Following the structure of the five core tasks from the English BABILong dataset, we created an equivalent benchmark tailored for Italian. We used it to assess the context management capabilities of several prominent LLMs, both small and large, pretrained from scratch or fine-tuned specifically for Italian. Additionally, we tested a context extension technique called “SelfExtend” that does not require any training or fine-tuning phase, measuring its effectiveness using our proposed benchmark.

| File | Size | Format |
|---|---|---|
| 104_main_long.pdf (open access); Description: Contribution in conference proceedings; Type: Publisher's version (PDF) / Version of Record; License: Open Access License, Creative Commons Attribution (CC BY) | 1.82 MB | Adobe PDF |


