This paper introduces a new benchmark designed to evaluate the effective context length handled by Large Language Models (LLMs) in Italian. Following the structure of the five core tasks from the English BABILong dataset, we created an equivalent benchmark tailored for Italian. We used it to assess the context management capabilities of several prominent LLMs, both small and large, pretrained from scratch or fine-tuned specifically for Italian. Additionally, we tested a context extension technique called “SelfExtend” that does not require any training or fine-tuning phase, measuring its effectiveness using our proposed benchmark.

Tamburini, F. (2025). BABILong-ITA: a new benchmark for testing Large Language Models effective context length and a Context Extension Method. Aachen : CEUR Workshop Proceedings (CEUR-WS.org).

BABILong-ITA: a new benchmark for testing Large Language Models effective context length and a Context Extension Method

Tamburini Fabio
2025

Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Pages 1-9
Files in this record:

File: 104_main_long.pdf (open access)
Description: Contribution in conference proceedings
Type: Publisher's version (PDF) / Version of Record
License: Open access license, Creative Commons Attribution (CC BY)
Size: 1.82 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1033539