Dainese, D., Mambelli, A., Bigoni, L., et al. (2026). The Biblical Heritage in Ancient Latin Christian Literature: Advancing Intertextual Mapping Through Sentence Embeddings. Umanistica Digitale, 22, 157-186 [10.60923/issn.2532-8816/22160].
The Biblical Heritage in Ancient Latin Christian Literature: Advancing Intertextual Mapping Through Sentence Embeddings
Davide Dainese
2026
Abstract
This study presents an interdisciplinary methodology for detecting biblical references in Latin patristic literature through an innovative combination of a rigorous philological approach and Natural Language Processing (NLP) techniques. Focusing on one of the most influential ancient Christian commentaries on the Bible, Augustine of Hippo’s De Genesi ad litteram, and its relationship with Latin biblical texts (specifically, Jerome’s Vulgate and pre-Vulgate versions), this research introduces a token-based classification system for intertextual references, enriched with semantic annotations and supported by the INCEpTION platform. The first section shows how this numerical classification system accounts for exact matches, lemmatized forms, roots, synonyms, and other forms of semantic parallels (here referred to as “structures”), capturing a wide spectrum of textual similarity. To enhance automatic retrieval of these intertextual connections, we fine-tune BERT-based language models for Latin, incorporating contrastive learning and hard negative mining. In the second section, experimental results show that fine-tuned models significantly outperform baseline models at various levels of textual similarity. This work highlights the utility of computational models in overcoming the traditional dichotomy between explicit quotations and implicit allusions, embracing multiple intermediate nuances of similarity and offering a scalable approach to the study of intertextuality in ancient writings.

| File | Size | Format |
|---|---|---|
| ESTRATTO UD 2026.pdf (open access). Description: Article. Type: Publisher's version (PDF) / Version of Record. License: Open Access License, Creative Commons Attribution (CC BY) | 1.05 MB | Adobe PDF |


