Moro, G., Magnani, L.D.M., Ragazzi, L. (2026). Legal Lay Summarization: Exploring Methods and Data Generation with Large Language Models. Artificial Intelligence Review, 59(1), 1-30 [10.1007/s10462-025-11392-7].

Legal Lay Summarization: Exploring Methods and Data Generation with Large Language Models

Gianluca Moro (co-first author); Leonardo David Matteo Magnani (co-first author); Luca Ragazzi (co-first author)
2026

Abstract

This paper explores advancements in Natural Language Processing (NLP) for legal lay summarization by systematically analyzing existing methodologies, datasets, and research findings. We review current literature, highlighting key challenges such as data scarcity and the complexity of legal language. A primary contribution of this study is the development of LegalEase, a specialized dataset designed to improve model training for summarizing legal documents in layman’s terms. Our findings demonstrate that subdomain-specific datasets within the legal domain outperform general legal datasets in enhancing NLP model performance for generating accurate and comprehensible legal summaries. The insights and methodologies presented provide a foundation for future research in legal lay summarization.

Files in this item:
File: s10462-025-11392-7 (1).pdf
Access: open access
Type: Publisher's version (PDF) / Version of Record
License: Open Access license, Creative Commons Attribution (CC BY)
Size: 5.88 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1027357