
Most existing natural language processing systems for legal texts are developed for English. Nevertheless, in several application domains, especially within the European Union, multiple versions of the same documents are provided in different languages. One notable example is Terms of Service (ToS). In this paper, we compare different approaches to the task of detecting potentially unfair clauses in ToS across multiple languages. In particular, after developing an annotated corpus and a machine learning classifier for English, we consider and compare several strategies to extend the system to other languages: building a new corpus and training a new machine learning system for each language from scratch; projecting annotations across documents in different languages, to avoid creating new corpora; translating training documents while keeping the original annotations; and translating queries at prediction time while relying on the English system only. An extensive experimental evaluation conducted on a large, original dataset indicates that the time-consuming task of building a new annotated corpus for each language can often be avoided with no significant degradation in performance.

Galassi, A., Lagioia, F., Jabłonowska, A., Lippi, M. (2025). Unfair clause detection in terms of service across multiple languages. ARTIFICIAL INTELLIGENCE AND LAW, 33(3), 641-689 [10.1007/s10506-024-09398-7].

Unfair clause detection in terms of service across multiple languages

Galassi, Andrea (co-first author); Lagioia, Francesca (co-first author); Jabłonowska, Agnieszka; Lippi, Marco

2025



Year: 2025
Journal: ARTIFICIAL INTELLIGENCE AND LAW
DOI: https://dx.doi.org/10.1007/s10506-024-09398-7
All authors: Galassi, Andrea; Lagioia, Francesca; Jabłonowska, Agnieszka; Lippi, Marco
Appears in categories: 1.01 Journal article

Files in this item:

File	Size	Format
Unfair-clause-detection-in-terms-of-service-across-multiple-languages.pdf (open access; Type: Publisher's version (PDF) / Version of Record; License: Open Access License, Creative Commons Attribution (CC BY))	1.03 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/967073
