Kasper Drawzeski, A.G. (2021). A Corpus for Multilingual Analysis of Online Terms of Service. Punta Cana : Association for Computational Linguistics [10.18653/v1/2021.nllp-1.1].
A Corpus for Multilingual Analysis of Online Terms of Service
Andrea Galassi; Francesca Lagioia; Giovanni Sartor; Paolo Torroni
2021
Abstract
We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service. The data set comprises a total of 100 contracts, obtained from 25 documents annotated in four different languages: English, German, Italian, and Polish. For each contract, potentially unfair clauses for the consumer are annotated, for nine different unfairness categories. We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
A Corpus for Multilingual Analysis of Online Terms of Service.pdf | Open access | Publisher's version (PDF) | Open Access License, Creative Commons Attribution (CC BY) | 170.44 kB | Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.