Despite some improvements in compliance metrics after the implementation of the European General Data Protection Regulation (GDPR), privacy policies have become longer and more ambiguous. They often fail to fully meet GDPR requirements, leaving users without a reliable way to understand how their data is processed. We present a novel corpus composed of 30 privacy policies of online platforms and a new set of annotation guidelines to assess how comprehensive the provided information is. We focus on the categories of data processed, classifying each clause as either fully informative or insufficiently informative. In our experimental evaluation, we perform six different classification and detection tasks, comparing BERT models and generative Large Language Models.

Grundler, G., Liepina, R., Musicco, M., Lagioia, F., Galassi, A., Sartor, G., et al. (2024). Detecting Vague Clauses in Privacy Policies: The Analysis of Data Categories Using BERT Models and LLMs [10.3233/faia241235].

Detecting Vague Clauses in Privacy Policies: The Analysis of Data Categories Using BERT Models and LLMs

Grundler, Giulia; Liepina, Ruta; Musicco, Mariaceleste; Lagioia, Francesca; Galassi, Andrea; Sartor, Giovanni; Torroni, Paolo
2024

Legal Knowledge and Information Systems
Pages 72-83
Files in this item:
FAIA-395-FAIA241235.pdf (open access)
Type: Publisher's version (PDF)
License: Open Access License. Creative Commons Attribution - NonCommercial (CC BY-NC)
Size: 287.37 kB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/998492