CLAUDETTE meets GDPR: Automating the Evaluation of Privacy Policies using Artificial Intelligence / Giuseppe Contissa; Koen Docter; Francesca Lagioia; Marco Lippi; Hans-W. Micklitz; Przemyslaw Palka; Giovanni Sartor; Paolo Torroni. - Electronic. - (2018).

CLAUDETTE meets GDPR: Automating the Evaluation of Privacy Policies using Artificial Intelligence

Giuseppe Contissa; Koen Docter; Francesca Lagioia; Marco Lippi; Hans-W. Micklitz; Przemyslaw Palka; Giovanni Sartor; Paolo Torroni
2018

Abstract

This report presents preliminary results of a study that aims to automate the legal evaluation of privacy policies under the GDPR using artificial intelligence (machine learning), in order to empower civil society representing the interests of consumers. We outline the requirements a GDPR-compliant privacy policy should meet (comprehensive information, clear language, fair processing) and the ways in which such documents can be unlawful (if the required information is insufficient, the language is unclear, or potentially unfair processing is indicated). Further, we analyse the contents of the privacy policies of Google, Facebook (and Instagram), Amazon, Apple, Microsoft, WhatsApp, Twitter, Uber, AirBnB, Booking.com, Skyscanner, Netflix, Steam and Epic Games. The experiments we conducted on these documents, using various machine learning techniques, lead us to conclude that this task can be, to a significant degree, carried out by computers, provided that a sufficiently large data set is created. Given the number of privacy policies online, this is a task worth investing time and effort in. Our study indicates that none of the analysed privacy policies meets the requirements of the GDPR. The evaluated corpus, comprising 3658 sentences (80,398 words), contains 401 sentences (11.0%) that we marked as containing unclear language and 1240 sentences (33.9%) that we marked as potentially unlawful clauses, i.e. either "problematic processing" clauses or "insufficient information" clauses (under Articles 13 and 14 of the GDPR). Hence, there is significant room for improvement on the part of business, as well as for action on the part of consumer organizations and supervisory authorities.
Files in this record:
beuc-x-2018-066_claudette_meets_gdpr_report.pdf

Access: open access
Type: Publisher's version (PDF)
License: Free open-access license
Size: 1.31 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/663893