Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts

Andrea Galassi; Francesca Lagioia; Federico Ruggeri; Piera Santin; Giovanni Sartor; Paolo Torroni
2022

Abstract

Creating balanced labeled textual corpora for complex tasks, like legal analysis, is a challenging and expensive process that often requires the collaboration of domain experts. To address this problem, we propose a data augmentation method based on the combination of GloVe word embeddings and the WordNet ontology. We present an example of application in the legal domain, specifically on decisions of the Court of Justice of the European Union. Our evaluation with human experts confirms that our method is more robust than the alternatives.
Published in: Proceedings of the Natural Legal Language Processing Workshop 2022
Pages: 47-52
Combining WordNet and Word Embeddings in Data Augmentation for Legal Texts / Sezen Perçin, Andrea Galassi, Francesca Lagioia, Federico Ruggeri, Piera Santin, Giovanni Sartor, Paolo Torroni. - ELECTRONIC. - (2022), pp. 47-52. (Paper presented at the Natural Legal Language Processing Workshop, held in Abu Dhabi, UAE, on December 8, 2022).
Sezen Perçin, Andrea Galassi, Francesca Lagioia, Federico Ruggeri, Piera Santin, Giovanni Sartor, Paolo Torroni
Files in this record:

_NLLP2022__Data_augmentation_for_Maxims.pdf
Access: open access
Type: Preprint
License: Open Access License. Creative Commons Attribution (CC BY)
Size: 158.75 kB
Format: Adobe PDF

Combining-WordNet-and-Word-Embeddings-in-Data-Augmentation-for-Legal-Texts.pdf
Access: open access
Type: Publisher's version (PDF)
License: Open Access License. Creative Commons Attribution (CC BY)
Size: 159.21 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/905768