We present the results of the participation of our team Unibo in the shared task sEXism Identification in Social neTworks (EXIST). We target all three tasks: a) binary sexism identification, b) discerning the author's intention, and c) categorizing instances into fine-grained categories. All tasks involve both English and Spanish data. We compare two approaches to this multilingual setting: machine-translating the Spanish data into English, which lets us use a RoBERTa model specifically fine-tuned to detect hateful content, and using a multilingual version of RoBERTa that classifies the data in their original languages. Furthermore, we predict the emotions associated with each post and leverage them as additional features by concatenating them with the original text. This augmentation improves the performance of our models in Tasks 2 and 3. Our official submissions obtain F1 = 0.77 in Task 1 (13th out of 69), macro-averaged F1 = 0.53 in Task 2 (4th out of 35), and macro-averaged F1 = 0.59 in Task 3 (4th out of 32).
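The emotion-augmentation step described above — concatenating predicted emotion labels with the post text before classification — can be sketched as follows. This is a minimal illustration, not the authors' exact pipeline: the separator token and the upstream emotion classifier that produces the labels are assumptions.

```python
def augment_with_emotions(text: str, emotions: list[str], sep: str = " [SEP] ") -> str:
    """Append predicted emotion labels to a post as extra textual features.

    `emotions` is a list of label strings produced by an external emotion
    classifier (hypothetical here); the returned string is what would be
    fed to the downstream RoBERTa-based sexism classifier.
    """
    return text + sep + ", ".join(emotions)


# Usage: a post whose predicted emotions are anger and disgust
augmented = augment_with_emotions("Example tweet text", ["anger", "disgust"])
# augmented == "Example tweet text [SEP] anger, disgust"
```

The design choice here is to inject the emotion signal at the input level, so any transformer classifier can consume it without architectural changes.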
Muti A., Mancini E. (2023). Enriching Hate-Tuned Transformer-Based Embeddings with Emotions for the Categorization of Sexism. CEUR-WS.
Enriching Hate-Tuned Transformer-Based Embeddings with Emotions for the Categorization of Sexism
Muti A.; Mancini E.
2023
| File | Type | License | Size | Format |
|---|---|---|---|---|
| paper-086.pdf (open access) | Publisher's version (PDF) | Creative Commons Attribution (CC BY) | 1.19 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.