Despite the number of approaches recently proposed in NLP for detecting abusive language on social networks , the issue of developing hate speech detection systems that are robust across different platforms is still an unsolved problem. In this paper we perform a comparative evaluation on datasets for hate speech detection in Italian, extracted from four different social media platforms, i.e. Facebook, Twitter, Instagram and What-sApp. We show that combining such platform-dependent datasets to take advantage of training data developed for other platforms is beneficial, although their impact varies depending on the social network under consideration.
Nonostante si osservi un cre-scente interesse per approcci che identi-fichino il linguaggio offensivo sui social network attraverso l'NLP, la necessità di sviluppare sistemi che mantengano una buona performance anche su piattaforme diverseè ancora un tema di ricerca aper-to. In questo contributo presentiamo una valutazione comparativa su dataset per l'identificazione di linguaggio d'odio pro-venienti da quattro diverse piattaforme: Facebook, Twitter, Instagram and Wha-tsApp. Lo studio dimostra che, combinan-do dataset diversi per aumentare i dati di training, migliora le performance di clas-sificazione, anche se l'impatto varia a se-conda della piattaforma considerata.
Michele Corazza, S.M. (2019). Cross-Platform Evaluation for Italian Hate Speech Detection.
Cross-Platform Evaluation for Italian Hate Speech Detection
Michele Corazza;
2019
Abstract
Despite the number of approaches recently proposed in NLP for detecting abusive language on social networks , the issue of developing hate speech detection systems that are robust across different platforms is still an unsolved problem. In this paper we perform a comparative evaluation on datasets for hate speech detection in Italian, extracted from four different social media platforms, i.e. Facebook, Twitter, Instagram and What-sApp. We show that combining such platform-dependent datasets to take advantage of training data developed for other platforms is beneficial, although their impact varies depending on the social network under consideration.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.