Propagandistic content can appear anywhere online, for example on social media, in web forums, and in news articles. Nonetheless, the vast majority of efforts to build computational models that automatically detect propaganda have centered on news outlets, since this is historically where people went to get informed. This has gradually changed. Today the Internet, and some social media platforms in particular, have become the main channels through which news and events spread worldwide. In this study, we explore the detection of propaganda in tweets. The originality of our contribution lies in the creation of PropitterX, a Twitter-based dataset that we extend with contextual information for each instance. This allows for the study not only of the content of a tweet but also of the roles of different aspects, such as the political bias behind the post, its publication time, its region of origin, and even the predominant emotion it evokes. We present this corpus alongside four data sub-collections to show how different questions about propaganda detection can be posed to take advantage of this resource and further advance this task.

Casavantes, M., Montes-y-Gómez, M., Hernández-Farías, D., González, L.C., Barrón-Cedeño, A. (2025). PropitterX: a Twitter-based propaganda corpus extended with multiple contextual features. Language Resources and Evaluation, 00, 1-26 [10.1007/s10579-025-09849-w].

PropitterX: a Twitter-based propaganda corpus extended with multiple contextual features

Barrón-Cedeño, Alberto
Last author
2025

Casavantes, Marco; Montes-y-Gómez, Manuel; Hernández-Farías, Delia-Irazú; González, Luis C.; Barrón-Cedeño, Alberto

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1018232