Proppy: Organizing the news based on their propagandistic content

Barron Cedeno, Luis Alberto; Jaradat, I.; Da San Martino, Giovanni; Nakov, P.

doi:10.1016/j.ipm.2019.03.005

Propaganda is a mechanism to influence public opinion, and it is inherently present in extremely biased and fake news. Here, we propose a model to automatically assess the level of propagandistic content in an article based on different representations, from writing style and readability level to the presence of certain keywords. We experiment thoroughly with different variations of such a model on a new publicly available corpus, and we show that character n-grams and other style features outperform existing alternatives that identify propaganda based on word n-grams. Unlike previous work, we make sure that the test data comes from news sources that were unseen during training, thus penalizing learning algorithms that model the news sources used at training time as opposed to solving the actual task. We integrate our supervised model in a public website, which organizes recent articles covering the same event on the basis of their propagandistic content. This allows users to quickly explore different perspectives of the same story, and it also enables investigative journalists to dig further into how different media use stories and propaganda to pursue their agenda.
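Two ideas from the abstract can be illustrated with a minimal sketch: character n-gram features, and a train/test split in which no news source appears on both sides. This is not the authors' implementation. The function names `char_ngrams` and `split_by_source`, the n-gram length of 3, and the toy articles are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def split_by_source(articles, test_sources):
    """Split articles so that no news source appears on both sides."""
    train = [a for a in articles if a["source"] not in test_sources]
    test = [a for a in articles if a["source"] in test_sources]
    return train, test


# Toy, made-up articles: sources and labels are placeholders, not corpus data.
articles = [
    {"source": "site-a", "text": "They will destroy everything!", "label": 1},
    {"source": "site-a", "text": "Wake up before it is too late!", "label": 1},
    {"source": "site-b", "text": "The council approved the budget.", "label": 0},
    {"source": "site-c", "text": "Officials confirmed the schedule.", "label": 0},
]

# Hold out every article from "site-c" for testing.
train, test = split_by_source(articles, test_sources={"site-c"})

# One sparse feature vector (n-gram -> count) per training article.
train_features = [char_ngrams(a["text"]) for a in train]
```

Because the held-out sources never occur in training, a classifier that merely memorizes source-specific quirks gains nothing on the test side, which is the penalty the abstract describes.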

Barron-Cedeno A., Jaradat I., Da San Martino G., Nakov P. (2019). Proppy: Organizing the news based on their propagandistic content. INFORMATION PROCESSING & MANAGEMENT, 56(5), 1849-1864 [10.1016/j.ipm.2019.03.005].

Year: 2019
Journal: INFORMATION PROCESSING & MANAGEMENT
DOI: https://dx.doi.org/10.1016/j.ipm.2019.03.005
All authors: Barron-Cedeno A.; Jaradat I.; Da San Martino G.; Nakov P.
Publication type: 1.01 Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/703554
