E-MIMIC: Empowering Multilingual Inclusive Communication

Rachele Raus; Michela Tonti; Tania Cerquitelli; Salvatore Greco; Moreno La Quatra; Giuseppe Attanasio; Luca Cagliero

2021

Abstract

Preserving diversity and inclusion is becoming a compelling need in both industry and academia. The ability to use appropriate forms of writing, speaking, and gestures is not widespread, even in formal communications such as public calls, public announcements, official reports, and legal documents. The improper use of linguistic expressions can foment unacceptable forms of exclusion and stereotyping, as well as verbal violence against minorities, including women. Furthermore, existing machine translation tools are not designed to generate inclusive content. The present paper investigates a joint effort between the linguistics and Deep Learning Natural Language Understanding research communities to fight non-inclusive, prejudiced language forms. It presents a methodology aimed at tackling the improper use of language in formal communication, with particular attention paid to Romance languages (Italian, in particular). State-of-the-art Deep Language Modeling architectures are exploited to automatically identify non-inclusive text snippets, suggest alternative forms, and produce inclusive text rephrasing. A preliminary evaluation conducted on a benchmark dataset shows promising results, namely 85% accuracy in predicting inclusive/non-inclusive communications.

Index Terms: Inclusive Language, Gender Equality, Natural Language Processing, Deep Learning.

Year: 2021
Volume title: 2021 IEEE International Conference on Big Data (Big Data)
First page: 4227
Last page: 4234
Series: ... IEEE INTERNATIONAL CONFERENCE ON BIG DATA
DOI: https://dx.doi.org/10.1109/BigData52589.2021.9671868
Citation: Rachele Raus, Michela Tonti, Tania Cerquitelli, Salvatore Greco, Moreno La Quatra, Giuseppe Attanasio, et al. (2021). E-MIMIC: Empowering Multilingual Inclusive Communication. Piscataway, New Jersey : IEEE [10.1109/BigData52589.2021.9671868].
All authors: Rachele Raus; Michela Tonti; Tania Cerquitelli; Salvatore Greco; Moreno La Quatra; Giuseppe Attanasio; Luca Cagliero
Appears in types: 4.01 Contribution in conference proceedings

Files in this product:

File	Size	Format
E-MIMIC.pdf (Open Access from 14/01/2024; Type: Postprint / Author's Accepted Manuscript (AAM), the version accepted for publication after peer review; License: free open-access license)	271.45 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/853726
