La Quatra, M., Greco, S., Cagliero, L., Tonti, M., Dragotto, F., Raus, R., et al. (2024). Building Foundations for Inclusiveness through Expert-Annotated Data. Aachen: CEUR-WS.

Building Foundations for Inclusiveness through Expert-Annotated Data

Moreno La Quatra; Salvatore Greco; Luca Cagliero; Michela Tonti; Francesca Dragotto; Rachele Raus; Stefania Cavagnoli; Tania Cerquitelli

2024

Abstract

Natural Language Understanding and Generation models have a limited ability to capture the nuances of inclusive communication because they are trained on massive datasets that often include significant portions of non-inclusive content. Even models specifically designed for non-inclusive language detection or reformulation largely disregard inclusiveness-related features that are likely correlated with these nuances, such as the discourse type, level of inclusiveness, and intended context of use. To assess the importance of these additional features, we collect a new corpus of Italian administrative documents manually annotated by linguistic experts. The experts not only highlight non-inclusive text snippets and propose possible reformulations, but also assign multi-aspect labels related to different inclusive language nuances. We empirically show that a multi-task learning approach that leverages the multi-aspect annotations can improve non-inclusive text reformulation performance, thereby confirming the potential of expert-annotated data in inclusive language processing.

Year	2024
Volume title	Proceedings of the Workshops of the EDBT/ICDT 2024 Joint Conference co-located with the EDBT/ICDT 2024 Joint Conference
First page	1
Last page	5
Series	CEUR WORKSHOP PROCEEDINGS
Citation	La Quatra, M., Greco, S., Cagliero, L., Tonti, M., Dragotto, F., Raus, R., et al. (2024). Building Foundations for Inclusiveness through Expert-Annotated Data. Aachen: CEUR-WS.
All authors	La Quatra, Moreno; Greco, Salvatore; Cagliero, Luca; Tonti, Michela; Dragotto, Francesca; Raus, Rachele; Cavagnoli, Stefania; Cerquitelli, Tania
Publication type	4.01 Contribution in conference proceedings

Files in this record:

File	Size	Format
DARLI-AP-3.pdf (open access; type: publisher's PDF / Version of Record; license: open access, Creative Commons Attribution (CC BY))	1.02 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/967347
