


Unknown Claims: Generation of Fact-Checking Training Examples from Unstructured and Structured Data

Jean-Flavien Bussotti; Luca Ragazzi; Giacomo Frisoni; Gianluca Moro; Paolo Papotti (all listed as co-first authors)

2024

Abstract

Computational fact-checking (FC) relies on supervised models to verify claims based on given evidence, requiring a resource-intensive process to annotate large volumes of training data. We introduce Unown, a novel framework that generates training instances for FC systems automatically using both textual and tabular content. Unown selects relevant evidence and generates supporting and refuting claims with advanced negation artifacts. Designed to be flexible, Unown accommodates various strategies for evidence selection and claim generation. We comprehensively evaluate Unown on both text-only and table+text benchmarks, including Feverous, SciFact, and MMFC, a new multi-modal FC dataset. Our results show that Unown examples are of comparable quality to expert-labeled data, even enabling models to achieve up to 5% higher accuracy. The code, data, and models are available at https://github.com/disi-unibo-nlp/unown


Year: 2024
Volume title: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
First page: 12105
Last page: 12122
Citation: Bussotti, J., Ragazzi, L., Frisoni, G., Moro, G., Papotti, P. (2024). Unknown Claims: Generation of Fact-Checking Training Examples from Unstructured and Structured Data.
All authors: Bussotti, Jean-Flavien; Ragazzi, Luca; Frisoni, Giacomo; Moro, Gianluca; Papotti, Paolo


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1007081

Warning: the data displayed have not been validated by the university.
