On the use of an intermediate class in boolean crowdsourced relevance annotations for learning to rank comments

Barron-Cedeno, A.; Da San Martino, G.; Filice, S.; Moschitti, A.

doi:10.1145/3077136.3080763

In many Information Retrieval tasks the boundary between classes is not well defined and assigning a document to a specific class may be complicated, even for humans. For instance, a document which is not directly related to the user's query may still contain relevant information. In this scenario, an option is to define an intermediate class collecting ambiguous instances. Yet some natural questions arise. Is this annotation strategy convenient? How should the intermediate class be treated? To answer these questions, we explored two community question answering datasets whose comments were originally annotated with three classes and re-annotated a subset of instances considering a binary good vs. bad setting. Our main contribution is to show empirically that the inclusion of an intermediate class to assess Boolean relevance is not useful. Moreover, in case the data is already annotated with a 3-class strategy, the instances from the intermediate class can be safely removed at training time.

Barron-Cedeno A., Da San Martino G., Filice S., Moschitti A. (2017). On the use of an intermediate class in boolean crowdsourced relevance annotations for learning to rank comments. 1515 BROADWAY, NEW YORK, NY 10036-9998 USA : Association for Computing Machinery, Inc [10.1145/3077136.3080763].



	Year
		2017
	Volume title
		SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
	First page
		1209
	Last page
		1212
	DOI
		https://dx.doi.org/10.1145/3077136.3080763
	Appears in types:
		4.01 Contribution in conference proceedings


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/709170
