A multiple-instance learning approach to sentence selection for question ranking

Romeo Salvatore; da San Martino G.; Barron-Cedeno A.; Moschitti A.

2017

Abstract

In example-based retrieval, a system is queried with a document, aiming to retrieve other similar or relevant documents. We address an instance of this problem: question retrieval in community Question Answering (cQA) forums. In this scenario, both the document collection and the queries are relatively short multi-sentence documents subject to noise and redundancy, which makes it harder for learning-to-rank algorithms to build upon the proper text representation. To exploit only the relevant fragments of the query and collection documents, we treat them as sequences of sentences, in a multiple-instance learning fashion. By automatically pre-selecting the best sentences for our tree-kernel-based learning model, we improve over using the full text on the dataset of the 2016 SemEval cQA challenge in terms of both accuracy and speed, reaching the state of the art.
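The sentence pre-selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the paper learns which sentences to keep and feeds them to a tree-kernel model, whereas the sketch below swaps in a plain bag-of-words cosine similarity as the selection score. The function names (`split_sentences`, `select_sentences`), the naive sentence splitter, and the parameter `k` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (an assumption, not the paper's tokenizer)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bag_of_words(sentence):
    """Lowercased token counts for one sentence."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_sentences(query, candidate, k=1):
    """Keep the k candidate sentences most similar to any query sentence.

    Each document is treated as a bag of sentence instances; a candidate
    sentence is scored by its best match against the query sentences.
    The kept sentences are returned in their original order.
    """
    query_bags = [bag_of_words(s) for s in split_sentences(query)]
    scored = []
    for idx, sent in enumerate(split_sentences(candidate)):
        bag = bag_of_words(sent)
        score = max((cosine(bag, q) for q in query_bags), default=0.0)
        scored.append((score, idx, sent))
    # Highest score first; ties broken by position in the document.
    top = sorted(scored, key=lambda x: (-x[0], x[1]))[:k]
    return [sent for _, _, sent in sorted(top, key=lambda x: x[1])]
```

With a query such as "Where can I renew my visa in Doha? I arrived last month." and a forum question that opens with a greeting and closes with thanks, `select_sentences(query, candidate, k=1)` would keep only the sentence that actually asks about visa renewal. A downstream ranker would then compare the reduced texts instead of the full, noisy ones.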

Field	Value
Year	2017
Volume title	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
First page	437
Last page	449
Series	Lecture Notes in Artificial Intelligence
DOI	https://dx.doi.org/10.1007/978-3-319-56608-5_34
Citation	Romeo Salvatore, da San Martino G., Barron-Cedeno A., Moschitti A. (2017). A multiple-instance learning approach to sentence selection for question ranking. Gewerbestrasse 11, Cham, CH-6330, Switzerland: Springer Verlag [10.1007/978-3-319-56608-5_34].
All authors	Romeo Salvatore; da San Martino G.; Barron-Cedeno A.; Moschitti A.
Publication type	4.01 Contribution in conference proceedings

Files in this record:

File	Size	Format	Access
Romeo_et_al-2017-Advances_in_Information_Retrieval,_39th_European_Conference_on_IR_Research,_ECIR....pdf	425.21 kB	Adobe PDF	Restricted access (publisher's PDF version; restricted-access license)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/709205
