In example-based retrieval, a system is queried with a document in order to retrieve other similar or relevant documents. We address an instance of this problem: question retrieval in community Question Answering (cQA) forums. In this scenario, both the queries and the documents in the collection are relatively short multi-sentence texts subject to noise and redundancy, which makes it harder for learning-to-rank algorithms to build upon a proper text representation. To exploit only the relevant fragments of the query and collection documents, we treat each of them as a sequence of sentences, in a multiple-instance learning fashion. By automatically pre-selecting the best sentences for our tree-kernel-based learning model, we improve over the performance obtained with the full text on the dataset of the 2016 SemEval cQA challenge, in terms of both accuracy and speed, reaching the state of the art.
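
The sentence pre-selection idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how sentences are split or scored, so the naive splitter, the Jaccard token overlap, and the function names `split_sentences`, `jaccard` and `select_best_pair` are hypothetical stand-ins, not the authors' method. The sketch treats each question as a bag of sentences and keeps the best-scoring sentence pair, which mirrors the common multiple-instance assumption that a bag is represented by its strongest instance.

```python
import re
from itertools import product


def split_sentences(text):
    """Naive splitter on '.', '!' and '?' (an assumption, not the paper's preprocessing)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def jaccard(a, b):
    """Token-overlap similarity, used here only as a stand-in scoring function."""
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_best_pair(query, candidate):
    """Return (query sentence, candidate sentence, score) for the best-scoring pair.

    Returns None when either text contains no sentence.
    """
    pairs = product(split_sentences(query), split_sentences(candidate))
    scored = ((q, c, jaccard(q, c)) for q, c in pairs)
    return max(scored, key=lambda t: t[2], default=None)


if __name__ == "__main__":
    new_question = "Hi all. Where can I renew my visa in Doha? Thanks a lot."
    forum_question = "I moved here last year. Where do I renew a visa in Doha? Any help is welcome."
    print(select_best_pair(new_question, forum_question))
```

In a pipeline of the kind the abstract describes, only the selected sentences, rather than the full texts, would then be passed to the ranking model; the greeting and sign-off sentences in the example are the sort of noise that such a selection step is meant to discard.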

A multiple-instance learning approach to sentence selection for question ranking / Romeo Salvatore; da San Martino G.; Barron-Cedeno A.; Moschitti A. - Electronic. - 10193:(2017), pp. 437-449. (Paper presented at the 39th European Conference on Information Retrieval, ECIR 2017, held in GBR in 2017) [10.1007/978-3-319-56608-5_34].

A multiple-instance learning approach to sentence selection for question ranking

da San Martino G.; Barron-Cedeno A.
2017

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 437-449, 2017
Romeo Salvatore; da San Martino G.; Barron-Cedeno A.; Moschitti A.
Files in this item:
File: Romeo_et_al-2017-Advances_in_Information_Retrieval,_39th_European_Conference_on_IR_Research,_ECIR....pdf
Access: restricted
Type: Publisher's version (PDF)
License: Restricted-access license
Size: 425.21 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/709205
Citations
  • PMC: not available
  • Scopus: 1
  • ISI: 1