Determining the correlation between biomedical terms is a powerful instrument to help scientist research activity, both to understand experimental results and to design new ones. In particular, a great potential comes from the integration of the many heterogeneous information sources currently available on the Web. In this article we focus on the correlation between genes and biological processes. In this context, we present a methodology for integrating information from biomedical literature with other heterogeneous types of structured information. In particular, the information sources integrated in this work are PubMed abstracts, pathway databases, and NCI thesaurus definitions. The integration is performed at the semantic analysis level using a customized approach we developed to modulate the impact of the different sources on the correlation score. We report the results of a study concerning the impact of the information integration on the correlation score and of the user-level parameters we introduced to modulate the impact of pathway data or NCI definitions with respect to biomedical literature information, depending on the context of the search. To evaluate the methodology, we performed correlation measures on six biological processes and nine genes by comparing the results with and without the integration of pathways and NCI definitions.

Integration of Literature with Heterogeneous Information for Genes Correlation Scoring

ACQUAVIVA, ANDREA;FICARRA, ELISA;
2013

Abstract

Determining the correlation between biomedical terms is a powerful instrument to help scientist research activity, both to understand experimental results and to design new ones. In particular, a great potential comes from the integration of the many heterogeneous information sources currently available on the Web. In this article we focus on the correlation between genes and biological processes. In this context, we present a methodology for integrating information from biomedical literature with other heterogeneous types of structured information. In particular, the information sources integrated in this work are PubMed abstracts, pathway databases, and NCI thesaurus definitions. The integration is performed at the semantic analysis level using a customized approach we developed to modulate the impact of the different sources on the correlation score. We report the results of a study concerning the impact of the information integration on the correlation score and of the user-level parameters we introduced to modulate the impact of pathway data or NCI definitions with respect to biomedical literature information, depending on the context of the search. To evaluate the methodology, we performed correlation measures on six biological processes and nine genes by comparing the results with and without the integration of pathways and NCI definitions.
ABATE, FRANCESCO; ACQUAVIVA, ANDREA; FICARRA, ELISA; MACII, Enrico
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/878311
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact