Random perturbations of term weighted gene ontology annotations for discovering gene unknown functionalities

Domeniconi, Giacomo; Masseroli, Marco; Moro, Gianluca; Pinoli, Pietro

doi:10.1007/978-3-319-25840-9_12

Computational analyses for biomedical knowledge discovery benefit greatly from descriptions of gene and protein functional features expressed through controlled terminologies and ontologies, i.e. from their controlled annotations. In recent years, several databases of such annotations have become available; yet these annotations are incomplete, and only some of them represent highly reliable, human-curated information. To predict and discover unknown or missing annotations, existing approaches use unsupervised learning algorithms. We propose a new learning method that allows supervised algorithms to be applied to unsupervised problems, achieving much better annotation predictions. This method, which we also extend from our preceding work with data weighting techniques, is based on generating artificial labeled training sets through random perturbations of the original data. We tested it on nine Gene Ontology annotation datasets; the results show that our approach is effective at predicting novel annotations and outperforms state-of-the-art unsupervised methods.
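The abstract does not spell out the perturbation procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: annotations are held in a binary gene-by-term matrix, a perturbation randomly deletes known annotations with probability `p`, and the unperturbed matrix supplies the labels. The names `perturb`, `build_training_set`, and `p` are hypothetical and are not taken from the chapter; the term weighting step is omitted.

```python
import random


def perturb(matrix, p, rng):
    """Return a copy of a binary gene-by-term matrix in which each
    known annotation (1) is deleted (set to 0) with probability p.
    Assumption: perturbation only removes annotations, never adds them."""
    return [
        [0 if (value == 1 and rng.random() < p) else value for value in row]
        for row in matrix
    ]


def build_training_set(matrix, term, p, rng):
    """Build an artificial labeled training set for one ontology term.

    Features: each gene's row in the perturbed matrix.
    Labels:   that gene's annotation to `term` in the original matrix.
    (Hypothetical formulation; the chapter may define this differently.)"""
    features = perturb(matrix, p, rng)
    labels = [row[term] for row in matrix]
    return features, labels


# Toy annotation matrix: 3 genes (rows) by 4 ontology terms (columns).
annotations = [
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]

features, labels = build_training_set(annotations, term=2, p=0.3, rng=random.Random(42))
```

Any supervised classifier could then be trained on `features` and `labels` and applied to the unperturbed matrix; in this sketch, pairs where the model predicts an annotation that is absent from the original data would be the candidate novel annotations.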

Domeniconi, G., Masseroli, M., Moro, G., Pinoli, P. (2015). Random perturbations of term weighted gene ontology annotations for discovering gene unknown functionalities. Berlin: Springer Verlag [10.1007/978-3-319-25840-9_12].



Year: 2015
Volume title: Communications in Computer and Information Science
First page: 181
Last page: 197
DOI: https://dx.doi.org/10.1007/978-3-319-25840-9_12
Publication type: 2.01 Book chapter / essay


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/545248
