Extended and robust protein sequence annotation over conservative non hierarchical clusters.  The Bologna Annotation Resource v 2.0

Piovesan, Damiano; Bartoli, Lisa; Martelli, Pier Luigi; Fariselli, Piero; Rossi, Ivan; Guerzoni, G.; Donvito, G.; Maggi, G. P.; Casadio, Rita

Genome annotation is one of the most important issues in the genomic era. The exponential growth in the number of newly sequenced genomes and proteomes calls for fast and reliable annotation methods that exploit all the information available in curated databases of protein sequences and structures. To this aim we developed BAR, the Bologna Annotation Resource, which is now updated (available at http://microserf.biocomp.unibo.it/bar/). The basic notion is that sequences with high sequence identity to a counterpart can inherit its function(s) and, if available, its structure. The novelty of our analysis is to cluster sequences under the constraint that sequence identity be equal to or higher than 40% over at least 90% of the pairwise alignment length. In this way sequences are grouped into sets that can be annotated in terms of function and structure, depending on the annotation level of the sequences within the cluster. Our method starts with an all-against-all alignment of all the sequences in a GRID environment. The alignments are then regarded as an undirected graph and, after the clustering procedure that constrains both the sequence identity value and the alignment length, all the connected nodes (proteins) collapse into a single group (cluster). A cluster that incorporates a UniProt entry inherits its annotations (statistically validated GO terms, PDB structures, SCOP classifications and Pfam families, if available). Clusters can contain distantly related proteins, which in this way can be annotated with high confidence. Overall, the method analyses a total of over 12 million protein sequences taken from 988 genomes and UniProt release 13. In this version, HMM models of the clusters that contain PDB templates are also provided to the end user for computing structural models of distantly related sequences.
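The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the BAR implementation: it only assumes that each pairwise alignment has already been reduced to a tuple (sequence A, sequence B, percent identity, percent coverage), keeps the edges that satisfy the 40% identity and 90% coverage thresholds stated above, and collapses connected nodes with a union-find structure. The function names, the tuple layout, the exact meaning of "coverage" and the example identifiers are hypothetical, and annotation transfer is simplified to a plain union of member annotations, without the statistical validation of GO terms mentioned in the abstract.

```python
from collections import defaultdict

# Thresholds taken from the abstract; everything else here is an assumption.
MIN_IDENTITY = 40.0  # percent sequence identity
MIN_COVERAGE = 90.0  # percent of the pairwise alignment length


def find(parent, x):
    """Return the representative of x, compressing the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster(sequence_ids, alignments):
    """Group sequences into connected components of the filtered graph.

    alignments: iterable of (id_a, id_b, identity_pct, coverage_pct),
    a hypothetical record layout chosen for this sketch.
    """
    parent = {s: s for s in sequence_ids}
    for a, b, identity, coverage in alignments:
        # Keep an edge only if both constraints hold.
        if identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE:
            root_a, root_b = find(parent, a), find(parent, b)
            if root_a != root_b:
                parent[root_b] = root_a
    groups = defaultdict(set)
    for s in sequence_ids:
        groups[find(parent, s)].add(s)
    return list(groups.values())


def inherit_annotations(clusters, annotations):
    """Pair each cluster with the union of its members' known annotations."""
    result = []
    for members in clusters:
        terms = set()
        for m in members:
            terms |= annotations.get(m, set())
        result.append((members, terms))
    return result


if __name__ == "__main__":
    ids = ["A", "B", "C", "D", "E"]
    pairs = [
        ("A", "B", 62.0, 95.0),  # passes both thresholds
        ("B", "C", 41.0, 92.0),  # passes both thresholds
        ("C", "D", 80.0, 50.0),  # rejected: coverage too low
        ("D", "E", 35.0, 99.0),  # rejected: identity too low
    ]
    known = {"A": {"GO:0005524"}}  # only A is annotated in this toy input
    for members, terms in inherit_annotations(cluster(ids, pairs), known):
        print(sorted(members), sorted(terms))
```

In this toy input, A and C share no direct edge but end up in the same cluster through B, which is how a distantly related sequence can receive annotation from a well-characterised member. Because clusters are connected components rather than levels of a tree, the grouping is non-hierarchical.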

Piovesan D., Bartoli L., Martelli P.L., Fariselli P., Rossi I., Guerzoni G., et al. (2010). Extended and robust protein sequence annotation over conservative non hierarchical clusters. The Bologna Annotation Resource v 2.0. s.l : s.n.

Year: 2010
Volume title: Proceedings of 14th International Biotechnology Symposium and Exhibition
First page: S15
Last page: S15
Appears in document types: 4.02 Abstract


Use this identifier to cite or link to this document: https://hdl.handle.net/11585/100506
