Bioinformatics and Membrane Proteins: is it feasible to predict the 3D structure of a membrane protein?

CASADIO, RITA
2006

Abstract

In the "omic" era hundreds of genomes are available for protein sequence analysis, and we may estimate that some 30% of all sequences encode membrane proteins. Unlike globular proteins, membrane proteins can hardly be given a three-dimensional model computed from sequence alone. Why is this so? What can we really compute, and with what reliability? Can we build models of membrane proteins based on threading techniques? These issues are addressed with approaches that may differ from those generally adopted for predicting globular proteins, and their solution often requires an expert-driven methodology. The question is then how many methods we have to integrate to obtain a successful prediction of a membrane protein. Our group has successfully modelled the 3D structure of some membrane proteins starting from the sequence, and validated the models with ad hoc wet-lab experiments, including site-directed mutagenesis, fluorescence spectroscopy or gene expression. Another major problem in large-scale sequencing projects is the annotation of those genes that have no counterpart in the database of presently known sequences with a given function. We may then ask: can we contribute to the annotation process with predictive methods? Recently, within a European project (Biosapiens), we asked how many membrane proteins are present in the human genome. With a method that relies on the integration of different predictors of membrane topology, we found that out of 32001 unique sequences of the Ensembl release 35a1 (December 2004), 32.3% and 31.2% are predicted as membrane proteins by MEMSAT and ENSEMBLE, respectively (25% are predicted by both predictors). 19.6% of the sequences are predicted as membrane proteins by TMHMM2.0 (single-sequence-based), and 19% are predicted by all predictors. These results set the lower and upper bounds for the membrane protein content of the human genome and provide a list of putative membrane proteins for further applications.
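The idea of bounding the membrane protein content by integrating several predictors can be illustrated with a minimal sketch. It assumes that each predictor's output has already been reduced to a set of sequence identifiers called as membrane proteins; the function name `consensus_bounds`, the predictor labels and the toy identifiers are hypothetical and are not taken from the original work. Under this assumption, sequences called by all predictors give a conservative lower bound, and sequences called by at least one predictor give an upper bound.

```python
def consensus_bounds(predictions):
    """Return (lower, upper) sets of sequence IDs.

    predictions: dict mapping a predictor name to the set of sequence IDs
    that this predictor calls a membrane protein (hypothetical input format).
    lower = IDs called by every predictor (intersection).
    upper = IDs called by at least one predictor (union).
    """
    sets = list(predictions.values())
    if not sets:
        return set(), set()
    lower = set.intersection(*sets)
    upper = set.union(*sets)
    return lower, upper


# Toy data only: these identifiers and calls are invented for illustration.
toy_predictions = {
    "predictor_A": {"seq1", "seq2", "seq3", "seq5"},
    "predictor_B": {"seq1", "seq2", "seq4", "seq5"},
    "predictor_C": {"seq1", "seq5"},
}
total_sequences = 8  # size of the toy sequence set

lower, upper = consensus_bounds(toy_predictions)
print(f"lower bound: {len(lower)}/{total_sequences} ({len(lower) / total_sequences:.1%})")
print(f"upper bound: {len(upper)}/{total_sequences} ({len(upper) / total_sequences:.1%})")
```

The intersection is deliberately strict: a single conservative predictor pulls the lower bound down, which may be why the fraction predicted by all predictors in the abstract sits close to the fraction reported for the most restrictive one.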
Conference Proceedings of the European School of Genetic Medicine, 6th course in Bioinformatics for Molecular Biologists
p. 22
Casadio R.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/29169