Selective pressure at the codon level improves the prediction of disease related protein mutations in human

CAPRIOTTI, EMIDIO; CASADIO, RITA
2008

Abstract

Predicting the functional impact of protein variation is one of the most challenging problems in bioinformatics, with direct implications for biomedicine. A rapidly growing number of genome-scale studies provide large amounts of experimental data, allowing rigorous statistical approaches to be applied to predict whether a given single point mutation affects human health. Until now, existing methods have drawn their source data from either protein or gene information, but not both. This work takes advantage of both and focuses on protein evolutionary information by using selective pressures estimated at the codon level. We introduce a new method called SeqProfCod (an acronym for sequence, profile and codon information) to predict the likelihood that a given protein variant is associated with human disease. We also demonstrate that the majority of human mutations associated with disease are under strong purifying selection (ω < 0.1). Our method therefore relies on three sources of information: protein sequence, multiple protein sequence alignments, and the estimated selective pressure at the codon level. SeqProfCod has been benchmarked on a large dataset of 8,987 single point mutations from 1,434 human proteins in SWISS-PROT. It achieves 82% overall accuracy and a correlation coefficient of 0.59, demonstrating the synergistic effect of the three sources of information. The results of a large-scale application of SeqProfCod to all annotated point mutations in SWISS-PROT are available for download at http://bioinfo.cipf.es/sgu/services/SeqProfCod/ and could be used to support clinical studies.
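As an illustration of the two figures of merit quoted in the abstract, the following minimal Python sketch computes overall accuracy and a correlation coefficient from the counts of a binary confusion matrix. It assumes that the reported correlation coefficient is the Matthews correlation coefficient, which the abstract does not state explicitly. The function names and the example counts are hypothetical and are not taken from the SeqProfCod benchmark. The helper `is_strong_purifying` simply applies the ω < 0.1 threshold mentioned in the abstract, where ω conventionally denotes the ratio of nonsynonymous to synonymous substitution rates.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (overall accuracy, Matthews correlation coefficient).

    tp/tn/fp/fn are the counts of a binary confusion matrix, e.g. with
    "disease-related" as the positive class (an assumption of this sketch).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


def is_strong_purifying(omega, threshold=0.1):
    """True if a codon's omega falls below the strong purifying threshold."""
    return omega < threshold


# Made-up counts for illustration only, not the paper's results.
acc, mcc = accuracy_and_mcc(tp=40, tn=40, fp=10, fn=10)
print(f"accuracy={acc:.2f} mcc={mcc:.2f}")
print(is_strong_purifying(0.05))
```

Unlike accuracy alone, the correlation coefficient accounts for all four cells of the confusion matrix, which is why the two measures are commonly reported together when the classes are unbalanced.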
Proceedings of the VIII Jornadas de Bioinformatica
28
28
Capriotti E.; Arbiza L.; Casadio R.; Dopazo J.; Dopazo H.; Marti-Renom M.A.
Use this identifier to cite or link to this document: http://hdl.handle.net/11585/56248