Calabrese R., Capriotti E., Martelli P.L., Fariselli P., Casadio R. (2008). Gene Ontology annotation improves the prediction of cancer-related mutations. Innsbruck: s.n.

Gene Ontology annotation improves the prediction of cancer-related mutations

Remo Calabrese; Emidio Capriotti; Pier Luigi Martelli; Piero Fariselli; Rita Casadio
2008

Abstract

Predicting the effect of Single Nucleotide Polymorphisms (SNPs) is one of the most ambitious challenges in computational biology. SNPs account for about 90% of the genetic variation in the human population. Recent investigations focus on non-synonymous coding SNPs, which are responsible for single point mutations in proteins, since mutations occurring in coding regions may affect gene function. Gene Ontology (GO) provides a curated vocabulary for describing gene function. We propose a GO-based log-odds score to discriminate functionally relevant genes. In this work we present a machine learning-based method to predict the effect of a given mutation on human health. In particular, we studied the relationship between SNPs and the onset of cancer using a support vector machine (SVM). To this end, we developed an SVM-based predictor (PhD-SNP-C) that combines, in a single input vector, features derived from protein sequence and profile together with a GO-based score. The proposed predictor reaches an overall accuracy of 75% and a correlation coefficient of 0.50 on a set of 1087 cancer-related mutations and 1100 randomly selected mutations annotated as neutral in the Swiss-Prot dataset. On the same set of proteins, a similar SVM-based method that does not take the GO log-odds score into account reaches 66% accuracy with a correlation coefficient of 0.32. Our results indicate that including information derived from GO annotations improves the prediction of cancer-related mutations. Overall, the predictions of PhD-SNP-C are 9 percentage points more accurate than those of our previous SVM-based method, with a gain of 0.18 in the correlation coefficient. Furthermore, when the predictions are filtered by their reliability index (RI) at RI 3 (comprising 71% of the dataset), PhD-SNP-C reaches 82% accuracy with a correlation coefficient of 0.64.
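
The abstract names a GO log-odds score and a correlation coefficient without defining either. The Python sketch below illustrates one plausible reading of each and is not the authors' implementation. It assumes that a GO term's log-odds is the logarithm of its frequency among disease-associated proteins divided by its frequency among neutral ones, that a protein's score is the sum over its terms, and that the correlation coefficient is the Matthews correlation coefficient commonly reported for binary predictors. The function names, the pseudocount, and the toy term labels are all hypothetical.

```python
import math
from collections import Counter


def go_log_odds(disease_proteins, neutral_proteins, pseudocount=1.0):
    """Per-term log-odds (assumed definition, not taken from the abstract).

    Each argument is a list of proteins; each protein is a list of GO terms.
    The pseudocount avoids log(0) for terms seen in only one class.
    """
    d_counts = Counter(t for terms in disease_proteins for t in set(terms))
    n_counts = Counter(t for terms in neutral_proteins for t in set(terms))
    n_disease, n_neutral = len(disease_proteins), len(neutral_proteins)
    scores = {}
    for term in set(d_counts) | set(n_counts):
        f_disease = (d_counts[term] + pseudocount) / (n_disease + 2 * pseudocount)
        f_neutral = (n_counts[term] + pseudocount) / (n_neutral + 2 * pseudocount)
        scores[term] = math.log(f_disease / f_neutral)
    return scores


def protein_score(terms, scores):
    """Sum of the log-odds of a protein's GO terms; unseen terms add 0."""
    return sum(scores.get(t, 0.0) for t in set(terms))


def accuracy_and_mcc(tp, tn, fp, fn):
    """Overall accuracy and Matthews correlation from a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Toy data with placeholder term labels (not real GO identifiers).
disease = [["GO:A", "GO:B"], ["GO:A"]]
neutral = [["GO:B"], ["GO:C"]]
scores = go_log_odds(disease, neutral)
score_ab = protein_score(["GO:A", "GO:B"], scores)

# Illustrative confusion-matrix counts, unrelated to the reported results.
acc, mcc = accuracy_and_mcc(tp=40, tn=40, fp=10, fn=10)
```

In a predictor of the kind described, a score like `score_ab` would be one element of the SVM input vector alongside the sequence and profile features. How the authors actually compute and encode the score is not stated in the abstract.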
3rd ESF Functional Genomics Conference, 2008, p. 87

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/73487