Predicting the effect of Single Nucleotide Polymorphisms (SNPs) is one of the most ambitious challenges in computational biology. SNPs account for about 90% of the genetic variation in the human population. Recent investigations focus on non-synonymous coding SNPs, which are responsible for single point mutations in proteins, since mutations occurring in coding regions may affect gene function. Gene Ontology (GO) provides a curated vocabulary for describing gene function. We propose a GO log-odds-based score to discriminate functionally relevant genes. In this work we present a machine learning-based method to predict the effect of a given mutation on human health. In particular, we studied the relationship between SNPs and the onset of cancer using a support vector machine (SVM). To this end, we developed an SVM-based predictor (PhD-SNP-C) that combines in a single input vector various features derived from protein sequence, sequence profile, and a GO-based score. The proposed predictor reaches an overall accuracy of 75% and a correlation coefficient of 0.50 on a set of 1087 cancer-related mutations and 1100 randomly selected mutations annotated as neutral in the Swiss-Prot dataset. On the same set of proteins, a similar SVM-based method that does not take the GO log-odds score into account reaches an accuracy of 66% with a correlation coefficient of 0.32. Our results indicate that including the information derived from GO annotations improves the prediction of cancer-related mutations. Overall, the predictions of PhD-SNP-C are 9 percentage points more accurate than those of our previous SVM-based method, with a gain of 0.18 in the correlation coefficient. Furthermore, if the results are filtered according to their reliability index (RI) at RI 3 (comprising 71% of the dataset), PhD-SNP-C reaches an accuracy of 82% and a correlation coefficient of 0.64.
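The abstract does not give the formula for the GO log-odds score, so the following is only a minimal sketch of one common way such a score could be built, not the authors' method. It assumes that each protein comes with a set of GO terms, that a term's log-odds is the log ratio of its smoothed frequency among disease-associated proteins to its frequency among neutral ones, and that a protein's score is the sum over its terms. The function names, the pseudocount, and the toy GO labels are all hypothetical.

```python
import math
from collections import Counter


def go_log_odds(disease_annotations, neutral_annotations, pseudocount=1.0):
    """Build a per-term log-odds table from two lists of GO-term sets.

    Assumption: each list holds one set of GO terms per protein. The
    pseudocount avoids division by zero for terms seen in only one class.
    """
    disease_counts = Counter(t for terms in disease_annotations for t in set(terms))
    neutral_counts = Counter(t for terms in neutral_annotations for t in set(terms))
    n_dis = len(disease_annotations)
    n_neu = len(neutral_annotations)

    table = {}
    for term in set(disease_counts) | set(neutral_counts):
        f_dis = (disease_counts[term] + pseudocount) / (n_dis + 2 * pseudocount)
        f_neu = (neutral_counts[term] + pseudocount) / (n_neu + 2 * pseudocount)
        table[term] = math.log(f_dis / f_neu)
    return table


def protein_score(go_terms, table):
    """Sum the log-odds of a protein's GO terms; unseen terms contribute 0."""
    return sum(table.get(t, 0.0) for t in set(go_terms))


# Toy data with made-up term labels (not real GO identifiers).
disease = [{"GO:A", "GO:B"}, {"GO:A"}]
neutral = [{"GO:B"}, {"GO:C"}]

table = go_log_odds(disease, neutral)
score = protein_score({"GO:A", "GO:C"}, table)
```

In this toy setting a term seen only in disease-associated proteins gets a positive log-odds, a term seen equally in both classes gets zero, and a term seen only in neutral proteins gets a negative value. A score of this kind could then be appended to the sequence and profile features in the SVM input vector.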

Gene Ontology annotation improves the prediction of cancer-related mutations / Calabrese R.; Capriotti E.; Casadio R. - PRINT. - (2008), p. 25. (Paper presented at the One day BITS meeting, held in Rome on 4/7/2008).

Gene Ontology annotation improves the prediction of cancer-related mutations

CALABRESE, REMO;CAPRIOTTI, EMIDIO;CASADIO, RITA
2008

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/73463