PhD-SNP: a web server for the prediction of human genetic diseases associated to missense single nucleotide polymorphisms / Calabrese R.; Capriotti E.; Casadio R. - Print. - (2008), p. 78. (Paper presented at the EMBNET08 conference, held in Martina Franca (TA), 18-20 September 2008).

PhD-SNP: a web server for the prediction of human genetic diseases associated to missense single nucleotide polymorphisms

CALABRESE, REMO; CAPRIOTTI, EMIDIO; CASADIO, RITA
2008

Abstract

Motivation: Single nucleotide polymorphisms (SNPs) are the most frequent type of genetic variation in the human population (Collins et al., 1998). Missense SNPs (mSNPs), which cause single point mutations in proteins, are of particular interest because mutations in coding regions may affect gene function. mSNPs can be neutral or disease associated (Ng and Henikoff, 2002; Bell, 2004). The availability of a large dataset of annotated SNPs in the Swiss-Prot database (Boeckmann et al., 2003) prompted the application of machine learning techniques to predict, starting from the protein sequence, the onset of human diseases caused by single point protein mutations (Capriotti et al., 2006).

Methods: We developed a method based on support vector machines (SVMs) that, starting from protein sequence information and, when available, evolutionary information, predicts whether a new phenotype derived from an mSNP can be related to a genetic disease in humans. The system comprises two SVMs: SVM-sequence, which relies on sequence information alone, and SVM-profile, which uses profile features when evolutionary information is available. Merging the two SVMs in a single framework yields a hybrid predictive method (Capriotti et al., 2006).

Results: On a recent dataset (April 2008) of 34314 single point mutations in 7351 proteins, with 48% of the mutations disease related, our hybrid predictor reaches more than 72% accuracy (with a correlation coefficient of 0.45) in predicting whether a single point mutation is disease related. Although based on less information, our method reaches the same accuracy as the other web-available predictors that implement different approaches (Ramensky et al., 2002; Ng and Henikoff, 2003), with a higher correlation coefficient. Moreover, unlike other methods, ours always returns a prediction (Capriotti et al., 2006). We designed a web server that integrates our SVM-based predictors, called Predictor of human Deleterious Single Nucleotide Polymorphisms (PhD-SNP). The server is a user-friendly resource that delivers predictions via e-mail. The submission form is simple: the user pastes the query sequence, enters the mutation position and the mutated residue in the corresponding input boxes, and can choose the predictive method. The best results are obtained when evolutionary information is available and the hybrid predictive method can be used.
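
As an illustration only, the sketch below shows one way a hybrid of two predictors could always return an answer, and how accuracy and a correlation coefficient can be computed for a two-class task. It is not the PhD-SNP implementation: the fallback rule (use the profile-based model when a profile exists, otherwise the sequence-based one), the reading of "correlation coefficient" as the Matthews correlation coefficient, and all function and parameter names are assumptions made for this example.

```python
import math
from typing import Callable, Optional, Sequence, Tuple


def hybrid_predict(
    mutation: str,
    profile: Optional[dict],
    seq_model: Callable[[str], bool],
    prof_model: Callable[[str, dict], bool],
) -> bool:
    """Return True if the mutation is predicted to be disease related.

    Assumed rule (hypothetical): prefer the profile-based model when
    evolutionary information is available, otherwise fall back to the
    sequence-based model, so that a prediction is always produced.
    """
    if profile is not None:
        return prof_model(mutation, profile)
    return seq_model(mutation)


def accuracy_and_mcc(
    y_true: Sequence[bool], y_pred: Sequence[bool]
) -> Tuple[float, float]:
    """Compute accuracy and the Matthews correlation coefficient."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # disease, predicted disease
    tn = sum(1 for t, p in pairs if not t and not p)  # neutral, predicted neutral
    fp = sum(1 for t, p in pairs if not t and p)      # neutral, predicted disease
    fn = sum(1 for t, p in pairs if t and not p)      # disease, predicted neutral
    accuracy = (tp + tn) / len(pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc
```

Reporting a correlation coefficient alongside accuracy is informative here because the dataset is close to, but not exactly, balanced (48% disease related): the coefficient penalizes a predictor that favors one class.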

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/73471