Motivation: A basic question in protein structural studies is to what extent mutations affect stability. This question may be addressed starting from sequence and/or from structure. In proteomics and genomics studies, predicting the protein stability free energy change (ΔΔG) upon single point mutation may also help the annotation process. Experimental ΔΔG values carry uncertainty, as measured by their standard deviations. Many ΔΔG values are close to zero (about 32% of the ΔΔG data set lies between −0.5 and 0.5 kcal/mol), and for the same mutation the reported ΔΔG may differ in both value and sign, which blurs the relationship between mutations and the expected ΔΔG value. Methods: To overcome this problem we describe a new predictor that discriminates between 3 mutation classes: destabilizing mutations (ΔΔG<−1.0 kcal/mol), stabilizing mutations (ΔΔG>1.0 kcal/mol) and neutral mutations (−1.0≤ΔΔG≤1.0 kcal/mol). We recently developed the I-Mutant Suite, which incorporates different machine learning algorithms to predict the effects of non-synonymous Single Nucleotide Polymorphisms (nsSNPs) in coding regions. The effect of nsSNPs can be studied by considering two main aspects: folding stability and loss of protein functionality (affecting human health). I-Mutant Suite provides predictions for both aspects, integrating and improving previously developed predictors. Results: In particular, the new version of I-Mutant-ΔΔG predicts both the sign and the value of the free energy change (ΔΔG) upon single point protein mutations. When structure is used as input, it reaches 80% accuracy in the classification task and a correlation coefficient of 0.70 with experimentally measured ΔΔG values.
To grade the stability predictions more finely, I-Mutant3.0 can also discriminate between 3 classes, namely destabilizing, neutral and stabilizing (ΔΔG >0.5 kcal/mol), with an accuracy of 64% and 68% when protein sequence and structural information are considered, respectively. The I-Mutant Suite also includes I-Mutant-Disease, a new SVM-based predictor able to discriminate between neutral and disease-related polymorphisms. This predictor has 76% overall accuracy and a correlation coefficient of 0.52. I-Mutant Suite is the first web server that integrates in a single framework the prediction of folding free energy changes and of disease-related effects upon single point protein mutation. Our method improves the quality of the prediction of the free energy change due to single point protein mutations by adopting a hypothesis of thermodynamic reversibility of the existing experimental data. In this way we both recast the thermodynamic symmetry of the problem and balance the distribution of the available experimental measurements of free energy changes. This eliminates possible overestimations by previously described methods, which were trained on an unbalanced data set containing more destabilizing than stabilizing mutations.
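The reversibility hypothesis and the three-class labelling described in the abstract can be illustrated with a short Python sketch. It is not the I-Mutant implementation. It assumes that "thermodynamic reversibility" means the reverse mutation is assigned the opposite free energy change, ΔΔG(B→A) = −ΔΔG(A→B), and it uses the neutral range −1.0≤ΔΔG≤1.0 kcal/mol quoted in the Methods. The names `Mutation`, `reverse`, `augment_with_reverse` and `label` are hypothetical, and the ΔΔG values in the usage note are invented for illustration only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Mutation:
    """One single point mutation with its measured free energy change."""
    wild_type: str   # one-letter code of the original residue
    position: int    # residue position in the sequence
    mutant: str      # one-letter code of the substituted residue
    ddg: float       # kcal/mol; negative is taken to mean destabilizing


def reverse(m: Mutation) -> Mutation:
    """Assumed reversibility: swap the residues and flip the sign of ΔΔG."""
    return Mutation(m.mutant, m.position, m.wild_type, -m.ddg)


def augment_with_reverse(data: Iterable[Mutation]) -> List[Mutation]:
    """Return the original records followed by their reverse mutations."""
    records = list(data)
    return records + [reverse(m) for m in records]


def label(ddg: float, threshold: float = 1.0) -> str:
    """Three-class label; the neutral range is [-threshold, +threshold]."""
    if ddg < -threshold:
        return "destabilizing"
    if ddg > threshold:
        return "stabilizing"
    return "neutral"


def class_counts(data: Iterable[Mutation], threshold: float = 1.0) -> Counter:
    """Count how many records fall into each of the three classes."""
    return Counter(label(m.ddg, threshold) for m in data)
```

With a toy set of two destabilizing records and one neutral record (for example ΔΔG of −2.0, −1.5 and 0.3 kcal/mol), `class_counts` on the raw data gives no stabilizing examples, while `class_counts(augment_with_reverse(data))` gives equal numbers of destabilizing and stabilizing examples. This is the balancing effect the abstract attributes to the reversibility hypothesis.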

Calabrese R., Capriotti E., Fariselli P., Martelli P.L., Casadio R. (2009). Protein Folding, Misfolding and Diseases: The I-Mutant Suite. s.l : s.n.

Protein Folding, Misfolding and Diseases: The I-Mutant Suite

CALABRESE, REMO;CAPRIOTTI, EMIDIO;FARISELLI, PIERO;MARTELLI, PIER LUIGI;CASADIO, RITA
2009

Proceedings of BITS 09
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/85664