DDGun: An untrained method for the prediction of protein stability changes upon single and multiple point variations

Capriotti E.; 2019

Abstract

Background: Predicting the effect of single point variations on protein stability is a crucial step toward understanding the relationship between protein structure and function. To this end, several methods have been developed to predict changes in the Gibbs free energy of unfolding (ΔΔG) between wild-type and variant proteins, using sequence and structure information. Most of the available methods, however, do not exhibit the anti-symmetric prediction property, which guarantees that the predicted ΔΔG value for a variation is the exact opposite of that predicted for the reverse variation, i.e., ΔΔG(A → B) = -ΔΔG(B → A), where A and B are amino acids.

Results: Here we introduce simple anti-symmetric features, based on evolutionary information, which are combined to define an untrained method, DDGun (DDG untrained). DDGun is a simple approach that predicts the ΔΔG for single and multiple variations from sequence and structure information (DDGun3D). Our method achieves remarkable performance without any training on the experimental datasets, reaching Pearson correlation coefficients between predicted and measured ΔΔG values of ~0.5 and ~0.4 for single and multiple site variations, respectively. Surprisingly, the performance of DDGun is comparable with that of state-of-the-art methods. DDGun also naturally predicts multiple site variations, thereby defining a benchmark method for both single-site and multiple-site predictors. DDGun is anti-symmetric by construction, predicting the ΔΔG of a reciprocal variation as almost equal (depending on the sequence profile) to -ΔΔG of the direct variation. This valuable property is missing from the majority of methods.

Conclusions: Evolutionary information alone, combined in an untrained method, can achieve remarkably high performance in the prediction of ΔΔG upon protein mutation. Non-trained approaches like DDGun represent a valid benchmark both for scoring the predictive power of the individual features and for assessing the learning capability of supervised methods.
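The anti-symmetry property stated in the abstract can be illustrated with a minimal sketch. The code below is not the DDGun scoring function: it assumes a toy score defined as the difference of a single per-residue property (the Kyte-Doolittle hydropathy scale), and the names `hydropathy_score` and `is_antisymmetric` are hypothetical. Any score of the form f(B) - f(A) satisfies score(A → B) = -score(B → A) exactly. The sketch uses no sequence profile, which is the element the abstract identifies as making the reciprocal prediction only almost equal.

```python
# Kyte-Doolittle hydropathy index, one value per standard amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_score(wild_type: str, variant: str) -> float:
    """Toy score for the variation wild_type -> variant.

    Assumption for illustration only: the score is the difference of one
    per-residue property, so swapping the residues flips the sign.
    The sign and scale are arbitrary and are not calibrated to ΔΔG.
    """
    return KYTE_DOOLITTLE[variant] - KYTE_DOOLITTLE[wild_type]


def is_antisymmetric(score, residues, tol: float = 1e-9) -> bool:
    """Check score(a, b) == -score(b, a) for every ordered residue pair."""
    return all(
        abs(score(a, b) + score(b, a)) <= tol
        for a in residues
        for b in residues
    )


if __name__ == "__main__":
    print(hydropathy_score("A", "V"))
    print(hydropathy_score("V", "A"))
    print(is_antisymmetric(hydropathy_score, KYTE_DOOLITTLE))
```

The same `is_antisymmetric` check can be applied to any predictor exposed as a function of the wild-type and variant residues, which gives a quick way to test whether a method has the property the abstract describes.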
Montanucci L.; Capriotti E.; Frank Y.; Ben-Tal N.; Fariselli P.
Files in this item:
File: Montanucci_et_al-2019-BMC_Bioinformatics.pdf
Access: open access
Type: Publisher's version (PDF)
License: Open Access License. Creative Commons Attribution (CC BY)
Size: 807.32 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11585/738203
Citations
  • PubMed Central 16
  • Scopus 27
  • Web of Science 27