Grouping residue variations in a protein according to their physicochemical properties allows a dimensionality reduction of all the possible substitutions in a variant with respect to the wild type. Here, by using a large dataset of proteins with disease-related and benign variations, as derived by merging Humsavar and ClinVar data, we investigate to which extent our physicochemical grouping procedure can help in determining whether patterns of variation types are related to specific groups of diseases and whether they occur in Pfam and/or InterPro gene domains. Here, we download 75,145 germline disease-related and benign variations of 3,605 genes, group them according to physicochemical categories and map them into Pfam and InterPro gene domains. Statistically validated analysis indicates that each cluster of genes associated to Mondo anatomical system categorizations is characterized by a specific variation pattern. Patterns identify specific Pfam and InterPro domain-Mondo category associations. Our data suggest that the association of variation patterns to Mondo categories is unique and may help in associating gene variants to genetic diseases. This work corroborates in a much larger data set previous observations from our group.
Babbi, G., Savojardo, C., Baldazzi, D., Martelli, P.L., Casadio, R. (2022). Pathogenic variation types in human genes relate to diseases through Pfam and InterPro mapping. FRONTIERS IN MOLECULAR BIOSCIENCES, 9, 1-12 [10.3389/fmolb.2022.966927].
Pathogenic variation types in human genes relate to diseases through Pfam and InterPro mapping
Babbi, Giulia;Savojardo, Castrense;Baldazzi, Davide;Martelli, Pier Luigi
;Casadio, Rita
2022
Abstract
Grouping residue variations in a protein according to their physicochemical properties allows a dimensionality reduction of all the possible substitutions in a variant with respect to the wild type. Here, by using a large dataset of proteins with disease-related and benign variations, as derived by merging Humsavar and ClinVar data, we investigate to which extent our physicochemical grouping procedure can help in determining whether patterns of variation types are related to specific groups of diseases and whether they occur in Pfam and/or InterPro gene domains. Here, we download 75,145 germline disease-related and benign variations of 3,605 genes, group them according to physicochemical categories and map them into Pfam and InterPro gene domains. Statistically validated analysis indicates that each cluster of genes associated to Mondo anatomical system categorizations is characterized by a specific variation pattern. Patterns identify specific Pfam and InterPro domain-Mondo category associations. Our data suggest that the association of variation patterns to Mondo categories is unique and may help in associating gene variants to genetic diseases. This work corroborates in a much larger data set previous observations from our group.File | Dimensione | Formato | |
---|---|---|---|
fmolb-09-966927-1.pdf
accesso aperto
Tipo:
Versione (PDF) editoriale
Licenza:
Licenza per Accesso Aperto. Creative Commons Attribuzione (CCBY)
Dimensione
2.63 MB
Formato
Adobe PDF
|
2.63 MB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.