As the amount of genomic variation data increases, tools that are able to score the functional impact of single nucleotide variants become more and more necessary. While there are several prediction servers available for interpreting the effects of variants in the human genome, only few have been developed for other species, and none were specifically designed for species of veterinary interest such as the dog. Here, we present Fido-SNP the first predictor able to discriminate between Pathogenic and Benign single-nucleotide variants in the dog genome. Fido-SNP is a binary classifier based on the Gradient Boosting algorithm. It is able to classify and score the impact of variants in both coding and non-coding regions based on sequence features within seconds. When validated on a previously unseen set of annotated variants from the OMIA database, Fido-SNP reaches 88% overall accuracy, 0.77 Matthews correlation coefficient and 0.91 Area Under the ROC Curve.
Capriotti, E., Montanucci, L., Profiti, G., Rossi, I., Giannuzzi, D., Aresu, L., et al. (2019). Fido-SNP: the first webserver for scoring the impact of single nucleotide variants in the dog genome. NUCLEIC ACIDS RESEARCH, 47(W1), 136-141 [10.1093/nar/gkz420].
Fido-SNP: the first webserver for scoring the impact of single nucleotide variants in the dog genome
Capriotti, Emidio;Montanucci, Ludovica;Profiti, Giuseppe;Rossi, Ivan;Fariselli, Piero
2019
Abstract
As the amount of genomic variation data increases, tools that are able to score the functional impact of single nucleotide variants become more and more necessary. While there are several prediction servers available for interpreting the effects of variants in the human genome, only few have been developed for other species, and none were specifically designed for species of veterinary interest such as the dog. Here, we present Fido-SNP the first predictor able to discriminate between Pathogenic and Benign single-nucleotide variants in the dog genome. Fido-SNP is a binary classifier based on the Gradient Boosting algorithm. It is able to classify and score the impact of variants in both coding and non-coding regions based on sequence features within seconds. When validated on a previously unseen set of annotated variants from the OMIA database, Fido-SNP reaches 88% overall accuracy, 0.77 Matthews correlation coefficient and 0.91 Area Under the ROC Curve.File | Dimensione | Formato | |
---|---|---|---|
gkz420.pdf
accesso aperto
Tipo:
Versione (PDF) editoriale
Licenza:
Licenza per Accesso Aperto. Creative Commons Attribuzione - Non commerciale (CCBYNC)
Dimensione
690.04 kB
Formato
Adobe PDF
|
690.04 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.