Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.

CAGI SickKids challenges: Assessment of phenotype and variant predictions derived from clinical and genomic data of children with undiagnosed diseases

Babbi G.;Casadio R.;Martelli P. L.;
2019

Abstract

Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.
Kasak L.; Hunter J.M.; Udani R.; Bakolitsa C.; Hu Z.; Adhikari A.N.; Babbi G.; Casadio R.; Gough J.; Guerrero R.F.; Jiang Y.; Joseph T.; Katsonis P.; Kotte S.; Kundu K.; Lichtarge O.; Martelli P.L.; Mooney S.D.; Moult J.; Pal L.R.; Poitras J.; Radivojac P.; Rao A.; Sivadasan N.; Sunderam U.; Saipradeep V.G.; Yin Y.; Zaucha J.; Brenner S.E.; Meyn M.S.
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/732791
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? 4
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 5
social impact