Evaluation of Machine Learning models for the detection of Antimicrobial Resistance based on Synthetic Data

Claudia Sala (first author); Adriano Zaghi; Ettore Rocchi; Nicolas R. Derus; Alessandra De Cesare; Gastone Castellani (last author)
2022

Abstract

Antimicrobial Resistance (AMR) is a global health problem that is estimated to cause ~10 million deaths per year by 2050. The ability to detect antimicrobial-resistant genes and bacteria in environmental and biological samples is crucial for monitoring AMR and for identifying effective strategies against it. To this end, a promising approach combines high-throughput technologies (e.g. shotgun sequencing) with bioinformatics and Machine Learning. However, the high complexity of real metagenomic samples makes validating the results challenging. To evaluate the capability of Machine Learning models to predict the presence of AMR in shotgun sequencing samples, we used a modified version of the CAMISIM simulator to generate synthetic data with different resistance profiles, starting from annotated genomes retrieved from the PATRIC database. Our approach allowed us to compare the performance of different bioinformatic and Machine Learning pipelines.
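The value of synthetic data for this kind of comparison is that the true resistance label of every sample is known, so each pipeline's predictions can be scored directly. The following is a minimal sketch of that idea, not the authors' method: it does not call CAMISIM or PATRIC, and the functions `simulate_labels`, `noisy_pipeline` and `evaluate`, the error rates, and the names `pipeline_a` and `pipeline_b` are all hypothetical and chosen only for illustration.

```python
import random


def simulate_labels(n_samples, resistant_fraction, rng):
    """Toy stand-in for a simulator: draw a known resistant/susceptible label per sample."""
    return [rng.random() < resistant_fraction for _ in range(n_samples)]


def noisy_pipeline(truth, false_negative_rate, false_positive_rate, rng):
    """Hypothetical pipeline: returns the true label corrupted by fixed error rates."""
    predictions = []
    for is_resistant in truth:
        if is_resistant:
            # A resistant sample is missed with probability false_negative_rate.
            predictions.append(rng.random() >= false_negative_rate)
        else:
            # A susceptible sample is flagged with probability false_positive_rate.
            predictions.append(rng.random() < false_positive_rate)
    return predictions


def evaluate(truth, predicted):
    """Score predictions against the known ground truth."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    fp = sum((not t) and p for t, p in zip(truth, predicted))
    fn = sum(t and (not p) for t, p in zip(truth, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


rng = random.Random(42)  # fixed seed so the toy comparison is reproducible
truth = simulate_labels(200, 0.4, rng)

# Two hypothetical pipelines with assumed (made-up) error rates.
pipelines = {
    "pipeline_a": noisy_pipeline(truth, 0.10, 0.05, rng),
    "pipeline_b": noisy_pipeline(truth, 0.30, 0.20, rng),
}

for name, predicted in pipelines.items():
    scores = evaluate(truth, predicted)
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

In a real study the predictions would come from trained models applied to simulated reads rather than from injected noise. The scoring step against known labels is the only part this sketch is meant to show.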
Claudia Sala, Adriano Zaghi, Ettore Rocchi, Nicolas R. Derus, Alessandra De Cesare, Gastone Castellani (2022). XXII International Conference on Mechanics in Medicine and Biology - Abstract book, p. 16.
Use this identifier to cite or link to this document: https://hdl.handle.net/11585/903947