Predicting metabolic responses in genetic disorders via structural representation in machine learning

Sirocchi, C.; Biancucci, F.; Suffian, M.; Donati, M.; Ferretti, S.; Bogliolo, A.; Magnani, M.; Menotta, M.; Montagna, S.

doi:10.1007/s13748-024-00338-9

Metabolomics has emerged as a promising discipline in pharmaceuticals and preventive healthcare. However, analysing large metabolomics datasets remains challenging because the available biological pathways are limited and incompletely annotated. To address this limitation, we recently proposed training machine learning classifiers on molecular fingerprints of metabolites to predict their responses under specific conditions, and analysing feature importance to identify key chemical configurations, thereby providing insights into the affected biological processes. This study extends our previous research by evaluating various metabolite structural representations, including the Morgan fingerprint and its variants and graph-based structural encodings, and by proposing novel representations that improve the resolution and interpretability of state-of-the-art approaches. These structural encodings were evaluated on mass spectrometry metabolomic data for a cellular model of the genetic disease Ataxia Telangiectasia. Machine learning classifiers trained on the new representations improved in both classification accuracy and interpretability. Notably, models trained on graph-based encodings did not exhibit performance gains, even when pre-trained on a larger metabolite dataset, underlining the efficacy of the proposed representations. Finally, feature importance analysis across different encoding methods consistently identified similar structures as relevant for classification, underscoring the robustness of the approach across diverse structural representations.
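To make the idea of a structural fingerprint concrete, the following is a minimal sketch of a Morgan-style circular fingerprint written in plain Python. It is not the authors' implementation: the function names (`stable_hash`, `circular_fingerprint`), the toy molecular graph format, and the choice of 64 bits are all assumptions made for illustration, and the sketch ignores hydrogens, bond orders, and the duplicate-environment pruning that a full implementation performs. In practice a cheminformatics toolkit such as RDKit would be used instead. The sketch also records which atom environment set each bit, which is the property that lets feature importance be traced back to chemical substructures.

```python
import zlib


def stable_hash(value):
    """Deterministic 32-bit hash (Python's built-in hash is salted for strings)."""
    return zlib.crc32(repr(value).encode("utf-8"))


def circular_fingerprint(atoms, bonds, radius=2, n_bits=64):
    """Toy Morgan-style fingerprint of a molecular graph (illustrative only).

    atoms: list of element symbols for heavy atoms, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs; bond orders are ignored here
    Returns (bits, origin) where bits is a list of 0/1 values and origin maps
    each set bit to the (atom index, radius) environments that produced it.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Radius 0: each atom is identified by its element and its degree.
    ids = [stable_hash((sym, len(neighbours[i]))) for i, sym in enumerate(atoms)]
    bits = [0] * n_bits
    origin = {}

    for r in range(radius + 1):
        # Fold every current atom identifier into the fixed-length bit vector.
        for i, ident in enumerate(ids):
            b = ident % n_bits
            bits[b] = 1
            origin.setdefault(b, []).append((i, r))
        if r < radius:
            # Grow each environment by one bond: combine the atom's identifier
            # with the sorted identifiers of its neighbours.
            ids = [
                stable_hash((ids[i], tuple(sorted(ids[j] for j in neighbours[i]))))
                for i in range(len(atoms))
            ]
    return bits, origin


# Ethanol heavy-atom skeleton C-C-O (hypothetical example input).
ethanol_bits, ethanol_origin = circular_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
print(sum(ethanol_bits), "bits set")
print(ethanol_origin)
```

Each key of `ethanol_origin` is a bit position, and its value lists the atoms and radii whose neighbourhoods hashed to that bit. A classifier trained on such vectors can therefore report important bits that map back to specific substructures, subject to hash collisions when the vector is short.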

Sirocchi C., Biancucci F., Suffian M., Donati M., Ferretti S., Bogliolo A., et al. (2024). Predicting metabolic responses in genetic disorders via structural representation in machine learning. Progress in Artificial Intelligence, 1, 1-14 [10.1007/s13748-024-00338-9].



Year: 2024
Journal: Progress in Artificial Intelligence
DOI: https://dx.doi.org/10.1007/s13748-024-00338-9
Citation: Sirocchi C., Biancucci F., Suffian M., Donati M., Ferretti S., Bogliolo A., et al. (2024). Predicting metabolic responses in genetic disorders via structural representation in machine learning. Progress in Artificial Intelligence, 1, 1-14 [10.1007/s13748-024-00338-9].
All authors: Sirocchi C.; Biancucci F.; Suffian M.; Donati M.; Ferretti S.; Bogliolo A.; Magnani M.; Menotta M.; Montagna S.
Publication type: 1.01 Journal article


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/994237
