Bovo, S., Bolner, M., Schiavo, G., Galimberti, G., Bertolini, F., Ribani, A., et al. (2025). Exploring the animal molecular phenome with machine learning algorithms: mining the plasma metabolome to describe differences between breeds.
Exploring the animal molecular phenome with machine learning algorithms: mining the plasma metabolome to describe differences between breeds
S. Bovo; M. Bolner; G. Schiavo; G. Galimberti; F. Bertolini; A. Ribani; S. Dall’Olio; L. Fontanesi
2025
Abstract
Metabolomics is emerging as a novel tool for characterising the molecular phenome, providing insights into internal phenotypes that result from interactions between different molecular layers. The aim of this study was to identify metabolic differences between pig breeds by analysing plasma metabolome profiles of approximately 700 metabolites in over 1,000 pigs from two breeds: Italian Large White and Italian Duroc. After data quality control, we used a bioinformatics pipeline specifically designed to identify metabolites that are differentially abundant between the two breeds. This pipeline included two feature selection methods: the Boruta algorithm (a Random Forest wrapper) and sparse Partial Least Squares Discriminant Analysis (sPLS-DA). We compared the results obtained from these two algorithms to assess the stability of the selection of molecular features. In total, we identified 100 metabolites: 17 from sPLS-DA and 83 from Boruta. The difference in the number of selected features reflects the characteristics of the algorithms: sPLS-DA addresses the minimal-optimal problem (the smallest set of features sufficient to discriminate the breeds), whereas Boruta addresses the all-relevant problem (every feature that carries discriminating information). Furthermore, the tests we conducted showed that the selection performed by Boruta was more stable. From a biological perspective, the observed differences in these molecular phenotypes can be used to describe genetic differences between Italian Large White and Italian Duroc pigs.
Acknowledgements: This study has received funding from the European Union’s Horizon Europe research and innovation programme under the grant agreement No. 01059609 (Re-Livestock project).
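The abstract does not say how selection stability was tested. As an illustration only, one common way to quantify it is the mean pairwise Jaccard index between the feature sets selected in repeated runs (for example on resampled data). The sketch below assumes that measure; the function names and the metabolite labels are hypothetical and are not taken from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two feature sets (defined as 1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(selections):
    """Mean pairwise Jaccard index across repeated feature selection runs."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need at least two selection runs")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical metabolite labels selected in three resampled runs
# (illustrative data, not results from the study).
runs = [
    {"m1", "m2", "m3"},
    {"m1", "m2", "m4"},
    {"m1", "m2", "m3"},
]
stability = selection_stability(runs)
```

A value close to 1 means the same metabolites are selected in nearly every run, while a value close to 0 means the selected sets barely overlap. Computing this score separately for each algorithm gives one possible basis for comparing how stable their selections are.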


