Omics datasets, comprehensively characterizing biological samples at a molecular level, are continuously increasing in both complexity and dimensionality. In this scenario, there is a need for tools that improve data interpretability and expedite the extraction of relevant biochemical information. Here we introduce the subspace discriminant index (SDI) for multi-component models, which points to the most promising components for exploring pre-defined groups of observations and can also be used to compare several modeling variants in terms of discriminative power. The SDI is especially useful during the initial exploration of a dataset, in order to make informed decisions on, e.g., pre-processing or modeling variants for further analysis. The versatility and efficiency of the proposed index are demonstrated in two real-world omics case studies, including a highly complex multi-class problem. The code for the computation of the SDI is freely available in the Matlab MEDA toolbox and linked in the present manuscript. By boosting interpretation capabilities, the SDI represents a significant addition to the chemometric toolbox.

Sara Tortorella, Maurizio Servili, Tullia Gallina Toschi, Gabriele Cruciani, José Camacho (2020). Subspace discriminant index to expedite exploration of multi-class omics data. Chemometrics and Intelligent Laboratory Systems, 206, 1-9 [10.1016/j.chemolab.2020.104160].

Subspace discriminant index to expedite exploration of multi-class omics data

Tullia Gallina Toschi
2020

Sara Tortorella, Maurizio Servili, Tullia Gallina Toschi, Gabriele Cruciani, José Camacho
Files in this record:
File: SDI_Chemolab.pdf
Open access since 16/11/2022
Type: Preprint
License: Open Access License. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)
Size: 762.31 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/805581