Omics datasets, comprehensively characterizing biological samples at a molecular level, are continuously increasing in both complexity and dimensionality. In this scenario, there is a need for tools to improve data interpretability, expediting the process of extracting relevant biochemical information. Here we introduce the subspace discriminant index (SDI) for multi-component models, which points to the most promising components to explore pre-defined groups of observations, and can also be used to compare several modeling variants in terms of discriminative power. The SDI is especially useful during the initial exploration of a dataset, in order to make informed decisions on, e.g., pre-processing or modeling variants for further analysis. The versatility and efficiency of the proposed index are demonstrated in two real-world omics case studies, including a highly complex multi-class problem. The code for the computation of the SDI is freely available in the MATLAB MEDA toolbox and linked in the present manuscript. By boosting the interpretation capabilities, the SDI represents a significant addition to the chemometric toolbox.
Sara Tortorella, M.S. (2020). Subspace discriminant index to expedite exploration of multi-class omics data. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 206, 1-9 [10.1016/j.chemolab.2020.104160].
Subspace discriminant index to expedite exploration of multi-class omics data
Tullia Gallina Toschi
2020
File | Type | Access | License | Size | Format
---|---|---|---|---|---
SDI_Chemolab.pdf | Preprint | Open access since 16 November 2022 | Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND) | 762.31 kB | Adobe PDF