Abstract: One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables, a signature, for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by up/down regulation behavior, for which discriminant-based methods can perform with high accuracy and easy interpretability. To get the most out of these methods, feature selection is even more critical, but it is known to be an NP-hard problem, and thus most feature selection approaches focus on one feature at a time (k-best, Sequential Feature Selection, recursive feature elimination). We propose DNetPRO, Discriminant Analysis with Network PROcessing, a supervised network-based signature identification method. This method implements a network-based heuristic to generate one or more signatures out of the best-performing feature pairs. The algorithm is easily scalable, allowing efficient computation for a high number of observables ([Formula: see text]-[Formula: see text]). We show applications on real high-throughput genomic datasets in which our method outperforms existing results, or is compatible with them but with a smaller number of selected features. Moreover, the geometrical simplicity of the resulting class-separation surfaces allows a clearer interpretation of the obtained signatures in comparison to nonlinear classification models.
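The pair-based heuristic mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: a nearest-centroid rule stands in for the discriminant classifier, leave-one-out accuracy is used as the pair score, a hypothetical `top_k` cutoff selects the best pairs, and the connected components of the resulting feature network are read as candidate signatures.

```python
from itertools import combinations


def pair_accuracy(X, y, i, j):
    """Leave-one-out accuracy of a nearest-centroid rule on features (i, j).

    Assumption: nearest-centroid is only a stand-in for a discriminant
    classifier; the abstract does not specify which one is used.
    """
    n = len(X)
    hits = 0
    for t in range(n):
        sums, counts = {}, {}
        for s in range(n):
            if s == t:  # hold out sample t
                continue
            sx, sy = sums.get(y[s], (0.0, 0.0))
            sums[y[s]] = (sx + X[s][i], sy + X[s][j])
            counts[y[s]] = counts.get(y[s], 0) + 1

        def sq_dist(c):
            cx, cy = sums[c][0] / counts[c], sums[c][1] / counts[c]
            return (X[t][i] - cx) ** 2 + (X[t][j] - cy) ** 2

        hits += min(sums, key=sq_dist) == y[t]
    return hits / n


def signatures(X, y, top_k):
    """Keep the top_k best-scoring pairs as network edges and return the
    connected components as candidate signatures (hypothetical cutoff)."""
    n_features = len(X[0])
    scored = sorted(
        ((pair_accuracy(X, y, i, j), i, j)
         for i, j in combinations(range(n_features), 2)),
        key=lambda item: -item[0],  # best pairs first; ties keep pair order
    )
    parent = {}

    def find(a):  # union-find with path halving
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for _, i, j in scored[:top_k]:
        parent[find(i)] = find(j)

    groups = {}
    for f in list(parent):
        groups.setdefault(find(f), []).append(f)
    return sorted(sorted(g) for g in groups.values())


# Toy data: features 0 and 1 separate the classes only jointly;
# features 2 and 3 are noise.
X = [
    [0, 2, 1, 0], [1, 3, 0, 0], [2, 4, 1, 1],  # class 0
    [2, 0, 0, 1], [3, 1, 1, 0], [4, 2, 0, 1],  # class 1
]
y = [0, 0, 0, 1, 1, 1]
print(signatures(X, y, top_k=1))  # [[0, 1]]
```

In this toy example neither feature 0 nor feature 1 separates the classes alone, but the pair does, which is the situation that single-feature selection methods can miss. Scoring all pairs costs a number of classifier evaluations quadratic in the number of features, so a real implementation would need a much faster scorer than this pure-Python loop.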

A network approach for low dimensional signatures from high throughput data / Curti, Nico; Levi, Giuseppe; Giampieri, Enrico; Castellani, Gastone; Remondini, Daniel. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - ELECTRONIC. - 12:1(2022), pp. 22253.1-22253.9. [10.1038/s41598-022-25549-9]

A network approach for low dimensional signatures from high throughput data

Curti, Nico; Levi, Giuseppe; Giampieri, Enrico; Castellani, Gastone; Remondini, Daniel
2022

Files in this record:

File: s41598-022-25549-9.pdf
Access: open access
Description: PDF manuscript
Type: Publisher's version (PDF)
License: Open Access License. Creative Commons Attribution (CC BY)
Size: 2.81 MB
Format: Adobe PDF

File: 41598_2022_25549_MOESM1_ESM.pdf
Access: open access
Type: Supplementary file
License: Open Access License. Creative Commons Attribution (CC BY)
Size: 4.02 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/918739