Statistical evaluation of diagnostic tests, and, more generally, of biomarkers, is a constantly developing field, in which the complexity of the assessment increases with the complexity of the design under which data are collected. One particularly prevalent type of data is clustered data, where individual units are naturally nested into clusters. In these cases, bias can arise when cluster-level effects and/or individual covariates are omitted from the evaluation process. Focussing on the three-class case and on continuous-valued diagnostic tests, we investigate how to exploit the clustered structure of data within a linear mixed model approach, both when the assumption of normality holds and when it does not. We provide a method for estimation of covariate-specific ROC surfaces and discuss methods for the choice of optimal thresholds, proposing three possible estimators. A proof of consistency and asymptotic normality of the proposed threshold estimators is given. All considered methods are evaluated by extensive simulation experiments. As an application, we study the use of Lysosomal Associated Membrane Protein Family Member 5 (Lamp5) gene expression as a biomarker to distinguish among three types of glutamatergic neurons.

TO, D.K., ADIMARI, G., CHIOGNA, M., RISSO, D. (2022). ROC estimation and threshold selection criteria in three-class classification problems for clustered data. STATISTICAL METHODS IN MEDICAL RESEARCH, 31(7), 1325-1341 [10.1177/09622802221089029].

ROC estimation and threshold selection criteria in three-class classification problems for clustered data

CHIOGNA, MONICA
2022

TO, DUC KHANH; ADIMARI, GIANFRANCO; CHIOGNA, MONICA; RISSO, DAVIDE
Files in this record:
File: 11585_877960.pdf
Access: open access
Description: Post-print with cover page
Type: Postprint
License: Open Access License. Creative Commons Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)
Size: 1.61 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/877960