Statistical evaluation of diagnostic tests, and, more generally, of biomarkers, is a constantly developing field, in which complexity of the assessment increases with complexity of the design under which data are collected. One particularly prevalent type of data is clustered data, where individual units are naturally nested into clusters. In these cases, bias can arise from omission, in the evaluation process, of cluster-level effects and/or individual covariates. Focussing on the three-class case and for continuous-valued diagnostic tests, we investigate how to exploit the clustered structure of data within a linear-mixed model approach, both when the assumption of normality holds and when it does not. We provide a method for estimation of covariate-specific ROC surfaces and discuss methods for the choice of optimal thresholds, proposing three possible estimators. A proof of consistency and asymptotic normality of the proposed threshold estimators is given. All considered methods are evaluated by extensive simulation experiments. As an application, we study the use of the Lysosomal Associated Membrane Protein Family Member 5 (Lamp5) gene expression as biomarker to distinguish among three types of glutamatergic neurons.

ROC estimation and threshold selection criteria in three-class classification problems for clustered data

CHIOGNA, MONICA;
2022

Abstract

Statistical evaluation of diagnostic tests, and, more generally, of biomarkers, is a constantly developing field, in which complexity of the assessment increases with complexity of the design under which data are collected. One particularly prevalent type of data is clustered data, where individual units are naturally nested into clusters. In these cases, bias can arise from omission, in the evaluation process, of cluster-level effects and/or individual covariates. Focussing on the three-class case and for continuous-valued diagnostic tests, we investigate how to exploit the clustered structure of data within a linear-mixed model approach, both when the assumption of normality holds and when it does not. We provide a method for estimation of covariate-specific ROC surfaces and discuss methods for the choice of optimal thresholds, proposing three possible estimators. A proof of consistency and asymptotic normality of the proposed threshold estimators is given. All considered methods are evaluated by extensive simulation experiments. As an application, we study the use of the Lysosomal Associated Membrane Protein Family Member 5 (Lamp5) gene expression as biomarker to distinguish among three types of glutamatergic neurons.
STATISTICAL METHODS IN MEDICAL RESEARCH
TO, DUC KHANH; ADIMARI, GIANFRANCO; CHIOGNA, MONICA; RISSO DAVIDE
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11585/877960
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact