
A Prediction Divergence Criterion for Model Selection

Stephane Guerrier; Maria-Pia Victoria Feser
2015

Abstract

The problem of model selection is inevitable in an increasingly large number of applications involving partial theoretical knowledge and vast amounts of information, such as medicine, biology, or economics. The associated techniques are intended to determine which variables are "important" to "explain" a phenomenon under investigation. The terms "important" and "explain" can have very different meanings according to the context and, in fact, model selection can be applied to any situation where one tries to balance variability with complexity. In this paper, we introduce a new class of error measures and of model selection criteria, to which many well-known selection criteria belong. Moreover, this class enables us to derive a novel criterion, based on a divergence measure between the predictions produced by two nested models, called the Prediction Divergence Criterion (PDC). Our selection procedure is developed for linear regression models but has the potential to be extended to other models. We demonstrate that, under some regularity conditions, it is asymptotically loss efficient and can also be consistent. In the linear case, the PDC is a counterpart to Mallows' Cp but with a lower asymptotic probability of overfitting. In a case study and by means of simulations, the PDC is shown to be particularly well suited to "sparse" settings with correlated covariates, which we believe to be common in real applications.
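The abstract names the PDC but does not give its formula. As a rough illustration of the underlying idea (scoring a candidate model by a divergence between the predictions of two nested linear models, alongside a Cp-style criterion), here is a minimal Python sketch. The squared-difference divergence between fitted values, the simulated sparse design, and all variable names are illustrative assumptions, not the paper's actual PDC definition or its estimator.

import numpy as np

# Simulated "sparse" setting: 5 candidate covariates, only the first 2 active.
rng = np.random.default_rng(0)
n, p_full = 100, 5
X = rng.normal(size=(n, p_full))
beta = np.array([2.0, -1.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.normal(size=n)

def ols_fitted(X, y):
    """Fitted values of an OLS regression with an intercept."""
    Xd = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return Xd @ coef

# Residual variance estimate from the full model, as used by Mallows' Cp.
yhat_full = ols_fitted(X, y)
sigma2_hat = np.sum((y - yhat_full) ** 2) / (n - p_full - 1)

for p in range(1, p_full):
    yhat_small = ols_fitted(X[:, :p], y)      # nested model with p covariates
    yhat_big = ols_fitted(X[:, :p + 1], y)    # model with one extra covariate
    # Stand-in "prediction divergence" between the two nested models' fits
    # (the paper's actual divergence measure may differ):
    divergence = np.mean((yhat_big - yhat_small) ** 2)
    # Mallows' Cp for the smaller model (p covariates plus intercept):
    rss = np.sum((y - yhat_small) ** 2)
    cp = rss / sigma2_hat - n + 2 * (p + 1)
    print(f"p = {p}: divergence = {divergence:.4f}, Cp = {cp:.2f}")

In this toy setup the divergence drops sharply once the active covariates are included, mirroring how a divergence between nested predictions can signal when additional variables stop contributing.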
Citation: A Prediction Divergence Criterion for Model Selection / Stephane Guerrier; Maria-Pia Victoria Feser. - ELETTRONICO. - (2015).

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/956906