Martina Narcisi, Fedele Greco, Carlo Trivisano (2024). On the effect of confounding in linear regression models: an approach based on the theory of quadratic forms. Environmental and Ecological Statistics, 31, 433-461 [10.1007/s10651-024-00604-y].
On the effect of confounding in linear regression models: an approach based on the theory of quadratic forms
Martina Narcisi; Fedele Greco; Carlo Trivisano
2024
Abstract
In the last two decades, significant research efforts have been dedicated to addressing the issue of spatial confounding in linear regression models. Confounding occurs when the relationship between the covariate and the response variable is influenced by an unmeasured confounder associated with both. This results in biased estimators for the regression coefficients, reduced efficiency, and misleading interpretations. This article aims to understand how confounding relates to the parameters of the data generating process. The sampling properties of the regression coefficient estimator are derived as ratios of dependent quadratic forms in Gaussian random variables: this allows us to obtain exact expressions for the marginal bias and variance of the estimator, which were not obtained in previous studies. Moreover, we provide an approximate measure of the marginal bias that gives insight into the main determinants of bias. Applications in the framework of geostatistical and areal data modeling are presented. Particular attention is devoted to the difference between smoothness and variability of random vectors involved in the data generating process. Results indicate that marginal covariance between the covariate and the confounder, along with marginal variability of the covariate, play the most relevant role in determining the magnitude of confounding, as measured by the bias.

File | Type | License | Size | Format |
---|---|---|---|---|
s10651-024-00604-y.pdf (open access) | Publisher's version (PDF) | Open access license. Creative Commons Attribution (CC BY) | 1.97 MB | Adobe PDF |
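The bias mechanism summarized in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' derivation: it assumes independent (non-spatial) observations, a single covariate, a confounder with unit coefficient, and unit-variance noise. The function name `simulate_ols_bias` and all parameter names are hypothetical. The slope estimate is written explicitly as a ratio of two quadratic-type forms in Gaussian variables, and its average bias is compared with the simple ratio cov(x, z) / var(x).

```python
import numpy as np


def simulate_ols_bias(beta=1.0, rho=0.5, sigma_x=1.0, sigma_z=1.0,
                      n=200, n_rep=2000, seed=0):
    """Monte Carlo bias of the OLS slope when the confounder z is omitted.

    Assumed toy model (illustrative only, i.i.d. rather than spatial):
        y = beta * x + z + eps,  with (x, z) jointly Gaussian, corr(x, z) = rho.
    """
    rng = np.random.default_rng(seed)
    # Joint covariance of the covariate x and the unmeasured confounder z.
    cov = np.array([[sigma_x**2, rho * sigma_x * sigma_z],
                    [rho * sigma_x * sigma_z, sigma_z**2]])
    estimates = np.empty(n_rep)
    for r in range(n_rep):
        xz = rng.multivariate_normal(np.zeros(2), cov, size=n)
        x, z = xz[:, 0], xz[:, 1]
        y = beta * x + z + rng.normal(0.0, 1.0, size=n)
        # OLS slope with intercept: a ratio of two forms in Gaussian vectors.
        xc = x - x.mean()
        estimates[r] = (xc @ (y - y.mean())) / (xc @ xc)
    return estimates.mean() - beta


if __name__ == "__main__":
    rho, sigma_x, sigma_z = 0.5, 1.0, 1.0
    mc_bias = simulate_ols_bias(rho=rho, sigma_x=sigma_x, sigma_z=sigma_z)
    # In this toy setting the bias equals cov(x, z) / var(x).
    reference = rho * sigma_z / sigma_x
    print(f"Monte Carlo bias: {mc_bias:.3f}, cov(x,z)/var(x): {reference:.3f}")
```

In this simplified setting the simulated bias should track cov(x, z) / var(x), which is consistent with the abstract's statement that the covariance between covariate and confounder and the variability of the covariate drive the magnitude of confounding. The spatial case treated in the article involves structured covariance matrices and is not reproduced here.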
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.