In the context of Global Health, massive administrative datasets have become indispensable tools for health surveillance. However, the sheer scale of Big Data can mask systemic selection biases that standard mathematical adjustments may not fully mitigate. In this study, I propose a methodological audit of a recent large-scale cohort (N = 2,975,035) concerning COVID-19 vaccination and oncological outcomes. By benchmarking the cohort's architecture against national demographic and epidemiological gold standards through single-proportion Z-tests, we identified notable structural divergences. The first inferential test yielded a Z-score of -260.39 (p < 10-50), suggesting a structural under-sampling of the elderly population (32.2% deficit) relative to the reference population. The second test identified a statistically inconsistent cancer incidence deficit in the non-vaccinated control group (Z = -15.23, p < 10-50). These findings indicate that the reported statistical signals may emerge as a computational consequence of structural selection bias, where an artificially deflated baseline in the control group potentially inflates Hazard Ratios. Within a One Health approach, ensuring the structural integrity of data is crucial for effective prevention and control measures. We conclude that large-scale surveillance studies could be inferentially validated against demographic benchmarks to ensure that public health conclusions are grounded in baseline equivalence, thereby safeguarding the reliability of global health monitoring.

Roccetti, M. (2026). Enhancing public health surveillance: A statistical validation of potential sampling bias in large retrospective vaccine cohorts. AIMS PUBLIC HEALTH, 13(2), 589-597 [10.3934/publichealth.2026031].

Enhancing public health surveillance: A statistical validation of potential sampling bias in large retrospective vaccine cohorts

Roccetti, Marco
2026

Abstract

In the context of Global Health, massive administrative datasets have become indispensable tools for health surveillance. However, the sheer scale of Big Data can mask systemic selection biases that standard mathematical adjustments may not fully mitigate. In this study, I propose a methodological audit of a recent large-scale cohort (N = 2,975,035) concerning COVID-19 vaccination and oncological outcomes. By benchmarking the cohort's architecture against national demographic and epidemiological gold standards through single-proportion Z-tests, we identified notable structural divergences. The first inferential test yielded a Z-score of -260.39 (p < 10-50), suggesting a structural under-sampling of the elderly population (32.2% deficit) relative to the reference population. The second test identified a statistically inconsistent cancer incidence deficit in the non-vaccinated control group (Z = -15.23, p < 10-50). These findings indicate that the reported statistical signals may emerge as a computational consequence of structural selection bias, where an artificially deflated baseline in the control group potentially inflates Hazard Ratios. Within a One Health approach, ensuring the structural integrity of data is crucial for effective prevention and control measures. We conclude that large-scale surveillance studies could be inferentially validated against demographic benchmarks to ensure that public health conclusions are grounded in baseline equivalence, thereby safeguarding the reliability of global health monitoring.
2026
Roccetti, M. (2026). Enhancing public health surveillance: A statistical validation of potential sampling bias in large retrospective vaccine cohorts. AIMS PUBLIC HEALTH, 13(2), 589-597 [10.3934/publichealth.2026031].
Roccetti, Marco
File in questo prodotto:
Eventuali allegati, non sono esposti

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11585/1065571
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
  • OpenAlex ND
social impact