Roccetti, M. (2026). Enhancing public health surveillance: A statistical validation of potential sampling bias in large retrospective vaccine cohorts. AIMS PUBLIC HEALTH, 13(2), 589-597 [10.3934/publichealth.2026031].

Enhancing public health surveillance: A statistical validation of potential sampling bias in large retrospective vaccine cohorts

Roccetti, Marco

2026

Abstract

In the context of Global Health, massive administrative datasets have become indispensable tools for health surveillance. However, the sheer scale of Big Data can mask systemic selection biases that standard mathematical adjustments may not fully mitigate. In this study, we propose a methodological audit of a recent large-scale cohort (N = 2,975,035) concerning COVID-19 vaccination and oncological outcomes. By benchmarking the cohort's architecture against national demographic and epidemiological gold standards through single-proportion Z-tests, we identified notable structural divergences. The first inferential test yielded a Z-score of -260.39 (p < 10^-50), suggesting a structural under-sampling of the elderly population (32.2% deficit) relative to the reference population. The second test identified a cancer incidence deficit in the non-vaccinated control group that is statistically inconsistent with the epidemiological benchmark (Z = -15.23, p < 10^-50). These findings indicate that the reported statistical signals may emerge as a computational consequence of structural selection bias, where an artificially deflated baseline in the control group potentially inflates Hazard Ratios. Within a One Health approach, ensuring the structural integrity of data is crucial for effective prevention and control measures. We conclude that large-scale surveillance studies could be inferentially validated against demographic benchmarks to ensure that public health conclusions are grounded in baseline equivalence, thereby safeguarding the reliability of global health monitoring.
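
The single-proportion Z-test named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's own code: the function name `one_proportion_z_test` and the example counts are hypothetical, and the standard error is computed under the null hypothesis from the reference share `p0`, which is the usual textbook form of the test.

```python
import math


def one_proportion_z_test(successes, n, p0):
    """Two-sided single-proportion Z-test of an observed share against a reference share p0.

    Returns (z, p_value). The standard error uses p0 (the null value),
    so a negative z means the sample share lies below the benchmark.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    p_hat = successes / n                      # observed share in the cohort
    se = math.sqrt(p0 * (1.0 - p0) / n)        # standard error under the null
    z = (p_hat - p0) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical figures chosen only for illustration (not taken from the article):
# a sample of 10,000 people in which 2,000 belong to the group of interest,
# compared against a reference share of 25%.
z, p_value = one_proportion_z_test(successes=2_000, n=10_000, p0=0.25)
print(f"z = {z:.3f}, p = {p_value:.3e}")
```

With samples in the millions, even small differences in share produce very large |Z| values. For a Z-score as extreme as the -260.39 reported above, the double-precision p-value computed this way underflows to 0.0, which is one reason such results are usually reported as an upper bound rather than an exact value.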

Year: 2026
Journal: AIMS PUBLIC HEALTH
DOI: https://dx.doi.org/10.3934/publichealth.2026031
All authors: Roccetti, Marco
Appears in types: 1.01 Journal article

Files in this item:

File	Size	Format
10.3934_publichealth.2026031.pdf (open access; Type: Publisher's version (PDF) / Version of Record; License: Open Access License, Creative Commons Attribution (CC BY))	437.01 kB	Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1065571
