Loureiro B., Gerbelot C., Refinetti M., Sicuro G., Krzakala F. (2023). Fluctuations, bias, variance and ensemble of learners: exact asymptotics for convex losses in high-dimension. JOURNAL OF STATISTICAL MECHANICS: THEORY AND EXPERIMENT, 2023(11), 1-50 [10.1088/1742-5468/ad0221].

Fluctuations, bias, variance and ensemble of learners: exact asymptotics for convex losses in high-dimension

Loureiro B.; Gerbelot C.; Refinetti M.; Sicuro G.; Krzakala F.

2023

Abstract

From the sampling of data to the initialisation of parameters, randomness is ubiquitous in modern Machine Learning practice. Understanding the statistical fluctuations engendered by the different sources of randomness in prediction is therefore key to understanding robust generalisation. In this manuscript we develop a quantitative and rigorous theory for the study of fluctuations in an ensemble of generalised linear models trained on different, but correlated, features in high dimensions. In particular, we provide a complete description of the asymptotic joint distribution of the empirical risk minimiser for generic convex loss and regularisation in the high-dimensional limit. Our result encompasses a rich set of classification and regression tasks, such as the lazy regime of overparametrised neural networks, or equivalently the random features approximation of kernels. Besides allowing a direct study of the mitigating effect of ensembling (or bagging) on the bias-variance decomposition of the test error, our analysis also helps disentangle the contribution of statistical fluctuations and the singular role played by the interpolation threshold, which are at the root of the 'double-descent' phenomenon.

Year: 2023
Journal: JOURNAL OF STATISTICAL MECHANICS: THEORY AND EXPERIMENT
DOI: https://dx.doi.org/10.1088/1742-5468/ad0221
Citation: Loureiro B., Gerbelot C., Refinetti M., Sicuro G., Krzakala F. (2023). Fluctuations, bias, variance and ensemble of learners: exact asymptotics for convex losses in high-dimension. JOURNAL OF STATISTICAL MECHANICS: THEORY AND EXPERIMENT, 2023(11), 1-50 [10.1088/1742-5468/ad0221].
All authors: Loureiro B.; Gerbelot C.; Refinetti M.; Sicuro G.; Krzakala F.
Appears in types: 1.01 Journal article

Files in this item:

File	Access	Type	License	Size	Format
Loureiro_2023_J._Stat._Mech._2023_114001.pdf	Open access	Publisher's version (PDF) / Version Of Record	Open access license. Creative Commons Attribution (CC BY)	831.44 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/965236
