Generalisation error in learning with random features and the hidden manifold model

Gerace, Federica; Loureiro, Bruno; Krzakala, Florent; Mézard, Marc; Zdeborová, Lenka

2021

Abstract

We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden manifold model. Working in the high-dimensional regime and using the replica method from statistical physics, we provide a closed-form expression for the asymptotic generalisation performance in these problems, valid in both the under- and over-parametrised regimes and for a broad choice of generalised linear model loss functions. In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression with a peak at the interpolation threshold, we illustrate the superiority of orthogonal over random Gaussian projections in learning with random features, and we discuss the role played by correlations in the data generated by the hidden manifold model. Beyond the interest in these particular problems, the theoretical formalism introduced in this manuscript provides a path to further extensions to more complex tasks.

Year: 2021
Journal: Journal of Statistical Mechanics: Theory and Experiment
DOI: https://dx.doi.org/10.1088/1742-5468/ac3ae6
Citation: Gerace, F., Loureiro, B., Krzakala, F., Mézard, M., Zdeborová, L. (2021). Generalisation error in learning with random features and the hidden manifold model. Journal of Statistical Mechanics: Theory and Experiment, 2021(12), 1-40 [10.1088/1742-5468/ac3ae6].
All authors: Gerace, Federica; Loureiro, Bruno; Krzakala, Florent; Mézard, Marc; Zdeborová, Lenka
Appears in types: 1.01 Journal article

Files in this item:

File	Access	Type	License	Size	Format
2002.09339v2.pdf	Open access	Postprint	Free open-access license	1.34 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/969584
