While classical in many theoretical settings, and in particular in works inspired by statistical physics, the assumption of Gaussian i.i.d. input data is often perceived as a strong limitation in the context of statistics and machine learning. In this study, we redeem this line of work in the case of generalized linear classification, also known as the perceptron model, with random labels. We argue that there is a large universality class of high-dimensional input data for which we obtain the same minimum training loss as for Gaussian data with corresponding data covariance. In the limit of vanishing regularization, we further demonstrate that the training loss is independent of the data covariance. On the theoretical side, we prove this universality for an arbitrary mixture of homogeneous Gaussian clouds. Empirically, we show that the universality also holds for a broad range of real data sets.
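The universality claim can be probed numerically in a simplified setting. The sketch below is not the authors' experiment. It assumes the square loss with a ridge penalty, chosen only because its minimizer has a closed form, and it compares three hypothetical input ensembles: i.i.d. Gaussian, i.i.d. Rademacher (same identity covariance), and Gaussian with a non-identity diagonal covariance. The function names, sizes, covariance spectrum, and regularization strength are all illustrative assumptions.

```python
import numpy as np


def ridge_training_loss(X, Y, lam):
    """Mean squared training error at the ridge minimizer.

    X: (n, d) inputs. Y: (n, k), each column an independent random-label vector.
    The returned value is averaged over samples and over the k label draws.
    """
    n, d = X.shape
    # Closed-form minimizer of (1/n)||y - Xw||^2 + lam * ||w||^2, per column of Y.
    W = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ Y)
    return float(np.mean((Y - X @ W) ** 2))


def compare_losses(n=2000, d=500, k=20, lam=1e-6, seed=0):
    """Training loss on random labels for three input ensembles (all assumptions)."""
    rng = np.random.default_rng(seed)
    # Random +/-1 labels, drawn independently of the inputs.
    Y = rng.choice([-1.0, 1.0], size=(n, k))
    # Hypothetical non-identity covariance: diagonal, eigenvalues in [0.5, 2].
    scales = np.sqrt(np.linspace(0.5, 2.0, d))
    inputs = {
        "gaussian": rng.standard_normal((n, d)),
        "rademacher": rng.choice([-1.0, 1.0], size=(n, d)),
        "gaussian_anisotropic": rng.standard_normal((n, d)) * scales,
    }
    return {name: ridge_training_loss(X, Y, lam) for name, X in inputs.items()}


losses = compare_losses()
for name, value in losses.items():
    print(f"{name}: {value:.3f}")
```

For unregularized least squares with labels independent of a full-rank input matrix, the expected residual is 1 - d/n, so with these sizes and a very small penalty the three values should land near 0.75 and near each other. This only illustrates the square-loss case. The paper's statements concern generalized linear classification more broadly.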

Gerace, F., Krzakala, F., Loureiro, B., Stephan, L., Zdeborová, L. (2024). Gaussian universality of perceptrons with random labels. Physical Review E, 109(3), 1-18 [10.1103/physreve.109.034305].

Gaussian universality of perceptrons with random labels

Gerace, Federica; Krzakala, Florent; Loureiro, Bruno; Stephan, Ludovic; Zdeborová, Lenka
Files in this item:
File: PhysRevE.109.034305.pdf
Access: open access
Type: Publisher's version (PDF)
License: Free open-access license
Size: 2.25 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/969585