Federated Learning (FL) has emerged as a key paradigm in machine learning, but its performance often deteriorates when client data are not independent and identically distributed (non-IID). Such heterogeneity frequently reflects geographic factors (for example, regional linguistic variations or localized traffic patterns), so that data are IID within a region but non-IID across regions. However, existing FL algorithms are typically evaluated by randomly splitting non-IID data across devices, disregarding the devices' spatial distribution. To address this gap, we introduce ProFed, a benchmark that simulates data splits with varying degrees of skewness across different regions. We incorporate several skewness methods from the literature and apply them to well-known datasets, including MNIST, FashionMNIST, Extended MNIST, CIFAR-10, CIFAR-100, and UTKFace. Our goal is to provide researchers with a standardized framework to evaluate FL algorithms more effectively and consistently against established baselines.
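
To make the "IID within regions, non-IID across regions" setting concrete, the following is a minimal sketch of one way such a split could be simulated. It is not ProFed's API: the function name `partition_by_region`, its parameters, and the choice of a Dirichlet-based label skew (one commonly used skewness method, picked here as an assumption) are all illustrative. Smaller values of `alpha` give more skewed class proportions across regions.

```python
import random
from collections import defaultdict


def partition_by_region(labels, n_regions, devices_per_region, alpha, seed=0):
    """Hypothetical helper: label-skewed split across regions, IID within each.

    Returns {region_id: [list of sample indices, one list per device]}.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # Across regions: divide each class according to a Dirichlet(alpha) draw,
    # sampled here as normalised Gamma variates.
    region_pools = [[] for _ in range(n_regions)]
    for cls in sorted(by_class):
        idx = by_class[cls]
        rng.shuffle(idx)
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_regions)]
        total = sum(draws)
        start, cum = 0, 0.0
        for region, g in enumerate(draws):
            cum += g
            # The last region takes whatever remains, so no sample is dropped.
            end = len(idx) if region == n_regions - 1 else int(cum / total * len(idx))
            region_pools[region].extend(idx[start:end])
            start = end

    # Within a region: shuffle and deal samples round-robin, so every device
    # in that region sees (approximately) the same class distribution.
    partition = {}
    for region, pool in enumerate(region_pools):
        rng.shuffle(pool)
        partition[region] = [pool[d::devices_per_region] for d in range(devices_per_region)]
    return partition


# Toy usage: 10 classes with 100 samples each, 3 regions of 4 devices.
labels = [c for c in range(10) for _ in range(100)]
parts = partition_by_region(labels, n_regions=3, devices_per_region=4, alpha=0.5)
```

Each region receives its own class mixture, while the devices inside a region share that mixture, which is the structure the abstract describes.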

Domini, D., Ingemann, C.O., Aguzzi, G., Esterle, L., Viroli, M. (2026). ProFed: A Benchmark for Proximity-Based Non-IID Federated Learning. Journal of Open Research Software, 14, 1-13 [10.5334/jors.624].

ProFed: A Benchmark for Proximity-Based Non-IID Federated Learning

Domini, Davide; Ingemann, Christian Otte; Aguzzi, Gianluca; Esterle, Lukas; Viroli, Mirko
2026


Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11585/1052410