Optimisation of federated learning settings under statistical heterogeneity variations

Basem Suleiman, Muhammad Johan Alibasa, Rizka Widyarini Purwanto, Lewis Jeffries, Ali Anaissi, Jacky Song

TL;DR

The paper examines how statistical heterogeneity affects federated learning (FL) performance and introduces a data-partitioning framework that simulates varying levels of IID data. It empirically evaluates four aggregators (FedAvg, FedProx, FedPer, and SCAFFOLD) on the MNIST, CIFAR-10, and Healthcare Obesity datasets, using Earth Mover’s Distance (EMD) to quantify heterogeneity. Key contributions include a reproducible approach to measuring the level of IID, dataset- and IID-specific recommendations for aggregators and hyperparameters, and a flow chart that guides researchers in optimising FL under heterogeneity. No single aggregator dominates across all settings; the guidelines recommend measuring the level of IID via EMD and tuning the number of active devices, local epochs, and batch size according to dataset characteristics, which supports reproducible FL benchmarking and deployment.

Abstract

Federated Learning (FL) enables local devices to collaboratively learn a shared predictive model by only periodically sharing model parameters with a central aggregator. However, FL can be disadvantaged by statistical heterogeneity, which arises from the diversity of each local device's data distribution and produces data with different levels of Independent and Identically Distributed (IID) behaviour. The problem becomes more complex when optimising different combinations of FL parameters and choosing the optimal aggregator. In this paper, we present an empirical analysis of different FL training parameters and aggregators over various levels of statistical heterogeneity on three datasets. We propose a systematic data partition strategy to simulate different levels of statistical heterogeneity and a metric to measure the level of IID. Additionally, we empirically identify the best FL model and key parameters for datasets of different characteristics. On the basis of these results, we present recommended guidelines for FL parameters and aggregators to optimise model performance under different levels of IID and with different datasets.
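
To illustrate what measuring the level of IID with EMD can look like, the sketch below scores each device by the distance between its label proportions and the pooled label proportions. It is only an assumption about the setup, not the paper's definition: it treats labels as categorical with a unit cost between any two distinct classes, in which case Earth Mover's Distance reduces to half the L1 distance between the two proportion vectors. The function names and the toy device data are hypothetical.

```python
from collections import Counter


def label_proportions(labels, num_classes):
    """Return the fraction of samples in each class, indexed 0..num_classes-1."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def emd_unit_cost(device_labels, pooled_labels, num_classes):
    """EMD between two label distributions under a 0/1 ground cost.

    Assumption: moving mass between any two distinct classes costs 1,
    so the EMD equals half the L1 distance between the proportions.
    0.0 means identical label distributions; 1.0 means disjoint ones.
    """
    p = label_proportions(device_labels, num_classes)
    q = label_proportions(pooled_labels, num_classes)
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical toy partition: three devices, three classes.
devices = {
    "mixed": [0, 1, 2, 0, 1, 2],
    "single_class": [0, 0, 0, 0, 0, 0],
    "two_classes": [1, 1, 1, 2, 2, 2],
}
pooled = [label for labels in devices.values() for label in labels]
scores = {
    name: emd_unit_cost(labels, pooled, num_classes=3)
    for name, labels in devices.items()
}
```

A lower score means the device's labels look more like the pooled data. Averaging the scores over devices gives one possible summary of how far a partition is from IID.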

Paper Structure (32 sections, 17 figures, 13 tables)

This paper contains 32 sections, 17 figures, 13 tables.

Figures (17)

  • Figure 1: Label distribution in the Healthcare Obesity dataset
  • Figure 2: Examples of high-IID and low-IID label distribution allocations
  • Figure 3: Examples of low and high variance data quantity distribution allocations
  • Figure 4: EMD analysis using MNIST
  • Figure 5: EMD analysis using Healthcare Obesity dataset
  • ...and 12 more figures