Accelerating Heterogeneous Federated Learning with Closed-form Classifiers

Eros Fanì, Raffaello Camoriano, Barbara Caputo, Marco Ciccone

TL;DR

This work targets the slow convergence and instability of federated learning under strong statistical heterogeneity by introducing Fed3R, a closed-form Ridge Regression (RR) classifier trained on pre-trained features. Fed3R and its kernelized variant Fed3R-RF enable exact aggregation across clients with one-shot communication per client, making them immune to data heterogeneity and significantly reducing communication and computation costs. The authors extend Fed3R with a fine-tuning stage (Fed3R+FT), calibrating the softmax temperature to align loss landscapes, and demonstrate that fixing the classifier while fine-tuning features stabilizes training and improves feature quality. Across Landmarks, iNaturalist, and CIFAR-100, Fed3R variants converge faster and with lower resource usage than gradient-based baselines, and RR-based evaluation provides insight into the quality of feature extractors and the impact of initialization on downstream performance.

Abstract

Federated Learning (FL) methods often struggle in highly statistically heterogeneous settings. Indeed, non-IID data distributions cause client drift and biased local solutions, particularly pronounced in the final classification layer, negatively impacting convergence speed and accuracy. To address this issue, we introduce Federated Recursive Ridge Regression (Fed3R). Our method fits a Ridge Regression classifier computed in closed form leveraging pre-trained features. Fed3R is immune to statistical heterogeneity and is invariant to the sampling order of the clients. Therefore, it proves particularly effective in cross-device scenarios. Furthermore, it is fast and efficient in terms of communication and computation costs, requiring up to two orders of magnitude fewer resources than the competitors. Finally, we propose to leverage the Fed3R parameters as an initialization for a softmax classifier and subsequently fine-tune the model using any FL algorithm (Fed3R with Fine-Tuning, Fed3R+FT). Our findings also indicate that maintaining a fixed classifier aids in stabilizing the training and learning more discriminative features in cross-device settings. Official website: https://fed-3r.github.io/.
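The immunity to heterogeneity and to client sampling order can be illustrated with the textbook closed-form ridge regression solution, which depends on the data only through two sums over samples. The following is a minimal NumPy sketch, not the authors' implementation: it assumes each client shares the statistics `Z.T @ Z` and `Z.T @ Y` computed on frozen features `Z` and one-hot labels `Y`, and the helper names `client_stats` and `aggregate_and_solve`, the regularizer `lam`, and the synthetic data are all illustrative assumptions. The paper's algorithm may differ in details such as normalization or recursive updates.

```python
import numpy as np

def client_stats(features, labels, num_classes):
    """Per-client sufficient statistics for ridge regression (hypothetical helper)."""
    onehot = np.eye(num_classes)[labels]                  # (n, C) one-hot targets
    return features.T @ features, features.T @ onehot     # (d, d) and (d, C)

def aggregate_and_solve(stats, lam=0.01):
    """Sum the client statistics and solve the regularized normal equations."""
    A = sum(a for a, _ in stats)                          # sum of Z_k^T Z_k
    b = sum(b for _, b in stats)                          # sum of Z_k^T Y_k
    d = A.shape[0]
    return np.linalg.solve(A + lam * np.eye(d), b)        # (d, C) classifier weights

# Synthetic stand-in for pre-trained features (assumption: d=8, C=3, n=60).
rng = np.random.default_rng(0)
d, C, n = 8, 3, 60
Z = rng.normal(size=(n, d))
y = rng.integers(0, C, size=n)

# Strongly non-IID split: sort by label so each client sees few classes.
splits = np.array_split(np.argsort(y), 4)
stats = [client_stats(Z[idx], y[idx], C) for idx in splits]

W_fed = aggregate_and_solve(stats)                        # clients in one order
W_fed_reversed = aggregate_and_solve(stats[::-1])         # clients in reverse order
W_central = aggregate_and_solve([client_stats(Z, y, C)])  # all data in one place
```

Because addition is commutative and the sums over clients equal the sums over the pooled dataset, the three weight matrices agree up to floating-point error, regardless of how skewed the per-client label distributions are.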

Paper Structure (41 sections, 6 equations, 13 figures, 7 tables, 1 algorithm)

This paper contains 41 sections, 6 equations, 13 figures, 7 tables, 1 algorithm.

Figures (13)

  • Figure 1: Fed3R and Fed3R-RF invariance to different iNaturalist splits. All the curves converge to the same values, showing how both methods are immune to statistical heterogeneity.
  • Figure 2: Comparison between Fed3R and the baselines. From left to right: accuracy vs rounds, accuracy vs communication budget, accuracy vs average computation per client. Top row: Landmarks results, Bottom row: iNaturalist results. Fed3R shows clear advantages regarding convergence speed, communication, and computation budget required.
  • Figure 3: Accuracy vs rounds on the iNaturalist dataset with three different participation rates (indicated in the legend by $x$ cl/r, where cl/r stands for sampled clients per round) and two sampling strategies (without replacement for Fed3R and with replacement for FedAvg-LP, unless otherwise specified).
  • Figure 4: Comparison between Fed3R+FT and the baselines, Landmarks dataset. At the convergence point of Fed3R, we replace the parameters of the softmax classifier with those of the Fed3R classifier and then use another algorithm for fine-tuning.
  • Figure 5: Comparison between Fed3R+FT in all its variants and the baselines, iNaturalist dataset.
  • ...and 8 more figures