Accelerating Heterogeneous Federated Learning with Closed-form Classifiers

Eros Fanì; Raffaello Camoriano; Barbara Caputo; Marco Ciccone

TL;DR

This work targets the slow convergence and instability of federated learning under strong statistical heterogeneity by introducing Fed3R, a closed-form Ridge Regression classifier trained on pre-trained features. Fed3R and its kernelized variant Fed3R-RF enable exact aggregation across clients with one-shot communication per client, achieving immunity to data heterogeneity and significantly reducing communication and computation costs. The authors extend Fed3R with a fine-tuning stage (Fed3R+FT), calibrating the softmax temperature to align loss landscapes, and demonstrate that fixing the classifier while fine-tuning features stabilizes training and improves feature quality. Across Landmarks, iNaturalist, and CIFAR-100, Fed3R variants converge faster and with lower resource usage than gradient-based baselines, and RR-based evaluation provides insight into feature extractors’ quality and the impact of initialization on downstream performance.

Abstract

Federated Learning (FL) methods often struggle in highly statistically heterogeneous settings. Indeed, non-IID data distributions cause client drift and biased local solutions, particularly pronounced in the final classification layer, negatively impacting convergence speed and accuracy. To address this issue, we introduce Federated Recursive Ridge Regression (Fed3R). Our method fits a Ridge Regression classifier computed in closed form leveraging pre-trained features. Fed3R is immune to statistical heterogeneity and is invariant to the sampling order of the clients. Therefore, it proves particularly effective in cross-device scenarios. Furthermore, it is fast and efficient in terms of communication and computation costs, requiring up to two orders of magnitude fewer resources than the competitors. Finally, we propose to leverage the Fed3R parameters as an initialization for a softmax classifier and subsequently fine-tune the model using any FL algorithm (Fed3R with Fine-Tuning, Fed3R+FT). Our findings also indicate that maintaining a fixed classifier aids in stabilizing the training and learning more discriminative features in cross-device settings. Official website: https://fed-3r.github.io/.
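The aggregation idea described above can be illustrated with a minimal NumPy sketch. It assumes the standard ridge regression sufficient statistics (a feature Gram matrix and a feature-label matrix) summed across clients; the function names, the regularization value `lam`, and the synthetic data are hypothetical and are not taken from the paper or its official code.

```python
import numpy as np

def client_statistics(features, labels, num_classes):
    """Local ridge regression statistics for one client (hypothetical helper)."""
    onehot = np.eye(num_classes)[labels]   # (n, C) one-hot targets
    A = features.T @ features              # (d, d) feature Gram matrix
    b = features.T @ onehot                # (d, C) feature-label matrix
    return A, b

def solve_ridge(A, b, lam):
    """Closed-form ridge solution W = (A + lam * I)^{-1} b."""
    return np.linalg.solve(A + lam * np.eye(A.shape[0]), b)

def federated_ridge(client_data, num_classes, lam=0.01):
    """Sum each client's statistics on the server, then solve once."""
    d = client_data[0][0].shape[1]
    A = np.zeros((d, d))
    b = np.zeros((d, num_classes))
    for feats, labs in client_data:        # each client is visited exactly once
        A_k, b_k = client_statistics(feats, labs, num_classes)
        A += A_k
        b += b_k
    return solve_ridge(A, b, lam)

# Synthetic stand-in for pre-trained features (assumption: 16-dim, 5 classes).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))
labels = rng.integers(0, 5, size=200)

# Label-skewed split: sort by label, then cut into 4 clients.
order = np.argsort(labels, kind="stable")
skewed = [(features[i], labels[i]) for i in np.array_split(order, 4)]

# IID split: random permutation, then cut into 4 clients.
shuffled = rng.permutation(200)
iid = [(features[i], labels[i]) for i in np.array_split(shuffled, 4)]

W_skewed = federated_ridge(skewed, num_classes=5)
W_iid = federated_ridge(iid, num_classes=5)
W_central = solve_ridge(*client_statistics(features, labels, 5), lam=0.01)
```

Because the server only sums per-client matrices, the label-skewed split, the IID split, and the centralized solve should agree up to floating-point error, which is the sense in which such a classifier does not depend on how data is partitioned or on the order in which clients are visited. Predictions for new features `Z` would be `np.argmax(Z @ W_skewed, axis=1)`.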

Paper Structure (41 sections, 6 equations, 13 figures, 7 tables, 1 algorithm)

Introduction
Contributions
Related Works
Statistical heterogeneity in FL.
Optimization-based methods for heterogeneous FL.
Classifier bias and destructive interference.
Background
FL Problem Formulation
Closed-form Ridge Regression (RR)
Handling Non-linear Input Spaces in RR
Method
Federated Recursive Ridge Regression (Fed3R)
Fed3R with Random Features (Fed3R-RF)
Fed3R and Fed3R-RF Properties
Immunity to statistical heterogeneity.
...and 26 more sections

Figures (13)

Figure 1: Fed3R and Fed3R-RF invariance to different iNaturalist splits. All the curves converge to the same values, showing how both methods are immune to statistical heterogeneity.
Figure 2: Comparison between Fed3R and the baselines. From left to right: accuracy vs rounds, accuracy vs communication budget, accuracy vs average computation per client. Top row: Landmarks results, Bottom row: iNaturalist results. Fed3R shows clear advantages regarding convergence speed, communication, and computation budget required.
Figure 3: Accuracy vs rounds on the iNaturalist dataset with three different participation rates (indicated in the legend by $x$ cl/r, where cl/r stands for sampled clients per round) and two sampling strategies (without replacement for Fed3R and with replacement for FedAvg-LP, unless otherwise specified).
Figure 4: Comparison between Fed3R+FT and the baselines, Landmarks dataset. At the convergence point of Fed3R, we initialize the softmax classifier with the parameters of the Fed3R classifier and then use another algorithm for fine-tuning.
Figure 5: Comparison between Fed3R+FT in all its variants and the baselines, iNaturalist dataset.
...and 8 more figures
