Accelerating Heterogeneous Federated Learning with Closed-form Classifiers
Eros Fanì, Raffaello Camoriano, Barbara Caputo, Marco Ciccone
TL;DR
This work targets the slow convergence and instability of federated learning under strong statistical heterogeneity by introducing Fed3R, a closed-form Ridge Regression (RR) classifier trained on pre-trained features. Fed3R and its kernelized variant Fed3R-RF enable exact aggregation across clients with one-shot communication per client, achieving immunity to data heterogeneity and significantly reducing communication and computation costs. The authors extend Fed3R with a fine-tuning stage (Fed3R+FT), calibrating the softmax temperature to align loss landscapes, and demonstrate that fixing the classifier while fine-tuning the features stabilizes training and improves feature quality. Across Landmarks, iNaturalist, and CIFAR-100, Fed3R variants converge faster and with lower resource usage than gradient-based baselines. RR-based evaluation also provides insight into the quality of feature extractors and the impact of initialization on downstream performance.
Abstract
Federated Learning (FL) methods often struggle in highly statistically heterogeneous settings. Indeed, non-IID data distributions cause client drift and biased local solutions, which are particularly pronounced in the final classification layer and negatively impact convergence speed and accuracy. To address this issue, we introduce Federated Recursive Ridge Regression (Fed3R). Our method fits a Ridge Regression classifier, computed in closed form, on pre-trained features. Fed3R is immune to statistical heterogeneity and is invariant to the sampling order of the clients. Therefore, it proves particularly effective in cross-device scenarios. Furthermore, it is fast and efficient in terms of communication and computation costs, requiring up to two orders of magnitude fewer resources than competing methods. Finally, we propose to leverage the Fed3R parameters as an initialization for a softmax classifier and subsequently fine-tune the model using any FL algorithm (Fed3R with Fine-Tuning, Fed3R+FT). Our findings also indicate that maintaining a fixed classifier aids in stabilizing the training and learning more discriminative features in cross-device settings. Official website: https://fed-3r.github.io/.
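
The sketch below gives a rough illustration of why a closed-form ridge regression classifier can be aggregated exactly and independently of client order. It assumes each client sends two sums computed on its frozen features: the feature Gram matrix and the feature-label product. This sufficient-statistics formulation, the helper names (`client_statistics`, `solve_ridge`), the synthetic single-class clients, and the regularization value are all assumptions made for illustration; this is not the official Fed3R implementation.

```python
import numpy as np


def client_statistics(features, labels, num_classes):
    """Hypothetical helper: per-client sums that a ridge regression solver needs."""
    onehot = np.eye(num_classes)[labels]          # (n, C) one-hot targets
    return features.T @ features, features.T @ onehot


def solve_ridge(gram, cross, lam):
    """Closed-form ridge solution W = (G + lam * I)^{-1} B."""
    dim = gram.shape[0]
    return np.linalg.solve(gram + lam * np.eye(dim), cross)


def aggregate(clients, dim, num_classes, lam):
    """Sum the client statistics in the given order, then solve once."""
    gram = np.zeros((dim, dim))
    cross = np.zeros((dim, num_classes))
    for feats, labels in clients:
        g_k, b_k = client_statistics(feats, labels, num_classes)
        gram += g_k
        cross += b_k
    return solve_ridge(gram, cross, lam)


rng = np.random.default_rng(0)
num_classes, dim, lam = 4, 8, 0.01  # illustrative values, not from the paper

# Extreme heterogeneity (assumed setup): each client holds a single class.
clients = []
for c in range(num_classes):
    n = 20 + 5 * c
    feats = rng.normal(loc=c, size=(n, dim))      # stand-in for pre-trained features
    clients.append((feats, np.full(n, c)))

W_fed = aggregate(clients, dim, num_classes, lam)
W_rev = aggregate(clients[::-1], dim, num_classes, lam)  # different client order

# Centralized reference: pool all data and solve the same problem.
all_feats = np.concatenate([f for f, _ in clients])
all_labels = np.concatenate([y for _, y in clients])
W_central = solve_ridge(*client_statistics(all_feats, all_labels, num_classes), lam)

print(np.allclose(W_fed, W_central), np.allclose(W_fed, W_rev))
```

Because the statistics are plain sums, the aggregated solution matches the pooled-data solution up to floating-point rounding, regardless of how the classes are split across clients or the order in which clients are visited.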
