BAFFLE: A Baseline of Backpropagation-Free Federated Learning
Haozhe Feng, Tianyu Pang, Chao Du, Wei Chen, Shuicheng Yan, Min Lin
TL;DR
The paper tackles the practical bottleneck of backpropagation in federated learning on edge devices by introducing BAFFLE, a backpropagation-free FL framework that estimates gradients with zeroth-order methods using only forward passes. BAFFLE relies on Gaussian perturbations of the global model and Stein's identity to obtain an unbiased gradient surrogate; each client communicates a vector of loss differences, which is securely aggregated across clients. The authors provide convergence guarantees showing that the estimator is unbiased with a rate of $\mathcal{O}(\sqrt{n/K})$, where $n$ is the number of model parameters and $K$ is the number of perturbations, and demonstrate empirical viability on MNIST, CIFAR-10/100, and OfficeHome, along with memory and bandwidth efficiency, compatibility with trusted execution environments (TEEs), and robustness considerations. While BAFFLE incurs some accuracy trade-offs, especially under severe non-IID distributions, it offers a promising path for privacy-preserving, resource-constrained FL where backpropagation is impractical.
Abstract
Federated learning (FL) is a general principle for decentralized clients to train a server model collectively without sharing local data. FL is a promising framework with practical applications, but its standard training paradigm requires the clients to backpropagate through the model to compute gradients. Since these clients are typically edge devices and not fully trusted, executing backpropagation on them incurs computational and storage overhead as well as white-box vulnerability. In light of this, we develop backpropagation-free federated learning, dubbed BAFFLE, in which backpropagation is replaced by multiple forward processes to estimate gradients. BAFFLE is 1) memory-efficient and easily fits within upload bandwidth; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments, because the clients in BAFFLE only execute forward propagation and return a set of scalars to the server. Empirically, we use BAFFLE to train deep models from scratch or to finetune pretrained models, achieving acceptable results. Code is available at https://github.com/FengHZ/BAFFLE.
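To make the forward-only idea concrete, the following is a minimal NumPy sketch of a Stein-identity gradient estimate built from loss differences. It is not the authors' implementation: the function names, the shared-seed scheme for regenerating perturbations on the server, the least-squares toy loss, and the plain weighted average (standing in for secure aggregation) are all assumptions made for illustration.

```python
import numpy as np

def sample_perturbations(seed, K, n, sigma):
    # Hypothetical helper: K Gaussian perturbations delta_k ~ N(0, sigma^2 I),
    # reproducible from a seed so server and clients can share them cheaply.
    rng = np.random.default_rng(seed)
    return sigma * rng.standard_normal((K, n))

def client_loss_differences(loss_fn, w, data, seed, K, sigma):
    # Client side: forward passes only. Returns K scalars
    # L(w + delta_k) - L(w), one per perturbation.
    deltas = sample_perturbations(seed, K, w.size, sigma)
    base = loss_fn(w, data)
    return np.array([loss_fn(w + d, data) - base for d in deltas])

def server_gradient_estimate(loss_diffs, weights, seed, K, n, sigma):
    # Server side: average the clients' loss-difference vectors (a plain
    # weighted mean here, standing in for secure aggregation), then form
    # (1 / (K sigma^2)) * sum_k delta_k * aggregated_diff_k.
    deltas = sample_perturbations(seed, K, n, sigma)
    agg = np.average(np.stack(loss_diffs), axis=0, weights=weights)
    return deltas.T @ agg / (K * sigma**2)

def least_squares_loss(w, data):
    # Toy loss used only for this demonstration.
    X, y = data
    return 0.5 * np.mean((X @ w - y) ** 2)

# Toy setup: 3 clients with different amounts of local data.
rng = np.random.default_rng(0)
n, K, sigma, seed = 5, 20000, 1e-3, 123
w = rng.standard_normal(n)
clients = []
for m in (40, 60, 100):
    X = rng.standard_normal((m, n))
    y = rng.standard_normal(m)
    clients.append((X, y))
sizes = [len(y) for _, y in clients]

diffs = [client_loss_differences(least_squares_loss, w, c, seed, K, sigma)
         for c in clients]
g_est = server_gradient_estimate(diffs, sizes, seed, K, n, sigma)

# Reference gradient via the closed form, for comparison only.
g_true = np.average(
    np.stack([X.T @ (X @ w - y) / len(y) for X, y in clients]),
    axis=0, weights=sizes)
rel_err = np.linalg.norm(g_est - g_true) / np.linalg.norm(g_true)
print(f"relative error of the estimate: {rel_err:.4f}")
```

In this sketch each client uploads only K scalars regardless of the model size n, which is the source of the bandwidth saving described above. The estimate becomes noisier as n grows relative to K, which is in line with the dependence on n and K described in the TL;DR.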
