Variational Bayes for Federated Continual Learning

Dezhong Yao; Sanmu Li; Yutong Dai; Zhiqiang Xu; Shengshan Hu; Peilin Zhao; Lichao Sun

TL;DR

This work addresses Federated Continual Learning under non-stationary, privacy-constrained data by introducing FedBNN, a Federated Bayesian Neural Network trained via variational inference. It integrates historical and local distributions through history-aware local inference, local likelihood extraction, and global aggregation to form a coherent global posterior without requiring explicit task boundaries. A Prototype Library handles dynamic label spaces, while SNN-based initialization mitigates prior-driven forgetting during distribution drift. Experimental results across class- and task-incremental and gradual FCL settings show FedBNN achieves state-of-the-art forgetting mitigation and competitive adaptation, with uncertainty estimates enabling safer predictions in real-world deployments.
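The "local likelihood extraction" and "global aggregation" steps mentioned above can be illustrated with a small sketch. It assumes mean-field (diagonal) Gaussian posteriors and uses the standard identity that products and ratios of Gaussians become sums and differences of natural parameters. The function names and the exact update rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def to_natural(mean, var):
    """Convert a diagonal Gaussian (mean, variance) to natural parameters."""
    precision = 1.0 / np.asarray(var, dtype=float)
    return precision * np.asarray(mean, dtype=float), precision

def from_natural(eta, precision):
    """Convert natural parameters (eta, precision) back to (mean, variance)."""
    var = 1.0 / precision
    return eta * var, var

def extract_likelihood(local_post, prior):
    """Assumed 'likelihood extraction': local posterior divided by the prior.
    In natural parameters, a ratio of Gaussians is a subtraction."""
    eta_q, lam_q = to_natural(*local_post)
    eta_p, lam_p = to_natural(*prior)
    return eta_q - eta_p, lam_q - lam_p

def aggregate(prior, local_posts):
    """Assumed aggregation: prior times the product of client likelihood terms.
    Note: a precision difference can be negative if a client posterior is
    wider than the prior; a real system would need to guard against that."""
    eta, lam = to_natural(*prior)
    for post in local_posts:
        d_eta, d_lam = extract_likelihood(post, prior)
        eta, lam = eta + d_eta, lam + d_lam
    return from_natural(eta, lam)
```

For example, `aggregate((np.zeros(1), np.ones(1)), [client_a, client_b])` combines two client posteriors that were both trained from the same prior, counting the prior only once.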

Abstract

Federated continual learning (FCL) has received increasing attention for its potential to handle real-world streaming data, which is characterized by data distributions that evolve and client classes that vary over time. Storage limits and privacy concerns restrict local models to the data available in the current learning cycle. This restriction degrades model performance on previous data, a phenomenon termed "catastrophic forgetting". Existing FCL approaches, however, need to detect or be told when the data distribution changes, which is difficult in the real world. To relax these limitations, this paper considers a broader continual setting. Within this setting, we introduce the Federated Bayesian Neural Network (FedBNN), a versatile and effective framework that employs a variational Bayesian neural network across all clients. Our method continually integrates knowledge from local and historical data distributions into a single model, learning from new data distributions while retaining performance on historical ones. We evaluate FedBNN against prevalent federated learning and continual learning methods using various metrics. Experiments across diverse datasets demonstrate that FedBNN achieves state-of-the-art results in mitigating forgetting.
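One common way a variational BNN retains knowledge of historical distributions is to use the previous posterior as the prior for the next round, as in variational continual learning. The sketch below shows that objective for diagonal Gaussian posteriors. It is a generic illustration under that assumption, not the paper's exact loss, and the function names are hypothetical.

```python
import numpy as np

def kl_diag_gaussians(mean_q, var_q, mean_p, var_p):
    """Closed-form KL(q || p) for diagonal Gaussians, summed over all weights."""
    mean_q, var_q = np.asarray(mean_q, float), np.asarray(var_q, float)
    mean_p, var_p = np.asarray(mean_p, float), np.asarray(var_p, float)
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mean_q - mean_p) ** 2) / var_p - 1.0
    )

def continual_vi_loss(nll, mean_q, var_q, mean_prev, var_prev):
    """Negative ELBO where the previous posterior plays the role of the prior.
    `nll` is the expected negative log-likelihood on the current data."""
    return nll + kl_diag_gaussians(mean_q, var_q, mean_prev, var_prev)
```

The KL term penalizes moving away from weights that the previous posterior was confident about (small `var_prev`), which is what discourages forgetting in this kind of objective.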

Paper Structure (26 sections, 9 equations, 7 figures, 10 tables, 4 algorithms)

Introduction
Backgrounds
Related Work
Preliminary: BNN and VCL
Problem Formulation of FCL
Motivation: Real World FCL Scenarios
Problem Definition
The Proposed FedBNN Approach
Federated Bayesian Neural Network
Prototype Library for Dynamic Label Space
SNN Based Initialization
Discussion and Limitations
Experiments
Experimental Setup
Task-Separate Federated Continual Settings
...and 11 more sections

Figures (7)

Figure 1: In many real-world scenarios, the data distribution of a federated learning system evolves over time [gong2022ode]. Approaches that handle dynamic data distributions are therefore desirable for federated learning systems.
Figure 2: In real-world FCL applications, data evolution on clients can follow different patterns. The figure shows three typical cases of FCL data distribution; different colors indicate different data distributions.
Figure 3: The mechanism of prototype library in FedBNN. Before local training, the classifier layer is assembled according to classes of current local data. The classifier layer is appended to the shared model for training. After training, the prototypes in the classifier layer are used to update the local prototype library. Local prototype libraries are sent and aggregated by class on the server, then sent back to clients.
Figure 4: On small datasets, a BNN degrades more than an SNN after a task switch because of the influence of the BNN's prior.
Figure 5: Focus-On-Now (FON) accuracy curves under gradual distribution change settings. FON accuracy denotes the model's accuracy on the current data distribution.
...and 2 more figures
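The prototype library mechanism described in the Figure 3 caption can be sketched as follows. The caption only states that libraries are "aggregated by class"; the plain per-class average used here, the zero initialization for unseen classes, and all function names are assumptions made for illustration.

```python
import numpy as np

def aggregate_prototypes(client_libraries):
    """Server side: average prototypes class by class across clients.
    client_libraries: list of dicts mapping class label -> prototype vector.
    Averaging is an assumed aggregation rule."""
    sums, counts = {}, {}
    for library in client_libraries:
        for label, proto in library.items():
            proto = np.asarray(proto, dtype=float)
            sums[label] = sums.get(label, 0.0) + proto
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def assemble_classifier(library, local_classes, dim):
    """Client side: build the classifier layer for the classes present in the
    current local data. Classes missing from the library start at zero
    (an assumed initialization)."""
    return np.stack([library.get(c, np.zeros(dim)) for c in local_classes])

def update_library(library, classifier, local_classes):
    """After training, write the classifier rows back into the local library."""
    updated = dict(library)
    for row, label in zip(classifier, local_classes):
        updated[label] = row
    return updated
```

Because the classifier is assembled from whichever classes are present locally, a client never needs a fixed, globally agreed label space, which appears to be the point of the mechanism in a dynamic label setting.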
