Towards Efficient Replay in Federated Incremental Learning

Yichen Li; Qunwei Li; Haozhao Wang; Ruixuan Li; Wenliang Zhong; Guannan Zhang

TL;DR

This work tackles catastrophic forgetting in Federated Incremental Learning (FIL) under data heterogeneity and storage constraints by introducing Re-Fed, a framework that coordinates cross-client replay through a Personalized Informative Model (PIM). Re-Fed computes sample importance via gradient norms during PIM updates and selects high-importance past samples to cache under a memory budget $M$, enabling the local model to learn from both the new task and replay data. A balancing parameter $\lambda$ regulates the mix of local and global information in the PIM, with convergence guarantees provided for the personalized component. Empirically, Re-Fed outperforms state-of-the-art baselines by up to 19.73% in final accuracy across Class-Incremental and Domain-Incremental FIL tasks on multiple datasets, while maintaining privacy and communication efficiency.
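The selection step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the logistic-regression loss, the proximal-style update in which $\lambda$ weights the local gradient and $(1-\lambda)$ pulls the personalized model toward the global model, the plain summation of per-sample gradient norms over update steps, and all function names (`per_sample_grads`, `pim_importance`, `select_cache`) are hypothetical choices made for the example.

```python
import numpy as np

def per_sample_grads(v, X, y):
    """Per-sample logistic-loss gradients w.r.t. v, shape (n, d)."""
    p = 1.0 / (1.0 + np.exp(-(X @ v)))
    return (p - y)[:, None] * X

def pim_importance(w_global, X, y, lam=0.5, lr=0.1, steps=20):
    """Update a personalized model v, starting from the global model, and
    accumulate per-sample gradient norms as importance scores.
    The update rule below is an assumption made for this sketch."""
    v = w_global.copy()
    scores = np.zeros(len(X))
    for _ in range(steps):
        g = per_sample_grads(v, X, y)
        scores += np.linalg.norm(g, axis=1)
        # lam weights the local loss gradient; (1 - lam) pulls v toward w_global.
        v -= lr * (lam * g.mean(axis=0) + (1.0 - lam) * (v - w_global))
    return v, scores

def select_cache(scores, M):
    """Indices of the M highest-scoring samples (all samples if M >= n)."""
    M = min(M, len(scores))
    return np.argsort(-scores, kind="stable")[:M]

# Toy usage: 100 previous-task samples, memory budget M = 10.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] > 0).astype(float)
w = np.zeros(5)
v, scores = pim_importance(w, X, y, lam=0.5)
cache_idx = select_cache(scores, M=10)
```

In this sketch, setting `lam=0.0` leaves the personalized model equal to the global model, so the scores reflect only global information; larger values let local data shape the scores more strongly.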

Abstract

In Federated Learning (FL), the data on each client is typically assumed to be fixed or static. In real-world applications, however, data often arrives incrementally, and the data domain may grow dynamically. In this work, we study catastrophic forgetting under data heterogeneity in Federated Incremental Learning (FIL) scenarios where edge clients may lack the storage space to retain all of their data. We propose a simple, generic framework for FIL named Re-Fed, which coordinates each client to cache important samples for replay. More specifically, when a new task arrives, each client first caches selected previous samples based on their global and local importance. The client then trains the local model on both the cached samples and the samples from the new task. Theoretically, we analyze the ability of Re-Fed to discover important samples for replay, thereby alleviating catastrophic forgetting. Moreover, we empirically show that Re-Fed achieves competitive performance compared to state-of-the-art methods.
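The second step, training on cached and new-task samples together, can be illustrated as follows. This is a sketch under assumptions: the abstract does not specify how the two sets are mixed, so uniform joint shuffling is assumed here, and the helper names `build_training_set` and `minibatches` are hypothetical.

```python
import numpy as np

def build_training_set(X_new, y_new, X_cache, y_cache, seed=0):
    """Concatenate new-task and cached samples, then shuffle them jointly
    (uniform mixing is an assumption of this sketch)."""
    X = np.concatenate([X_new, X_cache], axis=0)
    y = np.concatenate([y_new, y_cache], axis=0)
    perm = np.random.default_rng(seed).permutation(len(X))
    return X[perm], y[perm]

def minibatches(X, y, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]

# Toy usage: 50 new-task samples plus a cache of 10 previous samples.
rng = np.random.default_rng(1)
X_new, y_new = rng.normal(size=(50, 5)), np.ones(50)
X_cache, y_cache = rng.normal(size=(10, 5)), np.zeros(10)
X_train, y_train = build_training_set(X_new, y_new, X_cache, y_cache)
batches = list(minibatches(X_train, y_train, batch_size=16))
```

Each mini-batch then goes to whatever local optimizer the client uses, so replayed samples contribute to the same loss as new-task samples.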

Paper Structure (19 sections, 1 theorem, 16 equations, 5 figures, 10 tables, 1 algorithm)

Introduction
Related Work
Methodology
Problem Formulation
Re-Fed: Framework for FIL
Experiments
Experiment Setup
Performance Overview
Conclusion and Future Work
Dataset
Baseline
Configurations
Detailed Re-Fed Framework with FedAvg
Experimental Results
Detailed Results of Test Accuracy.
...and 4 more sections

Key Result

Theorem 3.1

Assuming that the global model $w^t$ converges to the optimal model $\hat{w}$ at communication round $t$ by $g(t)$ as: $\mathbb{E}[||w^t-\hat{w}||^2] \leq g(t)$, $\lim_{t\rightarrow\infty}g(t)=0$ and $g(t+1)\le g(t)$, there exists a constant $C < \infty$ such that for any client $k \in [K]$ the pers

Figures (5)

Figure 1: Motivation for our method, illustrated with a three-client FIL scenario. When a new task arrives, each client must cache previous samples for replay under limited storage to alleviate catastrophic forgetting. Global caching denotes the set of samples cached by all clients collectively. A naive caching method may ignore how samples correlate across clients, which increases the statistical data heterogeneity of the global cache. A desirable method caches samples that both reflect the local data distribution and reduce the statistical data heterogeneity of the global cache.
Figure 2: Illustration of the Re-Fed framework. When a new task arrives, each client first updates the personalized informative model on previous local samples with the distributed global model. Then, samples are selected to be cached by the sample importance scores that are calculated with the updated personalized informative model. Finally, each client trains the local model with both the new task and cached previous samples.
Figure 3: Performance w.r.t. data heterogeneity $\alpha$ for three datasets.
Figure 4: Performance w.r.t. number of incremental tasks $n$ for three class-incremental datasets.
Figure 5: Performance of Re-Fed under different configurations: (a) local training epoch $E$, (b) sample size $B$ in the classifier, (c) client selection ratio $r$ of all clients on CIFAR100 with $\alpha = 5.0$.

Theorems & Definitions (1)

Theorem 3.1: Convergence of PIM
