Accurate Forgetting for Heterogeneous Federated Continual Learning
Abudukelimu Wuerkaixi, Sen Cui, Jingfeng Zhang, Kunda Yan, Bo Han, Gang Niu, Lei Fang, Changshui Zhang, Masashi Sugiyama
TL;DR
This paper tackles federated continual learning (FCL) under highly heterogeneous, potentially unrelated task streams by introducing accurate forgetting, a concept that embraces selective forgetting of biased past knowledge. The proposed AF-FCL method uses a global normalizing flow to perform feature-space generative replay, paired with knowledge distillation to stabilize representations and a correlation-based mechanism to weight past knowledge by its relevance to the current task. By quantifying feature credibility via latent-space densities, AF-FCL suppresses harmful memories and reuses beneficial past information, achieving superior accuracy and lower forgetting across diverse benchmarks, including EMNIST variants, CIFAR100, and ImageNet-Subset. The work demonstrates practical implications for robust, privacy-preserving collaborative learning when client data are non-IID and task distributions diverge, showing that targeted forgetting can surpass traditional memorization-focused approaches.
Abstract
Recent years have witnessed a burgeoning interest in federated learning (FL). However, the contexts in which clients engage in sequential learning remain under-explored. Bridging FL and continual learning (CL) gives rise to a challenging practical problem: federated continual learning (FCL). Existing research in FCL primarily focuses on mitigating the catastrophic forgetting issue of continual learning while collaborating with other clients. We argue that forgetting is not invariably detrimental. In this paper, we consider a more practical and challenging FCL setting characterized by potentially unrelated or even antagonistic data/tasks across different clients. In the FL scenario, statistical heterogeneity and data noise among clients may induce spurious correlations, which result in biased feature learning. While existing CL strategies focus on complete utilization of previous knowledge, we find that forgetting biased information is beneficial. Therefore, we propose a new concept, accurate forgetting (AF), and develop a novel generative-replay method, AF-FCL, which selectively utilizes previous knowledge in federated networks. We employ a probabilistic framework based on a normalizing flow model to quantify the credibility of previous knowledge. Comprehensive experiments affirm the superiority of our method over baselines.
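To make the idea of density-based credibility concrete, the following is a minimal sketch and not the AF-FCL implementation. It assumes a diagonal affine flow (the simplest possible normalizing flow) as a stand-in for the global flow model, and an illustrative weighting rule that scales each replayed feature by its density relative to the most likely one. All function names here are hypothetical.

```python
import numpy as np

def fit_affine_flow(features):
    """Fit a diagonal affine flow z = (x - mu) / sigma to reference features.
    Assumption: a stand-in for a learned normalizing flow."""
    mu = features.mean(axis=0)
    sigma = features.std(axis=0) + 1e-6  # avoid division by zero
    return mu, sigma

def log_density(features, mu, sigma):
    """Log-density under the flow, using a standard-normal base distribution
    and the change-of-variables formula."""
    z = (features - mu) / sigma
    dim = z.shape[1]
    log_base = -0.5 * (z ** 2).sum(axis=1) - 0.5 * dim * np.log(2 * np.pi)
    log_det = -np.log(sigma).sum()  # log |det dz/dx| for the affine map
    return log_base + log_det

def credibility_weights(log_p):
    """Illustrative rule: weight each feature by its density relative to the
    most likely feature, so the largest weight is exactly 1."""
    return np.exp(log_p - log_p.max())

def weighted_replay_loss(per_sample_loss, weights):
    """Weighted average of per-sample replay losses."""
    return float((weights * per_sample_loss).sum() / weights.sum())

# Usage with synthetic features: five replayed features close to the
# current-task distribution and five far from it.
rng = np.random.default_rng(0)
current = rng.normal(0.0, 1.0, size=(500, 4))
replayed = np.vstack([
    rng.normal(0.0, 1.0, size=(5, 4)),  # consistent with the current task
    rng.normal(6.0, 1.0, size=(5, 4)),  # inconsistent, likely harmful
])
mu, sigma = fit_affine_flow(current)
weights = credibility_weights(log_density(replayed, mu, sigma))
```

In this toy setup the features drawn far from the current-task distribution receive weights close to zero, so they contribute little to the replay loss, which mirrors the intent of suppressing biased memories while reusing relevant ones.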
