
Task-Agnostic Federated Continual Learning via Replay-Free Gradient Projection

Seohyeon Cha, Huancheng Chen, Haris Vikalo

TL;DR

FedProTIP tackles federated continual learning (FCL) under task-agnostic inference by combining gradient projection with a memory of subspace bases and a lightweight task-identity predictor. Each client runs projected gradient descent (PGD) so that its updates do not overwrite past-task features, extracts compact local core bases via randomized SVD, and contributes them to a global feature subspace that guides future updates. At inference, the model uses subspace relevance to predict the task of each test input and route outputs, which enables dynamic head selection without replay, generative models, or labeled task IDs. Across CIFAR100, DomainNet, and ImageNet-R, FedProTIP achieves higher average accuracy and less forgetting than state-of-the-art FCL methods while markedly improving communication and computation efficiency, making it practical for privacy-conscious, heterogeneous FL deployments.
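The core-basis extraction step mentioned above can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes an energy-based rule for choosing how many singular vectors to keep (the function name `extract_core_bases` and the `threshold` parameter are hypothetical), and it swaps in an exact SVD via `numpy.linalg.svd` where the summary says randomized SVD, which would approximate the same leading singular vectors at lower cost.

```python
import numpy as np

def extract_core_bases(features: np.ndarray, threshold: float = 0.95) -> np.ndarray:
    """Return orthonormal columns spanning the dominant subspace of `features`.

    features:  (d, n) matrix whose columns are n feature vectors of dimension d.
    threshold: fraction of the squared Frobenius norm the kept bases must capture
               (an assumed selection rule, used here for illustration).
    """
    # Exact thin SVD; a randomized SVD would approximate the same leading factors.
    u, s, _ = np.linalg.svd(features, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    # Smallest k whose cumulative energy reaches the threshold.
    k = min(int(np.searchsorted(energy, threshold)) + 1, len(s))
    return u[:, :k]

rng = np.random.default_rng(0)
# Synthetic features lying almost entirely in a 3-dimensional subspace of R^32.
basis_true = np.linalg.qr(rng.standard_normal((32, 3)))[0]
features = basis_true @ rng.standard_normal((3, 200))
features += 1e-3 * rng.standard_normal((32, 200))  # small off-subspace noise

core = extract_core_bases(features, threshold=0.99)
print(core.shape)
```

Only the `k` leading left singular vectors are kept, so a client would send a `d × k` matrix rather than raw features, which is where the compactness of the bases comes from.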

Abstract

Federated continual learning (FCL) enables distributed client devices to learn from streaming data across diverse and evolving tasks. Catastrophic forgetting, a major challenge in continual learning, is exacerbated in decentralized settings by data heterogeneity, constrained communication, and privacy concerns. We propose Federated gradient Projection-based Continual Learning with Task Identity Prediction (FedProTIP), a novel FCL framework that mitigates forgetting by projecting client updates onto the orthogonal complement of the subspace spanned by previously learned representations of the global model. This projection reduces interference with earlier tasks and preserves performance across the task sequence. To further address the challenge of task-agnostic inference, we incorporate a lightweight mechanism that leverages core bases from prior tasks to predict task identity and dynamically adjust the global model's outputs. Extensive experiments across standard FCL benchmarks demonstrate that FedProTIP significantly outperforms state-of-the-art methods in average accuracy, particularly in settings where task identities are unknown a priori.
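The projection onto the orthogonal complement can be written as $g_\perp = g - g B B^\top$ for a matrix $B$ with orthonormal columns spanning the protected subspace. The sketch below shows this for a single linear layer. It is a minimal illustration under assumed shapes, not the paper's training loop; `project_gradient`, `bases`, and the toy dimensions are hypothetical names and values.

```python
import numpy as np

def project_gradient(grad: np.ndarray, bases: np.ndarray) -> np.ndarray:
    """Remove from `grad` its component inside span(bases).

    grad:  (out_dim, in_dim) gradient of a linear layer's weight.
    bases: (in_dim, k) orthonormal columns spanning the protected input subspace.
    """
    return grad - (grad @ bases) @ bases.T

rng = np.random.default_rng(0)
bases = np.linalg.qr(rng.standard_normal((16, 4)))[0]  # (16, 4), orthonormal columns
weight = rng.standard_normal((8, 16))
grad = rng.standard_normal((8, 16))

lr = 0.1
new_weight = weight - lr * project_gradient(grad, bases)

# An input from the protected subspace produces the same output before and after the step.
x_old = bases @ rng.standard_normal(4)
print(np.allclose(weight @ x_old, new_weight @ x_old))
```

Because the projected update has no component along `bases`, the layer's response to any input inside that subspace is unchanged, which is the sense in which the projection reduces interference with earlier tasks.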

Paper Structure

This paper contains 46 sections, 19 equations, 6 figures, 15 tables, 1 algorithm.

Figures (6)

  • Figure 1: Overview of FedProTIP. (1) Clients apply projected gradient descent; the server aggregates updates. (2) Clients extract core bases via SVD; the server merges them into a global subspace. (3) At inference, task identity is predicted by comparing test relevance vectors to stored task references.
  • Figure 2: Average accuracy of class-incremental learning on three benchmarks. (a) Task-agnostic inference, where task identity is unknown. (b) Task-aware inference, where the true task ID is provided at test time.
  • Figure 3: Effect of projection threshold $\epsilon_l$ on FedProTIP accuracy.
  • Figure 4: Comparison with federated variants of task-agnostic inference methods. $\Delta$ values denote performance gains when combined with FCL methods.
  • Figure 5: GPU memory usage (GB) on a single NVIDIA H200 GPU. We report the maximum GPU memory allocated at each training phase.
  • ...and 1 more figure
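Step (3) of Figure 1, predicting task identity by comparing a test input's relevance vector with stored task references, can be sketched roughly as follows. The details are assumptions made for illustration: relevance is taken to be the fraction of a feature's squared norm captured by each task's bases, references are taken to be mean relevance vectors per task, and the comparison uses Euclidean distance. The paper may define all three differently, and every name in the block is hypothetical.

```python
import numpy as np

def relevance_vector(x, task_bases):
    """Fraction of the squared norm of x captured by each task's subspace."""
    captured = np.array([np.sum((b.T @ x) ** 2) for b in task_bases])
    return captured / np.sum(x ** 2)

def predict_task(x, task_bases, references):
    """Index of the stored reference closest to the relevance vector of x."""
    r = relevance_vector(x, task_bases)
    return int(np.argmin(np.linalg.norm(references - r, axis=1)))

rng = np.random.default_rng(0)
# Two toy tasks occupying mutually orthogonal 3-dimensional subspaces of R^16.
q = np.linalg.qr(rng.standard_normal((16, 6)))[0]
task_bases = [q[:, :3], q[:, 3:]]

def sample(task, n):
    """Draw n feature vectors (rows) from the given task's subspace."""
    return (task_bases[task] @ rng.standard_normal((3, n))).T

# One reference per task: the mean relevance vector of that task's features.
references = np.stack([
    np.mean([relevance_vector(x, task_bases) for x in sample(t, 50)], axis=0)
    for t in range(2)
])

x_test = sample(1, 1)[0]
print(predict_task(x_test, task_bases, references))
```

In this toy setup the subspaces are exactly orthogonal, so the prediction is unambiguous; with overlapping subspaces the relevance vectors would be softer and the choice of distance would matter more. The predicted index could then be used to select the matching output head.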