Handling Spatial-Temporal Data Heterogeneity for Federated Continual Learning via Tail Anchor

Hao Yu, Xin Yang, Le Zhang, Hanlin Gu, Tianrui Li, Lixin Fan, Qiang Yang

TL;DR

This work addresses spatial-temporal data heterogeneity in Federated Continual Learning (FCL), which causes both parameter-forgetting and output-forgetting as inputs shift across clients and tasks. It introduces Federated Tail Anchor (FedTA), which combines a frozen pre-trained Vision Transformer with learnable Tail Anchors and adds three components (Input Enhancement, Selective Input Knowledge Fusion, and Best Global Prototype Selection) to stabilize feature representations and class positions across time and space. In experiments on CIFAR-100 and ImageNet-R with multiple clients and tasks, FedTA mitigates forgetting better than the baselines and preserves the relative geometry of features. It also shows favorable privacy and efficiency characteristics owing to reduced reliance on replay. The approach offers a scalable, privacy-conscious route to robust FCL in dynamic, heterogeneous environments, with practical relevance for edge devices and real-world continual learning deployments.

Abstract

Federated continual learning (FCL) allows each client to continually update its knowledge from task streams, enhancing the applicability of federated learning in real-world scenarios. However, FCL needs to address not only spatial data heterogeneity between clients but also temporal data heterogeneity between tasks. In this paper, empirical experiments demonstrate that such input-level heterogeneity significantly affects the model's internal parameters and outputs, leading to severe spatial-temporal catastrophic forgetting of local and previous knowledge. To this end, we propose Federated Tail Anchor (FedTA), which mixes a trainable Tail Anchor with the frozen output features to adjust their position in the feature space, thereby overcoming parameter-forgetting and output-forgetting. Three novel components are also included: Input Enhancement for improving the performance of pre-trained models on downstream tasks; Selective Input Knowledge Fusion for fusing heterogeneous local knowledge on the server; and Best Global Prototype Selection for finding the best anchor point for each class in the feature space. Extensive experiments demonstrate that FedTA not only outperforms existing FCL methods but also effectively preserves the relative positions of features.
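
The core idea in the abstract, shifting frozen features with a small trainable vector, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the mixing is taken to be additive, the loss is a plain squared distance to a fixed target point, and the names `mix_with_tail_anchor` and `anchor_gradient_step` are hypothetical.

```python
import numpy as np


def mix_with_tail_anchor(frozen_features, tail_anchor):
    """Shift every frozen feature by the same anchor vector.

    Additive mixing is an assumption of this sketch; the abstract does not
    specify the exact mixing rule.
    """
    return frozen_features + tail_anchor


def anchor_gradient_step(frozen_features, tail_anchor, target, lr=0.1):
    """One gradient step that updates only the anchor.

    Loss (illustrative): mean over the batch of ||z_i + a - target||^2.
    Its gradient with respect to the anchor a is 2 * mean_i(z_i + a - target).
    The features themselves are never modified.
    """
    mixed = mix_with_tail_anchor(frozen_features, tail_anchor)
    grad = 2.0 * (mixed - target).mean(axis=0)
    return tail_anchor - lr * grad


# Toy data: 8 frozen feature vectors of dimension 4 and a fixed target point.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4))
target = np.ones(4)

anchor = np.zeros(4)
for _ in range(200):
    anchor = anchor_gradient_step(features, anchor, target)

# The batch mean of the mixed features has moved onto the target,
# while `features` (the frozen extractor output) is unchanged.
shifted_mean = mix_with_tail_anchor(features, anchor).mean(axis=0)
```

Because only the anchor receives updates, the extractor's parameters cannot drift in this toy setup, which is the property the abstract attributes to freezing the output features.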

Paper Structure

This paper contains 17 sections, 12 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Illustration of FCL, the negative impact of spatial-temporal data heterogeneity and the intuition of Tail Anchor.
  • Figure 2: Illustrations of the negative impact of spatial-temporal data heterogeneity on the feature extractor and feature space. [Left Side] illustrates the variation of significant features extracted by the feature extractor for the same input sample, where brighter colors indicate more important features. As spatial-temporal changes occur, the extracted features gradually shift away from "cat", even extracting features near the image edges. [Right Side] depicts the changes in the positions of the features in the feature space after undergoing spatial-temporal transformations for the same batch.
  • Figure 3: An overview of FedTA. Local training is a two-stage process. The first stage adds Input Enhancement to the image embeddings to make full use of the ViT (see 1). In the second stage, the extracted features are fixed, and the corresponding Tail Anchor is mixed with them to adjust the similarity between classes through contrastive learning with global prototypes (see 2). The local knowledge base of input enhancements and the local prototypes of each class are then uploaded to the server, which performs Selective Input Knowledge Fusion on the knowledge base (see 4) and Best Global Prototype Selection on the local prototypes (see 3).
  • Figure 4: Knowledge retention on different datasets.
  • Figure 5: t-SNE visualization of the position changes of features corresponding to the same samples after FCL.
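
The second training stage described in the Figure 3 caption, contrastive learning against class prototypes, can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's loss: local prototypes are assumed to be per-class feature means, and the contrastive objective is assumed to be a temperature-scaled softmax over cosine similarities (InfoNCE-style). The function names and the `temperature` parameter are hypothetical.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Per-class mean feature (assumed prototype definition).

    Classes that do not occur in `labels` keep a zero vector.
    """
    labels = np.asarray(labels)
    protos = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(axis=0)
    return protos


def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """Cross-entropy over cosine similarities to all prototypes.

    Each feature is pulled toward the prototype of its own class and pushed
    away from the others. This InfoNCE-style form is an assumption.
    """
    labels = np.asarray(labels)
    eps = 1e-12  # guards against division by zero for all-zero vectors
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    logits = f @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()


# Toy usage: three features that coincide with three orthogonal prototypes.
prototypes = np.eye(3)
features = np.eye(3)
loss_matched = prototype_contrastive_loss(features, [0, 1, 2], prototypes)
loss_mismatched = prototype_contrastive_loss(features, [1, 2, 0], prototypes)
# loss_matched is far smaller than loss_mismatched.
```

In a FedTA-like setup, the `features` passed to the loss would be the frozen features after mixing with the Tail Anchor, so that minimizing the loss moves only the anchor.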