PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning

Haiyang Guo, Fei Zhu, Wenzhuo Liu, Xu-Yao Zhang, Cheng-Lin Liu

TL;DR

PILoRA tackles Federated Class Incremental Learning by combining prototype-guided feature representations with Incremental LoRA on a frozen Vision Transformer backbone to address forgetting and classifier bias under non-IID, privacy-preserving settings. The Incremental LoRA mechanism sums orthogonal LoRA components across stages to preserve past knowledge without retraining the classifier, while a prototype re-weight module aggregates local prototypes into robust global class prototypes, guided by heuristic distances to per-client features. The integrated loss $l_{total}=l_{dce}+\lambda l_{pl}+\gamma l_{ort}$ jointly optimizes discriminative features, compact intra-class prototypes, and orthogonal task directions, yielding strong, robust performance on CIFAR‑100, TinyImageNet, and ImageNet‑200 across diverse non‑IID conditions. The approach achieves state-of-the-art results with minimal memory and communication overhead and demonstrates scalable, privacy-conscious FCIL performance suitable for real-world deployment.
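As a rough illustration of how the three terms in $l_{total}=l_{dce}+\lambda l_{pl}+\gamma l_{ort}$ could fit together, the sketch below computes each term for a single feature vector. The concrete forms are assumptions made for illustration only: a cross-entropy over negative squared distances to prototypes for $l_{dce}$, a squared distance to the true-class prototype for $l_{pl}$, and squared inner products between LoRA factor rows of different stages for $l_{ort}$. All function names are hypothetical, and no values for $\lambda$ or $\gamma$ are implied.

```python
import math


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def dce_loss(feature, prototypes, label):
    """Assumed form of l_dce: cross-entropy over softmax(-squared distance)."""
    logits = [-sq_dist(feature, p) for p in prototypes]
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[label]


def prototype_loss(feature, prototypes, label):
    """Assumed form of l_pl: pull the feature toward its own class prototype."""
    return sq_dist(feature, prototypes[label])


def orthogonality_loss(a_new, a_old_list):
    """Assumed form of l_ort: squared inner products between the rows of the
    current stage's LoRA factor and the rows of every earlier stage's factor."""
    total = 0.0
    for a_old in a_old_list:
        for r_new in a_new:
            for r_old in a_old:
                total += sum(x * y for x, y in zip(r_new, r_old)) ** 2
    return total


def total_loss(feature, prototypes, label, a_new, a_old_list, lam, gamma):
    """l_total = l_dce + lam * l_pl + gamma * l_ort."""
    return (
        dce_loss(feature, prototypes, label)
        + lam * prototype_loss(feature, prototypes, label)
        + gamma * orthogonality_loss(a_new, a_old_list)
    )
```

In this toy form, a feature that sits exactly on its class prototype contributes nothing to the prototype term, and a new LoRA factor whose rows are orthogonal to all earlier factors contributes nothing to the orthogonality term, so only the distance-based classification term remains.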

Abstract

Existing federated learning methods have effectively dealt with decentralized learning in scenarios involving data privacy and non-IID data. However, in real-world situations, each client dynamically learns new classes, requiring the global model to classify all seen classes. To effectively mitigate catastrophic forgetting and data heterogeneity under low communication costs, we propose a simple and effective method named PILoRA. On the one hand, we adopt prototype learning to learn better feature representations and leverage the heuristic information between prototypes and class features to design a prototype re-weight module to solve the classifier bias caused by data heterogeneity without retraining the classifier. On the other hand, we view incremental learning as the process of learning distinct task vectors and encoding them within different LoRA parameters. Accordingly, we propose Incremental LoRA to mitigate catastrophic forgetting. Experimental results on standard datasets indicate that our method outperforms the state-of-the-art approaches significantly. More importantly, our method exhibits strong robustness and superiority in different settings and degrees of data heterogeneity. The code is available at https://github.com/Ghy0501/PILoRA.
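The idea of encoding each incremental stage in its own LoRA parameters and summing them on top of a frozen weight can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names are hypothetical, matrices are plain nested lists, any LoRA scaling factor is omitted, and which layers of the backbone are adapted is left unspecified.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def mat_add(x, y):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def merged_weight(w0, lora_stages):
    """Return w0 + sum over stages of B_t @ A_t.

    w0 is the frozen (d_out x d_in) weight; each stage is a pair (B_t, A_t)
    with B_t of shape (d_out x r) and A_t of shape (r x d_in).
    """
    w = [row[:] for row in w0]  # copy so the frozen weight is left untouched
    for b_t, a_t in lora_stages:
        w = mat_add(w, matmul(b_t, a_t))
    return w


# Two toy stages whose A factors are orthogonal: [0, 2] . [3, 0] = 0.
w0 = [[1.0, 0.0], [0.0, 1.0]]
stage1 = ([[1.0], [0.0]], [[0.0, 2.0]])
stage2 = ([[0.0], [1.0]], [[3.0, 0.0]])
merged = merged_weight(w0, [stage1, stage2])
```

Because earlier stages are kept as separate low-rank terms in this sketch, adding a new stage leaves their contribution to the merged weight unchanged; only the new pair is appended to the sum.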

Paper Structure (18 sections, 15 equations, 7 figures, 4 tables, 1 algorithm)

Figures (7)

  • Figure 1: The local data of client 1 and client 2 are non-IID. ResNet50 (first row) focuses more on local patterns, and the patterns it learns differ significantly under data heterogeneity, causing the averaged model to lose some important information (e.g., fish fins). ViT (second row), in contrast, is less affected by data heterogeneity, and the averaged model retains nearly all the information learned by the local models.
  • Figure 2: Illustration of PILoRA for FCIL. Each client fine-tunes LoRA and per-class prototypes on its local dataset. After upload, the global server aggregates the LoRA parameters from the different clients and applies a prototype re-weight module before sending the results back to each client.
  • Figure 3: (a) Comparison of the real proportions of the same class among different clients and the weights obtained from Prototype Re-weighting during a local training process. (b) The $L_2$ distance between the class features extracted by different clients and by the global model and the corresponding prototypes.
  • Figure 4: The impact of different values of $K$ on CIFAR-100, where we consider $\alpha=6$ and $\beta=0.5$.
  • Figure 5: An example of distribution-based label imbalance partition and quantity-based label imbalance partition on CIFAR-100 (10 classes) with $\beta = 0.5$ (left) and $\alpha=6$ (right).
  • ...and 2 more figures