Personalized Federated Learning on Data with Dynamic Heterogeneity under Limited Storage

Sixing Tan, Xianmin Liu

TL;DR

This work tackles personalized federated learning under data with dynamic heterogeneity and storage limits, where traditional methods suffer catastrophic forgetting and poorly tuned collaboration. It introduces pFedGRP, a framework that combines a category-decomposed generative replay architecture with learnable, client-specific aggregation and local knowledge transfer to adapt to time-varying distributions. The approach uses per-category auxiliary submodels $A_{i,c}$, local data distribution reconstruction, and learnable server weights $\mathbf{W}_i^t$ to produce personalized global models $C_{g,i}^t$ while maintaining a robust global model $C_g^t$. Extensive experiments on five datasets and multiple settings show that pFedGRP outperforms eight baselines, reduces forgetting, and remains robust under high heterogeneity, demonstrating practical impact for privacy-preserving, storage-constrained continual learning in distributed systems.

Abstract

Recently, the large number of data sources opened up by informatization has intensified data heterogeneity, while faster data generation and the gradual implementation of data regulations limit how long data can be stored. In personalized Federated Learning (pFL), clients train customized models to meet their personal objectives. However, because local data heterogeneity varies over time and previous data become inaccessible, existing pFL methods not only fail to resolve catastrophic forgetting in local models but also struggle to estimate the degree of collaboration between clients. To address this issue, our core idea is a low-consumption, high-quality generative replay architecture. Specifically, we decouple the generator by category to reduce the generation error of each category while mitigating catastrophic forgetting, use the local model to improve the quality of generated data and reduce the update frequency of the generator, and propose a local data reconstruction scheme that reduces the amount of data generated while adjusting the proportion of data categories. Building on this, we propose our pFL framework, pFedGRP, to achieve personalized aggregation and local knowledge transfer. Comprehensive experiments on five datasets with multiple settings show the superiority of pFedGRP over eight baseline methods.
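
To make the idea of personalized aggregation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each client's model is a flat list of parameters and that the server holds one row of weight logits per client, normalized with a softmax; the function names `softmax` and `personalized_aggregate`, the softmax normalization, and the way the logits would be learned are all assumptions made for illustration and are not specified in this summary.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def personalized_aggregate(client_params, weight_logits):
    """Build one personalized model per client as a weighted average.

    client_params: list of K flat parameter vectors (one per client).
    weight_logits: K x K matrix; row i holds the (hypothetical) learnable
        logits that decide how much client i borrows from each client.
    Returns a list of K personalized parameter vectors.
    """
    num_clients = len(client_params)
    num_params = len(client_params[0])
    personalized = []
    for logits in weight_logits:
        weights = softmax(logits)  # assumption: weights are normalized per client
        model = [
            sum(weights[k] * client_params[k][j] for k in range(num_clients))
            for j in range(num_params)
        ]
        personalized.append(model)
    return personalized


# Two toy clients with two parameters each.
params = [[0.0, 0.0], [2.0, 4.0]]
# Client 0 weights both clients equally; client 1 strongly prefers itself.
logits = [[0.0, 0.0], [0.0, 50.0]]
models = personalized_aggregate(params, logits)
```

With equal logits, a client's personalized model is the plain average of all client models; as one logit grows, that client's personalized model approaches the favored client's parameters. How strongly each client should collaborate is exactly what the learnable weights are meant to capture.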

Paper Structure (30 sections, 11 equations, 31 figures, 1 algorithm)

Figures (31)

  • Figure 1: The proportion of different types of the COVID-19 virus in various regions of Europe in January 2025. The data is sourced from https://gisaid.org/hcov19-variants/.
  • Figure 2: Local data distribution reconstruction scheme.
  • Figure 3: The flowchart of Local Training on client $\mathcal{C}_i$.
  • Figure 4: The flowchart of Global Aggregation on server.
  • Figure 5 (Table 1): Results on FL with Tasks Gradually Changing.
  • ...and 26 more figures