
Personalized Federated Learning via Gaussian Generative Modeling

Peng Hu, Jianwei Ma

Abstract

Federated learning has emerged as a paradigm to train models collaboratively on inherently distributed client data while safeguarding privacy. In this context, personalized federated learning tackles the challenge of data heterogeneity by equipping each client with a dedicated model. A prevalent strategy decouples the model into a shared feature extractor and a personalized classifier head, where the latter actively guides the representation learning. However, previous works have focused on classifier head-guided personalization, neglecting the potential personalized characteristics in the representation distribution. Building on this insight, we propose pFedGM, a method based on Gaussian generative modeling. The approach begins by training a Gaussian generator that models client heterogeneity via weighted re-sampling. A balance between global collaboration and personalization is then struck by employing a dual objective: a shared objective that maximizes inter-class distance across clients, and a local objective that minimizes intra-class distance within them. To achieve this, we decouple the conventional Gaussian classifier into a navigator for global optimization, and a statistic extractor for capturing distributional statistics. Inspired by the Kalman gain, the algorithm then employs a dual-scale fusion framework at global and local levels to equip each client with a personalized classifier head. In this framework, we model the global representation distribution as a prior and the client-specific data as the likelihood, enabling Bayesian inference for class probability estimation. The evaluation covers a comprehensive range of scenarios: heterogeneity in class counts, environmental corruption, and multiple benchmark datasets and configurations. pFedGM achieves superior or competitive performance compared to state-of-the-art methods.
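The fusion step described in the abstract can be illustrated with a small sketch. The code below is not the pFedGM implementation: it assumes diagonal covariances, uses the textbook Kalman-style update for combining two Gaussian estimates, and all function and variable names (`kalman_fuse`, `class_posterior`, and so on) are hypothetical. It only shows how global class statistics, treated as a prior, might be blended with client-specific statistics and then used for Bayesian class probability estimation.

```python
import math


def kalman_fuse(prior_mean, prior_var, local_mean, local_var):
    """Fuse a global prior N(prior_mean, prior_var) with a local estimate
    N(local_mean, local_var), dimension by dimension.

    Assumption: diagonal covariance, so each dimension is a scalar update.
    """
    fused_mean, fused_var = [], []
    for m0, v0, m1, v1 in zip(prior_mean, prior_var, local_mean, local_var):
        gain = v0 / (v0 + v1)  # Kalman gain: weight given to the local estimate
        fused_mean.append(m0 + gain * (m1 - m0))
        fused_var.append((1.0 - gain) * v0)
    return fused_mean, fused_var


def log_gaussian(z, mean, var):
    """Log-density of z under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(z, mean, var)
    )


def class_posterior(z, class_params, class_priors):
    """Bayes rule over classes: p(k | z) is proportional to p(k) * N(z; mean_k, var_k)."""
    log_joint = [
        math.log(p) + log_gaussian(z, mean, var)
        for (mean, var), p in zip(class_params, class_priors)
    ]
    top = max(log_joint)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical global (prior) and client (local) statistics for two classes.
global_stats = [([-1.0, 0.0], [1.0, 1.0]), ([1.0, 0.0], [1.0, 1.0])]
local_stats = [([-1.5, 0.2], [0.5, 0.5]), ([0.8, -0.1], [0.5, 0.5])]

# Build a personalized Gaussian head by fusing each class separately.
personalized = [
    kalman_fuse(g_mean, g_var, l_mean, l_var)
    for (g_mean, g_var), (l_mean, l_var) in zip(global_stats, local_stats)
]
probs = class_posterior([-1.2, 0.1], personalized, [0.5, 0.5])
```

In this sketch a client with a small local variance (a confident local estimate) pulls the fused mean toward its own statistics, while a client with little or noisy data stays close to the global prior. How pFedGM actually sets these variances and combines the global and local scales is specified in the paper itself, not here.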

Paper Structure

This paper contains 28 sections, 2 theorems, 30 equations, 10 figures, 7 tables, 1 algorithm.

Key Result

Theorem 4.1

Suppose at a certain step of a stochastic optimization algorithm, a client $c$ is selected. During client training, random sampling of local data yields samples $x_1, x_2, \dots, x_n$ and $\bm{z}_i = \bm{z}(x_i; \, \bm{\phi})$. Then,

Figures (10)

  • Figure 1: t-SNE visualization of representation distribution divergence caused by client data heterogeneity. (a) Representations of two classes from two clients; (b) Colored by class; (c, d) Colored by client. Subfigure (c) displays the feature representations of class 1, while (d) displays those of class 2. Different clients exhibit distinct cluster means and covariance structures.
  • Figure 2: Overview of the generator training. (Left) Data from different clients of the same class exhibit heterogeneity, and consequently, their distributions diverge in the representation space. (Middle) The shared objective drives features of different classes to diverge along distinct directions, whereas (right) the client personalized objective prompts features of the same class to aggregate around client-specific centers.
  • Figure 3: Global-local collaborative training module. In this process, the shared and local objectives jointly optimize the generator. The navigator defines and is refined by the shared objective, while a statistics extractor captures global statistics.
  • Figure 4: Illustration of parameter decoupling. After decoupling the conventional Gaussian classifier into a navigator and covariance extractor (Covariance for short), these along with the generator are exchanged between server and clients. The navigator is utilized to generate the shared objective, and the covariance features are employed to craft the subsequent personalized classifier heads.
  • Figure 5: Navigational direction and local class prototype adaptation. (Left) The navigational direction ($\bm{\mu}_i$) self-adjusts based on the global representation distribution, with the objective of aligning with its own class centroid and diverging from others. (Right) The local class prototype ($\bm{\upsilon}_{i, k}$) continuously adapts toward the class mean.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Theorem 4.1
  • Theorem 4.2