PeFLL: Personalized Federated Learning by Learning to Learn

Jonathan Scott, Hossein Zakerinia, Christoph H. Lampert

TL;DR

PeFLL tackles heterogeneity in federated learning by learning a descriptor-to-model mapping via an embedding network and a server-side hypernetwork, producing ready-to-use personalized models for any client with a single forward pass. By decomposing computation between server and clients, it reduces client-side training and communication while maintaining strong generalization to unseen clients, supported by a convergence bound and a PAC-Bayesian generalization guarantee. Empirically, PeFLL achieves state-of-the-art or competitive accuracy on standard pFL benchmarks, with notable robustness in low-data and unseen-client scenarios, and it demonstrates meaningful alignment between learned client descriptors and true distribution similarity. The work also shows that descriptor-based organization of client space supports generalization and that inference for new clients requires minimal communication and computation.

Abstract

We present PeFLL, a new personalized federated learning algorithm that improves over the state-of-the-art in three aspects: 1) it produces more accurate models, especially in the low-data regime, and not only for clients present during its training phase, but also for any that may emerge in the future; 2) it reduces the amount of on-client computation and client-server communication by providing future clients with ready-to-use personalized models that require no additional finetuning or optimization; 3) it comes with theoretical guarantees that establish generalization from the observed clients to future ones. At the core of PeFLL lies a learning-to-learn approach that jointly trains an embedding network and a hypernetwork. The embedding network is used to represent clients in a latent descriptor space in a way that reflects their similarity to each other. The hypernetwork takes as input such descriptors and outputs the parameters of fully personalized client models. In combination, both networks constitute a learning algorithm that achieves state-of-the-art performance in several personalized federated learning benchmarks.
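The descriptor-to-model mapping described above can be illustrated with a small sketch. It is not the authors' implementation: the single-layer networks, the dimensions, the averaging of per-sample embeddings into one client descriptor, and all function names (`client_descriptor`, `server_generate`, `client_predict`) are assumptions chosen only to show the data flow. The client computes a small descriptor, the server maps it to model parameters with one hypernetwork forward pass, and the client uses the returned model without any finetuning.

```python
import math
import random

random.seed(0)

def linear(in_dim, out_dim):
    """Random (out_dim x in_dim) weights and zero bias; illustrative only."""
    w = [[random.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [0.0] * out_dim
    return w, b

def apply_linear(layer, x):
    w, b = layer
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

# Hypothetical sizes: the client model is a tiny linear classifier.
FEATURES, DESCRIPTOR_DIM, NUM_CLASSES = 4, 3, 2
MODEL_PARAMS = NUM_CLASSES * FEATURES + NUM_CLASSES  # weights + biases

embedding_net = linear(FEATURES + 1, DESCRIPTOR_DIM)  # input: features plus label
hypernet = linear(DESCRIPTOR_DIM, MODEL_PARAMS)       # remains on the server

def client_descriptor(batch):
    """Client side: embed each (x, y) pair, then average (an assumption)."""
    embs = [
        [math.tanh(v) for v in apply_linear(embedding_net, x + [float(y)])]
        for x, y in batch
    ]
    return [sum(col) / len(embs) for col in zip(*embs)]

def server_generate(descriptor):
    """Server side: a single hypernetwork forward pass yields flat parameters."""
    return apply_linear(hypernet, descriptor)

def client_predict(theta, x):
    """Client side: unpack theta into a linear classifier and predict a class."""
    w = [theta[c * FEATURES:(c + 1) * FEATURES] for c in range(NUM_CLASSES)]
    b = theta[NUM_CLASSES * FEATURES:]
    logits = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    return max(range(NUM_CLASSES), key=lambda c: logits[c])

# Toy local data for one (possibly previously unseen) client.
batch = [
    ([random.random() for _ in range(FEATURES)], random.randrange(NUM_CLASSES))
    for _ in range(8)
]
v = client_descriptor(batch)      # small vector sent to the server
theta = server_generate(v)        # personalized parameters sent back
pred = client_predict(theta, batch[0][0])
```

In this sketch only `v` and `theta` cross the client-server boundary, while the hypernetwork weights never leave the server, which mirrors the communication saving the abstract claims.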

Paper Structure

This paper contains 43 sections, 8 theorems, 43 equations, 19 figures, 9 tables, 2 algorithms.

Key Result

Theorem 3.1

Under standard smoothness and boundedness assumptions (see appendix), PeFLL's optimization after $T$ steps fulfills a convergence bound (the displayed inequality is not reproduced here), where $F$ is the PeFLL objective with lower bound $F_{*}$, $\eta_0$ are the parameter values at initialization, and $\eta_1,\dots,\eta_T$ are the intermediate parameter values. $L, L_1$ are smoothness parameters of $F$ and the local models, and $b_1,b_2$ are bounds on the norms

Figures (19)

  • Figure 1: Communication protocol of PeFLL-predict for generating personalized models.
  • Figure 2: Data flow for PeFLL model generation (forward pass, left) and training (backward pass, right). The client descriptor, $v_i$, and the client model $\theta_i$ are small. Transmitting them and their update vectors is efficient. The hypernetwork, $\eta_h$, can be large, but it remains on the server.
  • Figure 3: Correlation between client descriptor similarity obtained from the embedding network and ground truth similarity over the course of training.
  • Figure 4: Accuracy in client extrapolation. Larger values of $\alpha$ indicate new clients that are more dissimilar to the train clients.
  • Figure 5: Test Accuracies for train clients (top row) and test clients (bottom row) during different steps of the training for PeFLL and baselines, for the Shakespeare dataset.
  • ...and 14 more figures
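
The alignment that Figure 3 reports, between descriptor similarity and ground-truth client similarity, could be measured roughly as follows. This is a sketch under assumptions: cosine similarity between descriptors and a Pearson correlation over client pairs are illustrative choices, not necessarily the measures used in the paper, and the toy descriptors and similarity matrix are made up for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def similarity_alignment(descriptors, true_similarity):
    """Correlate learned pairwise similarity with a given ground-truth matrix."""
    n = len(descriptors)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    learned = [cosine(descriptors[i], descriptors[j]) for i, j in pairs]
    truth = [true_similarity[i][j] for i, j in pairs]
    return pearson(learned, truth)

# Toy data: clients 0 and 1 are similar, client 2 differs from both.
descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
true_similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
score = similarity_alignment(descriptors, true_similarity)
```

A score close to 1 would indicate that clients with similar data distributions receive nearby descriptors; tracking it over training rounds gives a curve of the kind the figure caption describes.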

Theorems & Definitions (15)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem B.1
  • proof
  • Lemma C.1
  • proof
  • Lemma C.2
  • proof
  • Lemma C.3
  • proof
  • ...and 5 more