PeFLL: Personalized Federated Learning by Learning to Learn
Jonathan Scott, Hossein Zakerinia, Christoph H. Lampert
TL;DR
PeFLL tackles heterogeneity in federated learning by learning a mapping from client descriptors to models: an embedding network represents each client as a descriptor, and a server-side hypernetwork turns that descriptor into a ready-to-use personalized model in a single forward pass. By splitting computation between server and clients, it reduces client-side training and communication while still generalizing well to unseen clients, a property supported by a convergence bound and a PAC-Bayesian generalization guarantee. Empirically, PeFLL achieves state-of-the-art or competitive accuracy on standard pFL benchmarks and is notably robust in low-data and unseen-client scenarios. The learned client descriptors align meaningfully with the true similarity between client distributions, and this descriptor-based organization of the client space supports generalization. Inference for new clients requires minimal communication and computation.
Abstract
We present PeFLL, a new personalized federated learning algorithm that improves over the state of the art in three aspects: 1) it produces more accurate models, especially in the low-data regime, not only for clients present during its training phase but also for any that may emerge in the future; 2) it reduces the amount of on-client computation and client-server communication by providing future clients with ready-to-use personalized models that require no additional finetuning or optimization; 3) it comes with theoretical guarantees that establish generalization from the observed clients to future ones. At the core of PeFLL lies a learning-to-learn approach that jointly trains an embedding network and a hypernetwork. The embedding network represents clients in a latent descriptor space in a way that reflects their similarity to each other. The hypernetwork takes these descriptors as input and outputs the parameters of fully personalized client models. Together, the two networks constitute a learning algorithm that achieves state-of-the-art performance on several personalized federated learning benchmarks.
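The two-network pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a client descriptor is the average of per-sample embeddings over a batch, that both networks are single linear layers, and that the personalized model is a linear classifier. All function names, dimensions, and the random untrained weights are hypothetical, and the joint training of the two networks is omitted.

```python
import random


def init_linear(rng, n_out, n_in, scale=0.1):
    """Random weights (list of rows) and zero bias for y = W x + b."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(params, x):
    """Apply y = W x + b."""
    weights, bias = params
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def client_descriptor(embed_params, batch):
    """Embedding network (assumed form): embed each (x, y) pair, then average.

    Averaging makes the descriptor independent of the order of the samples.
    """
    embedded = [linear(embed_params, list(x) + [float(y)]) for x, y in batch]
    return [sum(column) / len(embedded) for column in zip(*embedded)]


def personalized_model(hyper_params, descriptor, n_classes, n_features):
    """Hypernetwork: map a descriptor to the parameters of a linear classifier."""
    flat = linear(hyper_params, descriptor)
    weights = [flat[c * n_features:(c + 1) * n_features] for c in range(n_classes)]
    bias = flat[n_classes * n_features:]
    return weights, bias


def predict(model, x):
    """Return the index of the highest-scoring class."""
    scores = linear(model, x)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical sizes, chosen only for illustration.
N_FEATURES, N_CLASSES, DESC_DIM = 4, 3, 2
rng = random.Random(0)

embed_params = init_linear(rng, DESC_DIM, N_FEATURES + 1)
hyper_params = init_linear(rng, N_CLASSES * N_FEATURES + N_CLASSES, DESC_DIM)

# A small batch of labelled data from one (possibly unseen) client.
batch = [([rng.random() for _ in range(N_FEATURES)], rng.randrange(N_CLASSES))
         for _ in range(8)]

descriptor = client_descriptor(embed_params, batch)  # client -> descriptor
model = personalized_model(hyper_params, descriptor, N_CLASSES, N_FEATURES)
label = predict(model, batch[0][0])  # the model is usable without finetuning
```

In this sketch a new client obtains its model from one descriptor computation and one forward pass through the hypernetwork, which mirrors the claim that future clients need no additional finetuning or optimization.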
