Sheaf HyperNetworks for Personalized Federated Learning
Bao Nguyen, Lorenzo Sani, Xinchi Qiu, Pietro Liò, Nicholas D. Lane
TL;DR
This work tackles personalized federated learning (PFL) under data heterogeneity by addressing limitations of Graph HyperNetworks (GHNs) such as over-smoothing and poor handling of heterophily. It introduces Sheaf HyperNetworks (SHNs), which fuse cellular sheaf diffusion with hypernetworks to enable richer cross-client parameter sharing, and it adds a privacy-preserving method for constructing client relation graphs from learned embeddings. Across multi-class classification, traffic forecasting, and weather forecasting, SHNs consistently outperform baselines, including pFedHN, Panacea, and GHN, with improvements of up to $2.7\%$ in accuracy and a reduction of up to $5.3\%$ in MSE in challenging non-IID settings. The three-stage training pipeline (Federated HyperNetwork Training, Client Relation Graph Construction, and Federated Sheaf HyperNetwork Training), together with the sheaf diffusion mechanism, provides a robust framework for scalable, expressive, and privacy-preserving PFL across diverse domains.
Abstract
Graph hypernetworks (GHNs), constructed by combining graph neural networks (GNNs) with hypernetworks (HNs), leverage relational data across various domains such as neural architecture search, molecular property prediction, and federated learning. Although GNNs and HNs are individually successful, we show that GHNs suffer from problems that compromise their performance, such as over-smoothing and heterophily. Moreover, GHNs cannot be applied directly to personalized federated learning (PFL) scenarios, where an a priori client relation graph may be absent, private, or inaccessible. To mitigate these limitations in the context of PFL, we propose a novel class of HNs, sheaf hypernetworks (SHNs), which combine cellular sheaf theory with HNs to improve parameter sharing for PFL. We thoroughly evaluate SHNs across diverse PFL tasks, including multi-class classification, traffic forecasting, and weather forecasting. Additionally, we provide a methodology for constructing client relation graphs in scenarios where such graphs are unavailable. We show that SHNs consistently outperform existing PFL solutions in complex non-IID scenarios. While the baselines' performance fluctuates depending on the task, SHNs show improvements of up to 2.7% in accuracy and 5.3% lower mean squared error over the best-performing baseline.
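To make the sheaf diffusion idea concrete, the following is a minimal NumPy sketch of a sheaf Laplacian and one linear diffusion step over client embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names `sheaf_laplacian` and `diffusion_step`, the step size, and the hand-set restriction maps are all hypothetical, and a trained model would be expected to learn the restriction maps rather than fix them. With identity restriction maps the construction reduces to the ordinary graph Laplacian, which is the case checked at the end.

```python
import numpy as np

def sheaf_laplacian(num_nodes, edges, maps, d):
    """Build a (num_nodes*d) x (num_nodes*d) sheaf Laplacian.

    edges: list of (u, v) node pairs.
    maps:  list of (F_u, F_v) pairs of d x d restriction maps, one pair per edge
           (hypothetical hand-set values here; not taken from the paper).
    """
    L = np.zeros((num_nodes * d, num_nodes * d))
    for (u, v), (F_u, F_v) in zip(edges, maps):
        su = slice(u * d, (u + 1) * d)
        sv = slice(v * d, (v + 1) * d)
        # Diagonal blocks accumulate F^T F over incident edges.
        L[su, su] += F_u.T @ F_u
        L[sv, sv] += F_v.T @ F_v
        # Off-diagonal blocks couple the two endpoints of the edge.
        L[su, sv] -= F_u.T @ F_v
        L[sv, su] -= F_v.T @ F_u
    return L

def diffusion_step(x, L, step=0.1):
    """One explicit Euler step of the linear diffusion x <- x - step * L x."""
    return x - step * (L @ x)

# Toy example: three clients on a path graph, 2-dimensional stalks.
edges = [(0, 1), (1, 2)]
d = 2
identity_maps = [(np.eye(d), np.eye(d)) for _ in edges]
L = sheaf_laplacian(3, edges, identity_maps, d)

# With identity maps, the sheaf Laplacian equals the graph Laplacian (x) I_d.
graph_laplacian = np.array([[1.0, -1.0, 0.0],
                            [-1.0, 2.0, -1.0],
                            [0.0, -1.0, 1.0]])
assert np.allclose(L, np.kron(graph_laplacian, np.eye(d)))

x = np.random.default_rng(0).normal(size=3 * d)  # stacked client embeddings
x_next = diffusion_step(x, L)
```

Replacing the identity maps with non-trivial matrices changes which signals the diffusion treats as "in agreement" across an edge. This extra freedom is what sheaf-based models rely on to avoid collapsing all node features to the same value, the over-smoothing behaviour described in the abstract.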
