Don't Reach for the Stars: Rethinking Topology for Resilient Federated Learning

Mirko Konstantin; Anirban Mukhopadhyay

TL;DR

The paper tackles the vulnerability of centralized, star-topology FL to non-IID data and malfunctioning clients by proposing LIGHTYEAR, a decentralized P2P FL framework. Each client computes an agreement score on its private validation set, uses it to select a personalized subset $\mathcal{S}_i$ of neighbor updates, and updates its model with the regularized aggregation rule $\bar{\theta}_i^{(t+1)} = \bar{\theta}_i^{(t)} + \gamma^{t} \cdot \frac{1}{|\mathcal{S}_i|} \sum_{j \in \mathcal{S}_i} (\theta_j - \bar{\theta}_i^{(t)})$. Selection is guided by the agreement metric $A_{ij}$, which fuses accuracy, calibration, and confidence (or Dice for segmentation). The paper formalizes the problem under non-exchangeable data, decomposes the error into target-domain and corruption components, and reports on five medical datasets that LIGHTYEAR delivers robust, personalized performance that exceeds both centralized baselines and existing P2P methods, including under adversarial and dynamic malfunction scenarios. The work highlights the practical benefit of replacing global coordination with local, validation-driven aggregation to improve resilience and personalization. Overall, it argues for decentralized architectures as a way to improve reliability and domain-adaptive performance in real-world FL deployments.
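The aggregation rule above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: parameters are flattened into lists of floats, `gamma` stands for the step weight used at round $t$ (how $\gamma^{t}$ is scheduled is not specified here), and the function name `aggregate_step` is hypothetical.

```python
from typing import List, Sequence


def aggregate_step(theta_ref: Sequence[float],
                   selected_updates: List[Sequence[float]],
                   gamma: float) -> List[float]:
    """Move the reference parameters toward the mean of the selected updates.

    Implements theta_ref + gamma * mean_j(theta_j - theta_ref), coordinate by
    coordinate. `gamma` is an assumed per-round scalar weight.
    """
    if not selected_updates:
        # Empty selection: nothing to aggregate, keep the reference model.
        return list(theta_ref)
    n = len(selected_updates)
    new_theta = []
    for k, ref_k in enumerate(theta_ref):
        mean_diff = sum(theta_j[k] - ref_k for theta_j in selected_updates) / n
        new_theta.append(ref_k + gamma * mean_diff)
    return new_theta


# Toy example: two selected neighbour models, two parameters each.
theta_bar = [0.0, 1.0]
neighbours = [[1.0, 1.0], [3.0, 3.0]]
updated = aggregate_step(theta_bar, neighbours, gamma=0.5)
print(updated)  # [1.0, 1.5]
```

With `gamma = 1` the step reduces to plain averaging of the selected updates; smaller values keep the client closer to its own reference model, which is the regularizing effect the rule describes.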

Abstract

Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy by keeping data local. Traditional FL approaches rely on a centralized, star-shaped topology, where a central server aggregates model updates from clients. However, this architecture introduces several limitations, including a single point of failure, limited personalization, poor robustness to distribution shifts, and vulnerability to malfunctioning clients. Moreover, update selection in centralized FL often relies on low-level parameter differences, which can be unreliable when client data is not independent and identically distributed, and it offers clients little control. In this work, we propose a decentralized, peer-to-peer (P2P) FL framework. It leverages the flexibility of the P2P topology to enable each client to identify and aggregate a personalized set of trustworthy and beneficial updates. This framework is the Local Inference Guided Aggregation for Heterogeneous Training Environments to Yield Enhancement Through Agreement and Regularization (LIGHTYEAR). Central to our method is an agreement score, computed on a local validation set, which quantifies the semantic alignment of incoming updates in the function space with respect to the client's reference model. Each client uses this score to select a tailored subset of updates and performs aggregation with a regularization term that further stabilizes training. Our empirical evaluation across five datasets shows that the proposed approach consistently outperforms both centralized baselines and existing P2P methods in terms of client-level performance, particularly under adversarial and heterogeneous conditions.
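The selection step can be pictured with a small sketch. It uses a deliberately simplified stand-in for the agreement score, namely the fraction of local validation samples on which an incoming model and the reference model predict the same label. This is an assumption for illustration only and is not the paper's score; the threshold, the client names, and the function names are likewise hypothetical.

```python
from typing import Dict, List, Sequence


def agreement_rate(candidate_preds: Sequence[int],
                   reference_preds: Sequence[int]) -> float:
    """Stand-in score: share of validation samples with matching predictions."""
    if len(candidate_preds) != len(reference_preds):
        raise ValueError("prediction lists must have equal length")
    if not reference_preds:
        return 0.0
    matches = sum(c == r for c, r in zip(candidate_preds, reference_preds))
    return matches / len(reference_preds)


def select_neighbours(reference_preds: Sequence[int],
                      neighbour_preds: Dict[str, Sequence[int]],
                      threshold: float) -> List[str]:
    """Keep neighbours whose stand-in score reaches the (assumed) threshold."""
    return sorted(
        name
        for name, preds in neighbour_preds.items()
        if agreement_rate(preds, reference_preds) >= threshold
    )


# Predictions of each model on the same four local validation samples.
reference = [0, 1, 1, 0]
incoming = {
    "client_b": [0, 1, 1, 1],  # agrees on 3 of 4 samples
    "client_c": [1, 0, 0, 1],  # agrees on none
    "client_d": [0, 1, 1, 0],  # agrees on all
}
selected = select_neighbours(reference, incoming, threshold=0.75)
print(selected)  # ['client_b', 'client_d']
```

Because the comparison is made on model outputs rather than on parameter differences, a neighbour whose weights look unusual but whose predictions align with the local reference can still be selected, while a malfunctioning neighbour such as `client_c` is filtered out.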
