Graph Theory Meets Federated Learning over Satellite Constellations: Spanning Aggregations, Network Formation, and Performance Optimization
Fardis Nadimi, Payam Abdisarabshali, Jacob Chakareski, Nicholas Mastronarde, Seyyedali Hosseinalipour
TL;DR
Fed-Span advances federated learning for satellite constellations by replacing a single central aggregator with spanning-tree-based topologies formed over inter-satellite laser links, enabling over-space aggregations that adapt to dynamic satellite networks. It develops continuous constraint representations to model minimum spanning trees (MSTs) and MoDSTs/MoDSFs, derives convergence bounds for non-convex losses under time-varying data and idle times, and casts the joint topology-and-resource optimization as a signomial program, which is solved via a geometric programming approach with performance guarantees. In simulations across multiple datasets and constellations, the framework yields faster convergence and lower energy consumption and latency, while offering flexible VC-based clustering to balance ML performance and resource use. This work thus provides a practical, optimization-driven blueprint for energy- and latency-aware, ground-free federated learning in space.
Abstract
In this work, we introduce Fed-Span: federated learning with spanning aggregation over low Earth orbit (LEO) satellite constellations. Fed-Span aims to address critical challenges inherent to distributed learning in dynamic satellite networks, including intermittent satellite connectivity, heterogeneous computational capabilities of satellites, and time-varying satellite datasets. At its core, Fed-Span leverages minimum spanning tree (MST) and minimum spanning forest (MSF) topologies to introduce spanning model aggregation and dispatching processes for distributed learning. To formalize Fed-Span, we offer a fresh perspective on MST/MSF topologies by formulating them through a set of continuous constraint representations (CCRs), thereby integrating these topologies into a distributed learning framework for satellite networks. Using these CCRs, we obtain the energy consumption and latency of operations in Fed-Span. Moreover, we derive novel convergence bounds for Fed-Span, accommodating its key system characteristics and degrees of freedom (i.e., tunable parameters). Finally, we propose a comprehensive optimization problem that jointly minimizes the model prediction loss, energy consumption, and latency of Fed-Span. We show that this problem is NP-hard and develop a systematic approach to transform it into a geometric programming formulation, which is solved via successive convex optimization with performance guarantees. Through evaluations on real-world datasets, we demonstrate that Fed-Span outperforms existing methods, with faster model convergence, greater energy efficiency, and reduced latency.
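To make the idea of spanning aggregation concrete, the following is a minimal Python sketch and not the method of the paper. It assumes a small, static graph of satellites with hypothetical link costs, builds an MST with Prim's algorithm, and then aggregates model parameters up the tree by sample-weighted averaging. The function names, the link costs, and the FedAvg-style weighting are all illustrative assumptions; the paper itself models the topology through CCRs and optimizes it jointly with resource allocation.

```python
import heapq


def minimum_spanning_tree(num_nodes, edges, root=0):
    """Prim's algorithm. `edges` is a list of (u, v, cost) for undirected links.

    Returns a dict mapping each non-root node to its parent in the tree.
    """
    adjacency = {i: [] for i in range(num_nodes)}
    for u, v, cost in edges:
        adjacency[u].append((cost, v))
        adjacency[v].append((cost, u))

    parent = {}
    visited = {root}
    heap = [(cost, root, v) for cost, v in adjacency[root]]
    heapq.heapify(heap)
    while heap and len(visited) < num_nodes:
        cost, u, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)
        parent[v] = u
        for next_cost, w in adjacency[v]:
            if w not in visited:
                heapq.heappush(heap, (next_cost, v, w))
    return parent


def spanning_aggregate(parent, root, models, sizes):
    """Aggregate local models toward `root` along the tree.

    Each node forwards (sample-weighted parameter sum, sample count) to its
    parent, so the root ends up with the sample-weighted average.
    """
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    def collect(node):
        weighted = [p * sizes[node] for p in models[node]]
        total = sizes[node]
        for child in children.get(node, []):
            child_weighted, child_total = collect(child)
            weighted = [a + b for a, b in zip(weighted, child_weighted)]
            total += child_total
        return weighted, total

    weighted, total = collect(root)
    return [w / total for w in weighted]


# Hypothetical 4-satellite example: (u, v, link cost), e.g. energy or delay.
links = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 4.0), (2, 3, 1.5), (1, 3, 5.0)]
tree = minimum_spanning_tree(4, links, root=0)

local_models = {0: [1.0, 0.0], 1: [2.0, 1.0], 2: [3.0, 2.0], 3: [4.0, 3.0]}
num_samples = {0: 10, 1: 20, 2: 30, 3: 40}
global_model = spanning_aggregate(tree, 0, local_models, num_samples)
```

In this sketch, dispatching the aggregated model back to the satellites would traverse the same tree in the opposite direction, from the root to the leaves.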
