Aggregation on Learnable Manifolds for Asynchronous Federated Optimization

Archie Licudi, Anshul Thakur, Soheila Molaei, Danielle Belgrave, David Clifton

TL;DR

This work tackles asynchronous federated optimization under heterogeneous client data by reframing aggregation as curve learning on a Riemannian manifold. It introduces the AsyncManifold framework, within which AsyncBezier learns Bezier aggregation paths and OrthoDC corrects stale, conflicting updates, and it proves convergence under simple assumptions on these components. Empirical results on FEMNIST, LEAF Shakespeare, and CXR8 show improved accuracy and client fairness over asynchronous baselines, even when the competing methods are given a higher local compute budget. The approach targets geometry-informed collaboration in privacy-preserving, real-world deployments, including healthcare settings.

Abstract

Asynchronous federated learning (FL) with heterogeneous clients faces two key issues: curvature-induced loss barriers encountered by standard linear parameter interpolation techniques (e.g. FedAvg) and interference from stale updates misaligned with the server's current optimisation state. To alleviate these issues, we introduce a geometric framework that casts aggregation as curve learning in a Riemannian model space and decouples trajectory selection from update conflict resolution. Within this, we propose AsyncBezier, which replaces linear aggregation with low-degree polynomial (Bezier) trajectories to bypass loss barriers, and OrthoDC, which projects delayed updates via inner product-based orthogonality to reduce interference. We establish framework-level convergence guarantees covering each variant given simple assumptions on their components. On three datasets spanning general-purpose and healthcare domains, including LEAF Shakespeare and FEMNIST, our approach consistently improves accuracy and client fairness over strong asynchronous baselines; finally, we show that these gains are preserved even when other methods are allocated a higher local compute budget.
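The two ideas named in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not confirmed by this summary: parameters are flat lists of floats, the inner product is the plain Euclidean one, the Bezier control point is supplied by hand rather than learned, and the conflict rule (remove the component of a stale update that opposes a reference direction) is only one plausible reading of "inner product-based orthogonality". The function names `quadratic_bezier` and `project_out_conflict` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def quadratic_bezier(w_start: Sequence[float],
                     w_control: Sequence[float],
                     w_end: Sequence[float],
                     t: float) -> List[float]:
    """Point at t in [0, 1] on the quadratic Bezier curve through parameter space.

    The curve starts at w_start (t=0), ends at w_end (t=1), and is bent
    towards w_control. If w_control is the midpoint of the endpoints, the
    curve reduces to ordinary linear interpolation.
    """
    a = (1.0 - t) ** 2
    b = 2.0 * (1.0 - t) * t
    c = t ** 2
    return [a * s + b * m + c * e for s, m, e in zip(w_start, w_control, w_end)]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product (an assumption; the paper works with a Riemannian metric)."""
    return sum(x * y for x, y in zip(u, v))


def project_out_conflict(stale_update: Sequence[float],
                         reference: Sequence[float]) -> List[float]:
    """Remove the part of a stale update that points against `reference`.

    If the inner product is non-negative, the update is returned unchanged.
    Otherwise the component along `reference` is subtracted, so the result
    is orthogonal to `reference`.
    """
    ref_sq = dot(reference, reference)
    if ref_sq == 0.0:
        return list(stale_update)
    overlap = dot(stale_update, reference)
    if overlap >= 0.0:
        return list(stale_update)
    scale = overlap / ref_sq
    return [s - scale * r for s, r in zip(stale_update, reference)]


if __name__ == "__main__":
    # Hypothetical two-parameter models: the control point lifts the path
    # away from the straight line joining the endpoints.
    print(quadratic_bezier([0.0, 0.0], [1.0, 2.0], [2.0, 0.0], 0.5))
    # A stale update that opposes the reference direction along the second axis.
    print(project_out_conflict([1.0, -1.0], [0.0, 1.0]))
```

In this sketch the curve choice and the conflict handling are separate functions, mirroring the decoupling of trajectory selection from update conflict resolution described in the abstract.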

Paper Structure

This paper contains 24 sections, 2 theorems, 45 equations, 5 figures, 3 tables.

Key Result

Theorem 1

The AsyncManifold algorithm, with no SWA, under the assumptions above, and with local learning rate $\eta_l = \mathcal{O}(1/\max\{2C_1,\sqrt{T}\})$, converges with a bound whose constants $C_1,C_2,C_3$ are as defined in the proof.
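The learning-rate condition in the theorem can be made concrete with a small sketch. The $\mathcal{O}(\cdot)$ notation hides a constant factor, so the `scale` argument below is a hypothetical stand-in for it, and the numeric value used for $C_1$ is invented purely for illustration. $T$ is not defined in this summary; it is presumably the number of training rounds.

```python
import math


def local_learning_rate(num_rounds: int, c1: float, scale: float = 1.0) -> float:
    """Evaluate scale / max(2 * c1, sqrt(num_rounds)).

    `scale` stands in for the constant hidden by the O(.) notation and
    `c1` for the proof constant C_1; both values are assumptions here.
    """
    return scale / max(2.0 * c1, math.sqrt(num_rounds))


if __name__ == "__main__":
    # With c1 = 2, the cap 2 * c1 = 4 dominates until sqrt(T) exceeds it
    # (that is, for T <= 16); afterwards the rate decays as 1 / sqrt(T).
    for rounds in (4, 16, 100, 10_000):
        print(rounds, local_learning_rate(rounds, c1=2.0))
```

The `max` keeps the step size bounded by $1/(2C_1)$ for short runs, while the $\sqrt{T}$ term takes over for long runs.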

Figures (5)

  • Figure 1: Quadratic Bezier mode connections learned during the federated training of LeNet-5, projected onto a 2-d loss landscape. Plot (a) shows cross-entropy loss w.r.t. a local training set and (b) w.r.t. the global test set.
  • Figure 2: Illustration of our approach to manifold learning. $\mathcal{M} = D_1(\mathbb{R}^2)$ maps into parameter space $\mathcal{M}_\Theta = \mathbb{R}^3$ by the learned embedding. $\iota_\phi(\mathcal{M})$ inherits a Riemannian structure from $\mathcal{M}_\Theta$ via the subspace metric, distorted by the loss-minimising nature of $\iota_\phi$, which is in turn isometric to a retraction of $\mathcal{M}$ equipped with the pullback metric (in this illustration, the retraction $\rho = id$). The curvature of this $\mathcal{M}_\phi$ space thus induces a lower-loss curved path in $\mathcal{M}$, and hence $\mathcal{M}_\Theta$ under the embedding.
  • Figure 3: Bar plots of (unweighted) Gini Coefficient and Theil Index computed for each method over the model performance on each client's validation set.
  • Figure 4: Accuracy of each method on FEMNIST after 360 communication rounds, by local epoch count.
  • Figure 5: CNN Architecture for FEMNIST
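
The fairness metrics named in the Figure 3 caption can be computed from per-client scores as in the following sketch. It uses the standard unweighted definitions: the Gini coefficient as the mean absolute difference divided by twice the mean, and the Theil T index. Which Theil variant the paper uses is not stated in this summary, so the choice of Theil T is an assumption, and the per-client accuracies in the example are made up.

```python
import math
from typing import Sequence


def gini_coefficient(values: Sequence[float]) -> float:
    """Unweighted Gini coefficient: sum_ij |x_i - x_j| / (2 * n^2 * mean).

    0 means every client has the same score; larger values mean more inequality.
    """
    n = len(values)
    mean = sum(values) / n
    if mean == 0.0:
        return 0.0
    total_abs_diff = sum(abs(a - b) for a in values for b in values)
    return total_abs_diff / (2.0 * n * n * mean)


def theil_index(values: Sequence[float]) -> float:
    """Theil T index: (1/n) * sum_i (x_i / mean) * ln(x_i / mean).

    Terms with x_i = 0 contribute 0 (the limit of r * ln r as r -> 0).
    Assumes non-negative values with a positive mean.
    """
    n = len(values)
    mean = sum(values) / n
    total = 0.0
    for v in values:
        if v > 0.0:
            ratio = v / mean
            total += ratio * math.log(ratio)
    return total / n


if __name__ == "__main__":
    # Hypothetical per-client validation accuracies for two methods.
    even = [0.80, 0.82, 0.81, 0.79]
    uneven = [0.95, 0.90, 0.60, 0.40]
    print(gini_coefficient(even), theil_index(even))
    print(gini_coefficient(uneven), theil_index(uneven))
```

Both metrics are zero when all clients perform identically, so lower bars in such a plot indicate a more even spread of performance across clients.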

Theorems & Definitions (8)

  • Definition 1: Riemannian Gradient
  • Definition 2: Exponential Map
  • Definition 3: Metric-Preserving Transport
  • Remark
  • Theorem 1: Convergence of AsyncManifold
  • proof
  • Theorem 1: Convergence of AsyncManifold
  • proof : Proof of Theorem 1