SGFusion: Stochastic Geographic Gradient Fusion in Federated Learning

Khoa Nguyen, Khang Tran, NhatHai Phan, Cristian Borcea, Ruoming Jin, Issa Khalil

TL;DR

SGFusion tackles non-IID, geography-driven federated learning by modeling inter-zone data correlations with a hierarchical random graph (HRG) optimized via Markov Chain Monte Carlo (MCMC). The HRG is constructed from DP-preserving zone histograms. At each training step, every zone fuses its gradient with gradients from a small, stochastically sampled set of other zones, weighted by self-attention scores that reflect zone similarity. Theoretical guarantees give a convergence bound that decays with the number of training rounds $T$ under standard convexity and Lipschitz conditions, with complexity dominated by one-time HRG construction and linear per-round updates across zones. Empirically, SGFusion yields significant zone- and country-level utility improvements on a heart-rate prediction dataset collected across six countries, while keeping computation and communication costs scalable. The approach offers a practical path to scalable, personalized FL for mobile sensing that accounts for privacy.

Abstract

This paper proposes Stochastic Geographic Gradient Fusion (SGFusion), a novel training algorithm to leverage the geographic information of mobile users in Federated Learning (FL). SGFusion maps the data collected by mobile devices onto geographical zones and trains one FL model per zone, which adapts well to the data and behaviors of users in that zone. SGFusion models the local data-based correlation among geographical zones as a hierarchical random graph (HRG) optimized by Markov Chain Monte Carlo sampling. At each training step, every zone fuses its local gradient with gradients derived from a small set of other zones sampled from the HRG. This approach enables knowledge fusion and sharing among geographical zones in a probabilistic and stochastic gradient fusion process with self-attention weights, such that "more similar" zones have "higher probabilities" of sharing gradients with "larger attention weights." SGFusion remarkably improves model utility without introducing undue computational cost. Extensive theoretical and empirical results using a heart-rate prediction dataset collected across 6 countries show that models trained with SGFusion converge with upper-bounded expected errors and significantly improve utility in all countries compared to existing approaches without notable cost in system scalability.
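The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sharing probabilities are passed in as a plain matrix `probs` (a stand-in for values that SGFusion derives from the HRG), zones are sampled with independent Bernoulli draws, and the attention weights are assumed to be a softmax over scaled dot-product similarity between gradients. The function name `fuse_gradients` and all parameter names are hypothetical.

```python
import numpy as np

def fuse_gradients(grads, probs, z, rng):
    """Fuse zone z's gradient with gradients of stochastically sampled zones.

    grads: (Z, D) array, one local gradient per zone.
    probs: (Z, Z) array; probs[z, k] is an assumed probability that zone k
           shares its gradient with zone z (stand-in for HRG-derived values).
    """
    num_zones, dim = grads.shape
    # Assumption: each other zone is sampled by an independent Bernoulli draw.
    sampled = [k for k in range(num_zones)
               if k != z and rng.random() < probs[z, k]]
    members = [z] + sampled
    # Assumption: attention = softmax over scaled dot-product similarity to zone z.
    scores = np.array([grads[z] @ grads[k] for k in members]) / np.sqrt(dim)
    scores -= scores.max()  # numerical stability before exponentiation
    weights = np.exp(scores) / np.exp(scores).sum()
    fused = weights @ grads[members]  # weighted sum of the selected gradients
    return fused, members, weights

# Usage: one fused update for zone 0 with made-up gradients.
rng = np.random.default_rng(0)
grads = rng.normal(size=(4, 3))
probs = np.full((4, 4), 0.5)
theta = np.zeros(3)
fused, members, weights = fuse_gradients(grads, probs, z=0, rng=rng)
theta = theta - 0.1 * fused
```

With all sharing probabilities set to zero the sketch falls back to a plain local gradient step, which makes the effect of the sampled zones easy to isolate when experimenting.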

Paper Structure

This paper contains 14 sections, 1 theorem, 27 equations, 10 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

Let $\theta^T_z$ be the output of the training algorithm. If the learning rate is $\eta_t = \frac{1}{\mu t}$ and the assumptions from strong convexity through bounded deviation are satisfied, then the excess risk $\mathbb{E}[F_z(\theta^T_z)] - F_z(\theta^*_z)$ is bounded by a quantity that depends on $\bar{G}$, where $\bar{G} = G^2\left[1 + 2\sum_{z' \neq z}p_{z,z'} + \sum_{z' \neq z} p_{z',z}(1-p_{z,z'}) + \left(\sum_{z' \neq z}p_{z,z'}\right)^2\right]$, and $p_{z,z'}$ is the ...
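As a worked illustration of the constant $\bar{G}$ in the theorem statement, the sketch below evaluates the formula for given pairwise probabilities. The function name `g_bar` and the list-based inputs are hypothetical; `G` is treated simply as the constant appearing in the formula, and the probabilities are taken as given inputs.

```python
def g_bar(G, p_out, p_in):
    """Evaluate G-bar for one zone z.

    p_out[k] holds p_{z,z'} and p_in[k] holds p_{z',z} for the k-th other zone z'.
    """
    s = sum(p_out)  # sum over z' of p_{z,z'}
    cross = sum(pi * (1.0 - po) for po, pi in zip(p_out, p_in))
    return G ** 2 * (1.0 + 2.0 * s + cross + s ** 2)

# One other zone with p_{z,z'} = p_{z',z} = 0.5 and G = 2:
# 4 * (1 + 1 + 0.25 + 0.25) = 10.0
print(g_bar(2.0, [0.5], [0.5]))
```

When every probability is zero, the bracket collapses to 1 and $\bar{G} = G^2$; larger sharing probabilities increase $\bar{G}$.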

Figures (10)

  • Figure 1: SGFusion with geographical zones.
  • Figure 2: Dendrogram $\mathcal{T}$ with 16 zones in Poland (as shown in Figure 1) using Euclidean distance.
  • Figure 3: Given a current state of an internal node $r$, there are only two possible candidate states, which are the result of $\alpha$-transition and $\beta$-transition on the node $r$.
  • Figure 4: Probabilistic dendrogram of zone "West Pomeranian" in Poland derived from Figure 2.
  • Figure 5: Data label distribution $\mathcal{Y}_u$ of a particular user $u$.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Theorem 1
  • proof