Efficient Federated Learning against Heterogeneous and Non-stationary Client Unavailability

Ming Xiang, Stratis Ioannidis, Edmund Yeh, Carlee Joe-Wong, Lili Su

TL;DR

The paper proposes FedAPM, whose novel algorithmic structures compensate for computations missed due to client unavailability with only $O(1)$ additional memory and computation relative to standard FedAvg. It shows that FedAPM converges to a stationary point even for non-convex objectives while achieving the desired linear speedup property.

Abstract

Addressing intermittent client availability is critical for the real-world deployment of federated learning algorithms. Most prior work either overlooks the potential non-stationarity in the dynamics of client unavailability or requires substantial memory/computation overhead. We study federated learning in the presence of heterogeneous and non-stationary client availability, which may occur when the deployment environments are uncertain, or the clients are mobile. The impacts of heterogeneity and non-stationarity on client unavailability can be significant, as we illustrate using FedAvg, the most widely adopted federated learning algorithm. We propose FedAPM, which includes novel algorithmic structures that (i) compensate for missed computations due to unavailability with only $O(1)$ additional memory and computation with respect to standard FedAvg, and (ii) evenly diffuse local updates within the federated learning system through implicit gossiping, despite being agnostic to non-stationary dynamics. We show that FedAPM converges to a stationary point of even non-convex objectives while achieving the desired linear speedup property. We corroborate our analysis with numerical experiments over diversified client unavailability dynamics on real-world data sets.

Paper Structure (54 sections, 14 theorems, 110 equations, 8 figures, 10 tables, 1 algorithm)

Key Result

Proposition 1

If $\mathds{1}_{\{i \in \mathcal{A}^{R-1}\}} = 1$, it holds that $\sum_{t=0}^{R-1} \mathds{1}_{\{i \in \mathcal{A}^t\}} \left( t - \tau_i(t) \right) = R$ for all $R \ge 1$.
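The identity in Proposition 1 can be checked numerically. The sketch below assumes that $\tau_i(t)$ denotes the most recent round before $t$ in which client $i$ was available, with $\tau_i(t) = -1$ if there is no such round; this definition is not stated in the excerpt above and is an assumption of the example. Under that convention the sum telescopes over the available rounds, which is what the code verifies on random availability patterns. The names `gap_sum` and `check_proposition` are hypothetical helpers introduced only for illustration.

```python
import random


def gap_sum(available, R):
    """Return sum_{t=0}^{R-1} 1{available[t]} * (t - tau(t)).

    ASSUMPTION: tau(t) is the most recent round before t in which the
    client was available, and tau(t) = -1 if no such round exists.
    """
    total = 0
    last_seen = -1  # assumed convention: client never seen before round 0
    for t in range(R):
        if available[t]:
            total += t - last_seen  # gap since the previous available round
            last_seen = t
    return total


def check_proposition(R, p, rng):
    """Draw a random availability pattern and test the identity for it."""
    available = [rng.random() < p for _ in range(R)]
    available[R - 1] = True  # hypothesis: client is available in round R-1
    return gap_sum(available, R) == R


rng = random.Random(0)
results = [
    check_proposition(R, p, rng)
    for R in range(1, 50)
    for p in (0.1, 0.5, 0.9)
]
print(all(results))
```

The gaps between consecutive available rounds add up to the index of the last available round plus one, so forcing availability in round $R-1$ is what makes the sum equal $R$.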

Figures (8)

  • Figure 1: Client $i$'s availability probabilities $p_i^t$ are heterogeneous and subject to non-stationary dynamics.
  • Figure 2: Let $x_{\text{output}}\triangleq \lim_{t\to\infty} \mathbb{E}\left[ x^t \right]$. Under most of the choices of $p_1, p_2$, $x_{\text{output}}$ is far from $x^*$.
  • Figure 3: Train and test accuracy results in percentage (%). The parameter $\gamma$ signifies the degree of non-stationarity. As client availability becomes more non-stationary (a larger $\gamma$), FedAvg experiences a significant drop in accuracy. For example, both the train and test accuracies drop by over $10\%$ when $p=0.1$ and $\gamma$ increases from $0.1$ to $0.5$.
  • Figure 4: An example of data heterogeneity using a $\mathsf{Dirichlet}(\alpha=0.1)$ distribution with $20$ clients. The $x$-axis denotes the image categories, while the $y$-axis denotes the client index. The size of a circle indicates the proportion of images in a given class. The color of a circle distinguishes images of different categories.
  • Figure 5: A histogram of one generated example of the $p_i$'s with a total of $m = 100$ clients. The majority of the $p_i$'s are below $0.5$.
  • ...and 3 more figures
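
The bias described in the Figure 2 caption can be illustrated with a small calculation. The sketch below uses a hypothetical toy setup that is not necessarily the one behind Figure 2: two clients with scalar objectives $F_i(x) = \tfrac{1}{2}(x - u_i)^2$, independent availability with fixed probabilities $p_1, p_2$, one local gradient step per round, and a server that averages only over the clients available in that round (keeping $x$ unchanged if none are). The function names, the step size `eta`, and the values of `u1`, `u2` are all assumptions made for illustration.

```python
def expected_fedavg_limit(p1, p2, u1, u2):
    """Closed-form limit of E[x^t] in the assumed two-client toy setup."""
    # Expected aggregation weight of client i: E[1{i available} / |A|].
    w1 = p1 * (1 - p2) + 0.5 * p1 * p2
    w2 = p2 * (1 - p1) + 0.5 * p1 * p2
    return (w1 * u1 + w2 * u2) / (w1 + w2)


def iterate_expected(p1, p2, u1, u2, eta=0.1, rounds=2000, x0=0.0):
    """Iterate the recursion satisfied by E[x^t] in the same toy setup."""
    w1 = p1 * (1 - p2) + 0.5 * p1 * p2
    w2 = p2 * (1 - p1) + 0.5 * p1 * p2
    x = x0
    for _ in range(rounds):
        # Expected averaged gradient step over the available clients.
        x = x - eta * (w1 * (x - u1) + w2 * (x - u2))
    return x


u1, u2 = 0.0, 1.0
x_star = 0.5 * (u1 + u2)  # minimizer of the average of F_1 and F_2

for p1, p2 in [(0.5, 0.5), (0.9, 0.1), (0.2, 0.8)]:
    x_out = expected_fedavg_limit(p1, p2, u1, u2)
    print(f"p1={p1}, p2={p2}: limit={x_out:.4f}, gap to x*={abs(x_out - x_star):.4f}")
```

In this toy setup the limit coincides with $x^*$ only when $p_1 = p_2$; otherwise it is pulled toward the minimizer of the more frequently available client.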

Theorems & Definitions (29)

  • Example 1: Heterogeneity
  • Proposition 1
  • Lemma 1: [nedic2017achieving, nedic2018network, wang2021cooperative]
  • Definition 1
  • Lemma 2: Unavailability statistics
  • Lemma 3: Descent Lemma
  • Proposition 2: Approximation error
  • Lemma 4: [xiang2023towards]
  • Theorem 1: Convergence error of ${\bm z}_i^t$
  • Corollary 1: Convergence rate of ${\bm x}_i^t$
  • ...and 19 more