Communication-Efficient Device Scheduling for Federated Learning Using Lyapunov Optimization

Jake B. Perazzone, Shiqiang Wang, Mingyue Ji, Kevin Chan

TL;DR

The paper addresses federated learning over constrained wireless networks where device participation is intermittent and data across devices are non-i.i.d. It derives a convergence bound for non-convex losses under arbitrary participation probabilities and designs an online Lyapunov drift-plus-penalty policy that jointly schedules devices and allocates transmit power without requiring channel statistics. The bound achieves linear speedup and, in CIFAR-10 experiments with 100 clients and heterogeneous channels, the method delivers up to 8.5x reductions in wall-clock training time compared with uniform participation. In practice, this enables more efficient FL deployments in mobile edge computing and IoT settings by balancing convergence guarantees against communication efficiency.
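
To make the drift-plus-penalty idea concrete, here is a minimal single-device sketch. It is not the paper's algorithm, which jointly optimizes participation probabilities and power. It assumes a hypothetical Shannon-rate upload model, a small discrete set of power levels, and exponentially distributed channel gains. The names `comm_time`, `choose_power`, `run`, and every numeric value are illustrative assumptions. The point it shows is that each decision uses only the instantaneous channel gain and a virtual queue, never the channel distribution.

```python
import math
import random


def comm_time(power, gain, bits=1e6, bandwidth=1e6, noise=1.0):
    """Upload time in seconds at the Shannon rate B*log2(1 + p*h/N0) (assumed model)."""
    return bits / (bandwidth * math.log2(1.0 + power * gain / noise))


def choose_power(gain, queue, levels, v):
    """Pick the power level that minimizes V*time + Z*power for this round."""
    return min(levels, key=lambda p: v * comm_time(p, gain) + queue * p)


def run(rounds=5000, p_avg=1.0, v=10.0, seed=0):
    """Simulate the policy; return (time-averaged power, final virtual queue)."""
    rng = random.Random(seed)
    levels = [0.25, 0.5, 1.0, 2.0, 4.0]  # hypothetical power levels
    queue, used = 0.0, 0.0
    for _ in range(rounds):
        # Only the instantaneous gain is observed; its distribution is never used.
        gain = rng.expovariate(1.0) + 1e-3
        p = choose_power(gain, queue, levels, v)
        used += p
        # The virtual queue grows when power exceeds the average budget.
        queue = max(queue + p - p_avg, 0.0)
    return used / rounds, queue


avg_power, final_queue = run()
print(avg_power, final_queue)
```

Because the queue starts at zero and grows by at least `p - p_avg` each round, the average power is at most `p_avg + final_queue / rounds`. In the standard drift-plus-penalty analysis, a larger `v` puts more weight on communication time and lets the queue grow larger before the power constraint takes effect.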

Abstract

Federated learning (FL) is a useful tool that enables the training of machine learning models over distributed data without having to collect data centrally. When deploying FL in constrained wireless environments, however, intermittent connectivity of devices, heterogeneous connection quality, and non-i.i.d. data can severely slow convergence. In this paper, we consider FL with arbitrary device participation probabilities for each round and show that by weighting each device's update by the reciprocal of its per-round participation probability, we can guarantee convergence to a stationary point. Our bound applies to non-convex loss functions and non-i.i.d. datasets and recovers state-of-the-art convergence rates for both full and uniform partial participation, including linear speedup, with only a single-sided learning rate. Then, using the derived convergence bound, we develop a new online client selection and power allocation algorithm that utilizes the Lyapunov drift-plus-penalty framework to opportunistically minimize a function of the convergence bound and the average communication time under a transmit power constraint. We use optimization-over-manifolds techniques to obtain a solution to the minimization problem. Thanks to the Lyapunov framework, one key feature of the algorithm is that knowledge of the channel distribution is not required and only the instantaneous channel state information needs to be known. Using the CIFAR-10 dataset with varying levels of data heterogeneity, we show through simulations that the communication time can be significantly decreased using our algorithm compared to uniformly random participation, especially for heterogeneous channel conditions.
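
The reciprocal-probability weighting described in the abstract can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not the paper's implementation: updates are scalars, devices participate independently, and the function name `aggregate` and all numeric values are hypothetical. It shows that scaling each received update by one over its participation probability makes the aggregate match the full-participation average in expectation.

```python
import random


def aggregate(updates, probs, rng):
    """One round: device n participates with probability probs[n].

    Each received update is scaled by 1/probs[n], and the sum is divided
    by the total number of devices (not by the number that participated).
    """
    total = 0.0
    for delta, q in zip(updates, probs):
        if rng.random() < q:
            total += delta / q
    return total / len(updates)


rng = random.Random(0)
updates = [1.0, -2.0, 0.5, 4.0]  # hypothetical scalar model updates
probs = [0.9, 0.2, 0.5, 0.1]     # hypothetical participation probabilities

full_average = sum(updates) / len(updates)
rounds = 200_000
empirical = sum(aggregate(updates, probs, rng) for _ in range(rounds)) / rounds
print(full_average, empirical)
```

The two printed values should be close. Devices with small participation probability contribute rarely but with a large weight, which keeps the estimate unbiased at the cost of higher variance. This is one way to see why a minimum participation probability appears in the convergence condition.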

Paper Structure

This paper contains 20 sections, 5 theorems, 48 equations, 8 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Let Assumptions asmp:Lsmooth--asmp:boundedGradDiv hold with $\gamma$, $T$, $K$, $N$, and $q_n^t$ defined as above. Then, if $\gamma \leq \frac{q_\text{min}}{8LK}$, where we assume the existence of a minimum participation probability $q_\text{min}$ such that $q_\text{min} \leq q_n^t$ for al where $\Phi_1 = \frac{1}{c}5 \gamma^2 K L^2\left(\nu^2+6K\epsilon^2 \right)$, $\Phi_2 = \frac{2L\ga
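
The learning-rate condition $\gamma \leq \frac{q_\text{min}}{8LK}$ in Theorem 1 is easy to evaluate numerically. The helper below is a small sketch that assumes $L$ is the smoothness constant and $K$ the number of local steps, as the assumption names suggest. The excerpt above does not define them, and the function name and example values are hypothetical.

```python
def max_learning_rate(q_min, smoothness, local_steps):
    """Largest step size allowed by the condition gamma <= q_min / (8 * L * K)."""
    if not 0.0 < q_min <= 1.0:
        raise ValueError("q_min must be a probability in (0, 1]")
    if smoothness <= 0 or local_steps <= 0:
        raise ValueError("smoothness and local_steps must be positive")
    return q_min / (8.0 * smoothness * local_steps)


# Hypothetical values: L = 1, K = 5 local steps.
for q_min in (1.0, 0.5, 0.1):
    print(q_min, max_learning_rate(q_min, smoothness=1.0, local_steps=5))
```

The admissible step size shrinks linearly with `q_min`, so a device that participates only rarely forces a smaller learning rate for every device.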

Figures (8)

  • Figure 1: Block diagram of the uplink communication in federated learning over a wireless network.
  • Figure 2: The optimal number of devices chosen each round depends on computation time.
  • Figure 3: Comparison of total communication time for uniform selection vs proposed algorithm on CIFAR-10 dataset.
  • Figure 4: Convergence for heterogeneous data.
  • Figure 5: Convergence over communication rounds/iterations.
  • ...and 3 more figures

Theorems & Definitions (10)

  • Theorem 1
  • proof
  • Corollary 1
  • proof
  • Corollary 2
  • proof
  • Theorem 2
  • proof
  • Lemma 1: reddi2020adaptive
  • proof