Device Sampling and Resource Optimization for Federated Learning in Cooperative Edge Networks

Su Wang, Roberto Morabito, Seyyedali Hosseinalipour, Mung Chiang, Christopher G. Brinton

TL;DR

This paper tackles federated learning at the wireless edge under heterogeneous compute/communication resources and overlapping local data by introducing a joint sampling and D2D data offloading framework. It develops theoretical convergence bounds for the offloading subproblem and solves it via a sequential convex optimizer, then learns an effective sampling strategy with a graph-convolutional network that accounts for network structure and data similarity. Empirical results on MNIST/Fashion-MNIST and a real IoT testbed show that the proposed method improves FedL accuracy, accelerates convergence, and reduces data processing and energy consumption compared to baselines, approaching or surpassing FedL with all nodes in some scenarios. This work enables scalable, resource-efficient FedL in large-scale cooperative edge networks while accommodating privacy considerations and data diversity through controlled offloading.

Abstract

The conventional federated learning (FedL) architecture distributes machine learning (ML) across worker devices by having them train local models that are periodically aggregated by a server. FedL ignores two important characteristics of contemporary wireless networks, however: (i) the network may contain heterogeneous communication/computation resources, and (ii) there may be significant overlaps in devices' local data distributions. In this work, we develop a novel optimization methodology that jointly accounts for these factors via intelligent device sampling complemented by device-to-device (D2D) offloading. Our optimization methodology aims to select the best combination of sampled nodes and data offloading configuration to maximize FedL training accuracy while minimizing data processing and D2D communication resource consumption subject to realistic constraints on the network topology and device capabilities. Theoretical analysis of the D2D offloading subproblem leads to new FedL convergence bounds and an efficient sequential convex optimizer. Using these results, we develop a sampling methodology based on graph convolutional networks (GCNs) which learns the relationship between network attributes, sampled nodes, and D2D data offloading to maximize FedL accuracy. Through evaluation on popular datasets and real-world network measurements from our edge testbed, we find that our methodology outperforms popular device sampling methodologies from literature in terms of ML model performance, data processing overhead, and energy consumption.
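The sampled FedL loop that the abstract describes (local training on a subset of devices, followed by periodic server aggregation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's setup: the model is a single scalar weight fitted by least squares, the data are synthetic and noise-free, the sampled set is hand-picked rather than chosen by the GCN-based method, and names such as `local_update`, `fedl_round`, and `make_devices` are hypothetical.

```python
import random


def local_update(w, data, eta, tau):
    """Run tau gradient steps on the local loss 0.5 * (w * x - y)^2, averaged over data."""
    for _ in range(tau):
        grad = sum((w * x - y) * x for x, y in data) / len(data)
        w -= eta * grad
    return w


def fedl_round(w_global, devices, sampled, eta=0.5, tau=5):
    """One global aggregation: only sampled devices train; the server
    averages their local models weighted by local dataset size."""
    total = sum(len(devices[i]) for i in sampled)
    return sum(
        len(devices[i]) / total * local_update(w_global, devices[i], eta, tau)
        for i in sampled
    )


def make_devices(num_devices=5, points=20, true_w=3.0, seed=0):
    """Synthetic, noise-free local datasets (an illustrative assumption)."""
    rng = random.Random(seed)
    devices = []
    for _ in range(num_devices):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(points)]
        devices.append([(x, true_w * x) for x in xs])
    return devices


devices = make_devices()
sampled = [0, 2]  # hypothetical sampled set: the server trains on 2 of 5 devices
w = 0.0
for _ in range(100):  # 100 global aggregations, tau local steps each
    w = fedl_round(w, devices, sampled)
print(f"global weight after training: {w:.4f}")
```

Because the synthetic devices all share the same underlying relationship, two sampled devices are enough to recover the weight here. The paper's setting is harder precisely because local distributions overlap only partially and device resources differ, which is what the sampling and offloading optimization addresses.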

Paper Structure

This paper contains 33 sections, 3 theorems, 27 equations, 14 figures, 3 tables, and 1 algorithm.

Key Result

Theorem 1

Assuming $\eta \leq \beta^{-1}$, the upper bound on the difference between $\mathbf{w}_{\mathcal{S}}(t)$ and $\mathbf{v}_k(t)$ within the local update period before the $k$-th global aggregation, $t \in \{(k-1)\tau+1,\ldots,k\tau\}$, is given by: [...] where $\Upsilon(y,k) \triangleq \delta_{\mathcal{S}}(y)\,(2^{y-1-(k-1)\tau}-1)$, and [...]
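To get a feel for the term $\Upsilon(y,k)$ defined in Theorem 1, the sketch below evaluates it over one local update period. The values are assumptions chosen for illustration only: $\delta_{\mathcal{S}}(y)$ is treated as a constant 0.5 (the excerpt does not give its value), and $\tau = 4$, $k = 2$ are arbitrary. The function name `upsilon` is hypothetical.

```python
def upsilon(y, k, tau, delta_s):
    """Evaluate Upsilon(y, k) = delta_S(y) * (2^(y - 1 - (k - 1) * tau) - 1)."""
    return delta_s(y) * (2 ** (y - 1 - (k - 1) * tau) - 1)


tau, k = 4, 2                      # assumed local period length and aggregation index
steps = range((k - 1) * tau + 1, k * tau + 1)   # t in {(k-1)tau + 1, ..., k tau}
constant_delta = lambda y: 0.5     # assumed constant stand-in for delta_S(y)

values = [upsilon(y, k, tau, constant_delta) for y in steps]
for y, v in zip(steps, values):
    print(y, v)
# 5 0.0
# 6 0.5
# 7 1.5
# 8 3.5
```

Under these assumed values the term is zero at the first step after an aggregation and roughly doubles with each further local step, which matches the exponent $y-1-(k-1)\tau$ in the definition.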

Figures (14)

  • Figure 1: Architecture of conventional federated learning (FedL).
  • Figure 2: A motivating example of a wireless network composed of 5 connected vehicles and an edge server. The server can only sample two vehicles to participate in FedL training.
  • Figure 3: Overview of the joint sampling and offloading methodology developed in Secs. \ref{s:p1} and \ref{s:p2}. During model construction, our methodology trains a GCN, using various network realizations and sampled sets of nodes with the data offloading optimization from Sec. \ref{s:p1}. At the implementation stage, the target network uses the GCN-based algorithm developed in Sec. \ref{s:p2} to obtain a sampled set of devices, which then undergo the D2D data offloading optimization process. Finally, we apply the results of the sampling and D2D data offloading processes for FedL, yielding a global ML model after training completion.
  • Figure 4: Architecture of our GCN-branch sampling algorithm. Given a target network, GCN-branch extracts node and edge features, passing them through two GCN layers, each of which convolves features in local neighborhoods. This process returns raw output probabilities that we filter, based on data quantity, connectivity-similarity matrix, and centroid differences, to obtain a sampled set of nodes in the interpretation stage. With the resulting sampled set, we perform the D2D data offloading process from Sec. \ref{s:p1}, yielding an output ML model training process for FedL.
  • Figure 5: IoT testbed used to generate device and link characteristics.
  • ...and 9 more figures
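
The Figure 4 caption describes two GCN layers that convolve features over local neighborhoods and return raw per-node probabilities. The sketch below shows what such a forward pass can look like. It is not the paper's GCN-branch architecture: it assumes the widely used symmetric-normalized propagation rule $\sigma(\hat{D}^{-1/2}(A+I)\hat{D}^{-1/2} H W)$, uses node features only, and omits the filtering stage. The graph, features, and untrained weights are all hypothetical.

```python
import math


def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(a_norm, h, w, activation):
    """One graph convolution: mix neighbor features, apply weights, then activate."""
    z = matmul(matmul(a_norm, h), w)
    return [[activation(v) for v in row] for row in z]


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


# Hypothetical 4-device path topology 0-1-2-3 with two features per device.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
features = [[1.0, 0.2], [0.5, 0.9], [0.3, 0.4], [0.8, 0.1]]
w1 = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.2]]   # untrained, illustrative weights
w2 = [[0.6], [-0.4], [0.9]]

a_norm = normalized_adjacency(adj)
hidden = gcn_layer(a_norm, features, w1, relu)
probs = [row[0] for row in gcn_layer(a_norm, hidden, w2, sigmoid)]
print(probs)  # one raw sampling score in (0, 1) per device
```

After two layers, each device's score depends on features up to two hops away, which is how neighborhood structure can inform a sampling decision. In the paper these raw scores are further filtered by data quantity, the connectivity-similarity matrix, and centroid differences.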

Theorems & Definitions (8)

  • Definition 1: Difference between sampled and unsampled gradients
  • Definition 2: Difference between sampled and unsampled gradients
  • Theorem 1: Upper bound on the difference between sampled FedL and centralized learning
  • proof
  • Corollary 1: Upper bound on the difference between sampled FedL and the optimal
  • proof
  • Proposition 1: Upper bound on the difference between local gradients
  • proof