EcoFed: Efficient Communication for DNN Partitioning-based Federated Learning

Di Wu, Rehmat Ullah, Philip Rodgers, Peter Kilpatrick, Ivor Spence, Blesson Varghese

TL;DR

EcoFed is a communication-efficient framework for DPFL systems. It eliminates transmission of the gradient by introducing, for the first time, pre-trained initialization of the DNN model on the device, and it implements a quantization-based compression technique to reduce transmission of the activation.

Abstract

Efficiently running federated learning (FL) on resource-constrained devices is challenging because each device must independently train a computationally intensive deep neural network (DNN). DNN partitioning-based FL (DPFL) has been proposed as one mechanism to accelerate training, in which layers of the DNN (and hence computation) are offloaded from the device to the server. However, this creates significant communication overheads, since the intermediate activation and gradient must be transferred between the device and the server during training. While current research reduces the communication introduced by DNN partitioning using local loss-based methods, we demonstrate that these methods are ineffective in improving the overall efficiency (communication overhead and training speed) of a DPFL system. This is because they suffer from accuracy degradation and ignore the communication costs incurred when transferring the activation from the device to the server. This article proposes EcoFed, a communication-efficient framework for DPFL systems. EcoFed eliminates the transmission of the gradient by introducing, for the first time, pre-trained initialization of the DNN model on the device. This reduces the accuracy degradation seen in local loss-based methods. In addition, EcoFed proposes a novel replay buffer mechanism and implements a quantization-based compression technique to reduce the transmission of the activation. It is experimentally demonstrated that EcoFed can reduce the communication cost by up to 133x and accelerate training by up to 21x when compared to classic FL. Compared to vanilla DPFL, EcoFed achieves a 16x communication reduction and a 2.86x training time speed-up. EcoFed is available from https://github.com/blessonvar/EcoFed.
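To illustrate the idea of quantization-based compression of the activation mentioned in the abstract, the following is a minimal sketch and not the authors' implementation (see the repository for that). It assumes a simple per-tensor uniform 8-bit scheme applied to a flat list of float32 activation values; the function names `quantize_uint8` and `dequantize_uint8` and the choice of 8 bits are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def quantize_uint8(activation: List[float]) -> Tuple[bytes, float, float]:
    """Map floats to 8-bit codes using one scale and offset per tensor.

    Assumption: uniform (affine) quantization; the paper's exact scheme
    may differ.
    """
    lo, hi = min(activation), max(activation)
    # A constant tensor would give scale 0, so fall back to 1.0.
    scale = (hi - lo) / 255.0 or 1.0
    # Clamp to 255 in case floating-point rounding overshoots the top code.
    codes = bytes(min(255, round((x - lo) / scale)) for x in activation)
    return codes, scale, lo


def dequantize_uint8(codes: bytes, scale: float, lo: float) -> List[float]:
    """Server-side reconstruction of approximate activation values."""
    return [c * scale + lo for c in codes]


# Device side: compress the activation before sending it.
activation = [0.0, 0.5, 1.0, 2.0, 4.0]
codes, scale, lo = quantize_uint8(activation)

# Server side: recover an approximation for the server-side layers.
restored = dequantize_uint8(codes, scale, lo)

max_error = max(abs(a - b) for a, b in zip(activation, restored))
# Payload size of float32 values versus one byte per value
# (the two scalars scale and lo are ignored here).
payload_ratio = (4 * len(activation)) / len(codes)
```

Under these assumptions each value shrinks from four bytes to one, and the reconstruction error per value is bounded by half a quantization step (`scale / 2`). Whether such an error is acceptable for training is an empirical question that the paper's evaluation addresses.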

Paper Structure (27 sections, 19 equations, 9 figures, 14 tables, 1 algorithm)

Figures (9)

  • Figure 1: Computation and communication latency in DPFL training under typical (upload/download) network bandwidth. The numerical value above each bar is the percentage of latency due to communication.
  • Figure 2: The training pipeline of classic FL, vanilla DPFL, local loss-based DPFL and EcoFed for three rounds of training. Classic FL transfers the entire model from the devices to the server at the end of each round. Vanilla DPFL uploads only the partitioned device-side model at the end of each round, but it also transfers the activation and gradient for every training batch. Local loss-based DPFL halves this communication since the gradients are computed locally. EcoFed reduces communication further: it transfers the activation only periodically (for example, once in two rounds) and compresses the activations.
  • Figure 3: EcoFed modules on the device and server.
  • Figure 4: Test accuracy curves of EcoFed and the baselines using VGG11 and ResNet9 in I.I.D. and Non-I.I.D. settings for CIFAR-10 and CIFAR-100 datasets.
  • Figure 5: Dynamic $\rho$ values based on Table \ref{table:dynamic_rho} in EcoFed. The results are an average of three independent runs with different random seeds.
  • ...and 4 more figures