Rate-Constrained Quantization for Communication-Efficient Federated Learning

Shayan Mohajer Hamidi; Ali Bereyhi

TL;DR

This work develops a novel quantized FL framework, called rate-constrained federated learning (RC-FED), in which the conventional entropy-constrained scalar quantization technique is deployed to quantize the gradients subject to both fidelity and data rate constraints.

Abstract

Quantization is a common approach to mitigating the communication cost of federated learning (FL). In practice, the quantized local parameters are further encoded with an entropy coding technique, such as Huffman coding, for efficient data compression. In this case, the exact communication overhead is determined by the bit rate of the encoded gradients. Recognizing this fact, this work departs from existing approaches in the literature and develops a novel quantized FL framework, called rate-constrained federated learning (RC-FED), in which the gradients are quantized subject to both fidelity and data rate constraints. We formulate this scheme as a joint optimization in which the quantization distortion is minimized while the rate of the encoded gradients is kept below a target threshold. This enables a tunable trade-off between quantization distortion and communication cost. We analyze the convergence behavior of RC-FED and show its superior performance over baseline quantized FL schemes on several datasets.
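The rate-constrained design described above can be illustrated with a generic entropy-constrained scalar quantizer. The following is a minimal sketch, not the authors' implementation. It assumes a Lagrangian form (squared error plus `lam` times code length) in place of the explicit rate threshold, approximates the entropy-coded rate by the empirical entropy of the quantization indices, and uses synthetic standard Gaussian samples as a stand-in for normalized gradient entries. The names `ecsq`, `_assign`, and `lam` are hypothetical.

```python
import math
import random
from collections import Counter


def _assign(samples, levels, probs, lam):
    """Map each sample to the level minimizing squared error + lam * code length."""
    # Levels that received no samples are dropped (their code length is infinite).
    active = [i for i, p in enumerate(probs) if p > 0]
    # Ideal code length of level i is -log2(p_i) bits.
    penalty = {i: -lam * math.log2(probs[i]) for i in active}
    return [min(active, key=lambda i: (x - levels[i]) ** 2 + penalty[i])
            for x in samples]


def ecsq(samples, num_levels=8, lam=0.1, iters=20):
    """Lloyd-style entropy-constrained scalar quantizer design (illustrative)."""
    n = len(samples)
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / num_levels
    # Start from uniformly spaced levels with equal probabilities.
    levels = [lo + (i + 0.5) * step for i in range(num_levels)]
    probs = [1.0 / num_levels] * num_levels
    for _ in range(iters):
        idx = _assign(samples, levels, probs, lam)
        counts = [0] * num_levels
        sums = [0.0] * num_levels
        for x, k in zip(samples, idx):
            counts[k] += 1
            sums[k] += x
        # Centroid update for used levels; unused levels keep their old value.
        levels = [sums[i] / counts[i] if counts[i] else levels[i]
                  for i in range(num_levels)]
        probs = [c / n for c in counts]
    # Final pass: measure distortion and empirical entropy of the indices.
    idx = _assign(samples, levels, probs, lam)
    distortion = sum((x - levels[k]) ** 2 for x, k in zip(samples, idx)) / n
    rate = -sum((c / n) * math.log2(c / n) for c in Counter(idx).values())
    return levels, distortion, rate


if __name__ == "__main__":
    random.seed(0)
    data = [random.gauss(0.0, 1.0) for _ in range(1000)]
    for lam in (0.0, 0.1, 0.5):
        _, d, r = ecsq(data, lam=lam)
        print(f"lam={lam}: distortion={d:.4f}, rate={r:.3f} bits/sample")
```

In this sketch, increasing `lam` trades higher distortion for a lower entropy-coded rate; sweeping `lam` until the rate falls below a target is one way to approximate a rate-constrained design.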

Paper Structure (21 sections, 3 theorems, 20 equations, 1 figure, 1 algorithm)

Introduction
Related Work
Contributions
Notation
Preliminaries
Universal Quantization
Source-encoded Transmission
Rate-Constrained FL
Gradient Normalization
Gradient Quantization with Constrained Rate
Iterative Optimization
Rate-constrained vs Unconstrained
Gradient Transmission
Gradient Accumulation
Convergence Analysis
...and 6 more sections

Key Result

Theorem 1

Let assumptions (A-I) to (A-IV) hold. Assume that all clients perform $e$ local iterations, and that $\eta_t = \frac{2}{\rho (t+\gamma)}$ for $\gamma = \max \{ 8L/\rho , e \} - 1$. Define the optimality gap at round $t$ as $\Delta_t = \mathbb{E} \{ f(\boldsymbol{\theta}_{t}) - f(\boldsymbol{\theta} \ldots$ [...] where $C$ is given by [...]
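The step-size schedule in Theorem 1 can be evaluated directly. The snippet below is a small sketch: the function names are hypothetical, the numeric values are illustrative only, and `L` and `rho` are taken to be the constants appearing in assumptions (A-I) to (A-IV), whose definitions are not reproduced in this summary.

```python
def gamma_offset(L, rho, e):
    """gamma = max{8L/rho, e} - 1, as stated in Theorem 1."""
    return max(8.0 * L / rho, e) - 1.0


def step_size(t, L, rho, e):
    """eta_t = 2 / (rho * (t + gamma)) for round index t."""
    return 2.0 / (rho * (t + gamma_offset(L, rho, e)))


if __name__ == "__main__":
    # Illustrative constants only; they are not taken from the paper.
    for t in (0, 10, 100):
        print(t, step_size(t, L=1.0, rho=0.5, e=5))
```

The schedule decays as $1/t$, and the offset $\gamma$ keeps the initial step size bounded when $L/\rho$ or the number of local iterations $e$ is large.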

Figures (1)

Figure 1: Test accuracy vs communication costs in Gb for RC-FED and baselines over (a) CIFAR-10, and (b) FEMNIST.

Theorems & Definitions (4)

Theorem 1
proof
Lemma 1
Lemma 2
