SignSGD with Federated Voting

Chanho Park; H. Vincent Poor; Namyoon Lee

TL;DR

The proposed signSGD-FV algorithm has a theoretical convergence guarantee even when edge devices use heterogeneous mini-batch sizes. The paper also provides a unified convergence rate analysis framework that covers both perfect and imperfect knowledge of the estimated weights at the parameter server.

Abstract

Distributed learning is commonly used to accelerate model training by harnessing the computational capabilities of multiple edge devices. In practical applications, however, communication delay becomes a bottleneck because of the substantial information exchange required between workers and a central parameter server. SignSGD with majority voting (signSGD-MV) is an effective distributed learning algorithm that significantly reduces communication costs through one-bit quantization. However, it fails to converge when mini-batch sizes differ among workers, as happens when their computational capabilities are heterogeneous. To overcome this, we propose a novel signSGD optimizer with federated voting (signSGD-FV). The idea of federated voting is to exploit learnable weights to perform weighted majority voting. The server learns the weights assigned to the edge devices in an online fashion based on their computational capabilities. These weights are then used to decode the signs of the aggregated local gradients so as to minimize the sign decoding error probability. We provide a unified convergence rate analysis framework applicable to scenarios where the estimated weights are known to the parameter server either perfectly or imperfectly. We demonstrate that the proposed signSGD-FV algorithm has a theoretical convergence guarantee even when edge devices use heterogeneous mini-batch sizes. Experimental results show that signSGD-FV outperforms signSGD-MV, exhibiting a faster convergence rate, especially when mini-batch sizes are heterogeneous.
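
The difference between plain and weighted majority voting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each worker's reported sign is flipped independently with a known cross-over probability p, and it uses the standard log-likelihood-ratio weight ln((1 - p) / p) for such a binary symmetric channel. The function names, the tie-breaking rule (ties decode to +1), and the example numbers are all hypothetical.

```python
import math

def llr_weight(p, eps=1e-6):
    """LLR weight for a worker whose sign is wrong with probability p (assumed form)."""
    p = min(max(p, eps), 1.0 - eps)  # clip so the logarithm stays finite
    return math.log((1.0 - p) / p)

def majority_vote(signs):
    """Unweighted majority vote over one gradient coordinate (ties go to +1)."""
    return 1 if sum(signs) >= 0 else -1

def weighted_majority_vote(signs, crossover_probs):
    """Weighted majority vote: each worker's sign is scaled by its LLR weight."""
    score = sum(llr_weight(p) * s for s, p in zip(signs, crossover_probs))
    return 1 if score >= 0 else -1

# Hypothetical example: the true sign is +1. One reliable worker (large
# mini-batch) reports +1; two unreliable workers (small mini-batches) report -1.
signs = [+1, -1, -1]
crossover_probs = [0.05, 0.45, 0.45]

mv = majority_vote(signs)                             # the two unreliable workers win
wmv = weighted_majority_vote(signs, crossover_probs)  # the reliable worker dominates
```

In this toy case the unweighted vote decodes the wrong sign, while the weighted vote follows the reliable worker. In signSGD-FV the server learns the weights online rather than being given the cross-over probabilities, so the fixed `crossover_probs` list here stands in for those learned estimates.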

Paper Structure (26 sections, 7 theorems, 57 equations, 6 figures, 2 tables, 1 algorithm)

Introduction
Related Works
Contributions
Preliminaries
SignSGD-MV
An Upper Bound on SignSGD-MV
SignSGD-FV
Algorithm
Learning cross-over probabilities
Unified Convergence Rate Analysis
Assumptions
Convergence Rate Analysis
Decoding Error Probability Bounds for WMV Aggregation
Decoding Error Probability Bounds with Imperfect Knowledge of $p_{m,n}^t$
Simulation Results
...and 11 more sections

Key Result

Theorem 1

Let $A_n^t:\{-1,+1\}^M\rightarrow \{-1,+1\}$ be a binary sign aggregation function applied to the $n$th gradient component at iteration $t$. This function produces an estimate $\hat{U}_n^t$ of the true gradient sign $U_n^t$. Consider the maximum, over all coordinates and iterations, of the sign decoding error probability of $\hat{U}_n^t$. Then, w...

Figures (6)

Figure 1: An illustration of signSGD-FV.
Figure 2: The coding-theoretic interpretation of signSGD-FV.
Figure 3: Test accuracy vs. training rounds varying the batch mode with $M = 15$ and $T_\mathsf{in} = 100$.
Figure 4: Test accuracy vs. training rounds on signSGD-FV varying the number of workers where the batch mode is 3 and $T_\mathsf{in} = 100$.
Figure 5: Test accuracy comparison on CIFAR-10 dataset by varying the uncertainty of estimated LLR weights for the batch mode 3 and $M = 15$.
...and 1 more figure

Theorems & Definitions (19)

Theorem 1: Universal convergence rate
proof
Lemma 1: Large deviation bound
proof
Theorem 2: Decoding error bound of WMV aggregation
proof
Corollary 1
Lemma 2: Upper bound on the computing error probability
proof
Corollary 2: Decoding error bound with mini-batch sizes
...and 9 more
