Q-GADMM: Quantized Group ADMM for Communication Efficient Decentralized Machine Learning

Anis Elgabli, Jihong Park, Amrit S. Bedi, Chaouki Ben Issaid, Mehdi Bennis, Vaneet Aggarwal

TL;DR

A novel stochastic quantization method adaptively adjusts model quantization levels and their probabilities, and Q-GADMM is proven to converge for convex objective functions.

Abstract

In this article, we propose a communication-efficient decentralized machine learning (ML) algorithm, coined quantized group ADMM (Q-GADMM). To reduce the number of communication links, every worker in Q-GADMM communicates only with two neighbors, while updating its model via the group alternating direction method of multipliers (GADMM). Moreover, each worker transmits the quantized difference between its current model and its previously quantized model, thereby decreasing the communication payload size. However, due to the lack of a centralized entity in decentralized ML, the spatial sparsity and payload compression may incur error propagation, hindering model training convergence. To overcome this, we develop a novel stochastic quantization method that adaptively adjusts model quantization levels and their probabilities, and we prove the convergence of Q-GADMM for convex objective functions. Furthermore, to demonstrate the feasibility of Q-GADMM for non-convex and stochastic problems, we propose quantized stochastic GADMM (Q-SGADMM), which incorporates deep neural network (DNN) architectures and stochastic sampling. Simulation results corroborate that Q-GADMM significantly outperforms GADMM in terms of communication efficiency while achieving the same accuracy and convergence speed for a linear regression task. Similarly, for an image classification task using a DNN, Q-SGADMM achieves a significantly lower total communication cost with identical accuracy and convergence speed compared to its counterpart without quantization, i.e., stochastic GADMM (SGADMM).
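To make the payload compression step concrete, the following is a minimal sketch of stochastically quantizing the difference between a worker's current model and its previously quantized model, using the infinity norm of that difference as the quantization radius. The function names, the fixed `bits` parameter, the uniform grid, and the rounding probabilities are assumptions chosen for illustration (a standard unbiased stochastic-rounding scheme); the paper adapts the quantization levels over iterations, which this sketch does not attempt to reproduce.

```python
import numpy as np


def stochastic_quantize(theta, theta_hat_prev, bits, rng):
    """Quantize theta relative to the previously quantized model.

    Hypothetical helper: returns integer levels in [0, 2**bits - 1]
    and the radius (infinity norm of the model difference).
    """
    diff = theta - theta_hat_prev
    radius = float(np.max(np.abs(diff)))
    levels = 2 ** bits - 1
    if radius == 0.0:
        # Nothing changed, so any level decodes to the previous model.
        return np.zeros(theta.shape, dtype=np.int64), radius
    step = 2.0 * radius / levels
    # Map each coordinate from [-radius, radius] onto [0, levels].
    scaled = (diff + radius) / step
    lower = np.floor(scaled)
    # Round up with probability equal to the fractional part,
    # which makes the reconstruction unbiased in expectation.
    prob_up = scaled - lower
    q = lower + (rng.random(theta.shape) < prob_up)
    return np.clip(q, 0, levels).astype(np.int64), radius


def dequantize(q, radius, theta_hat_prev, bits):
    """Receiver side: rebuild the quantized model from levels and radius."""
    if radius == 0.0:
        return theta_hat_prev.copy()
    step = 2.0 * radius / (2 ** bits - 1)
    return theta_hat_prev + step * q - radius


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta_hat_prev = np.zeros(5)       # previously quantized model
    theta = rng.normal(size=5)         # current model
    q, radius = stochastic_quantize(theta, theta_hat_prev, bits=3, rng=rng)
    theta_hat = dequantize(q, radius, theta_hat_prev, bits=3)
    print("levels:", q)
    print("max abs error:", np.max(np.abs(theta_hat - theta)))
```

In this sketch a worker would send only the integer levels `q` and the scalar `radius`, and each neighbor would reconstruct the quantized model from its own stored copy of the previous quantized model. The per-coordinate error is at most one grid step, and the grid shrinks as successive models get closer together.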

Paper Structure

This paper contains 22 sections, 3 theorems, 103 equations, 9 figures, 1 algorithm.

Key Result

Lemma 1

At iteration $k+1$ of Q-GADMM, the optimality gap is bounded below by $\textsf{LB}_1$ and above by $\textsf{UB}_1$, the lower and upper bounds given by Eq: Lemma1_LB0 and Eq: Lemma1_UB0, respectively.

Figures (9)

  • Figure 1: An illustration of (a) quantized GADMM (Q-GADMM) operations, in which every worker communicates with two neighbors at every iteration $k$, and (b) quantizing the difference between the current model and the previously quantized model ($\boldsymbol{\theta}_n^k$ and $\hat{\boldsymbol{\theta}}_n^{k-1}$) with radius $R_n^k$ (the infinity norm of the model difference).
  • Figure 2: Linear regression results showing: (a) loss $(|F-F^*|)$ w.r.t. number of communication rounds; (b) loss w.r.t. number of transmitted bits; and (c) energy efficiency (loss w.r.t. consumed energy).
  • Figure 3: Linear regression results showing the CDF of the consumed energy to achieve a loss value of $10^{-4}$ when the system bandwidth is (a) 10 MHz, (b) 2 MHz, and (c) 1 MHz.
  • Figure 4: Image classification results showing: (a) test accuracy w.r.t. number of communication rounds; (b) test accuracy w.r.t. number of transmitted bits; and (c) energy efficiency (test accuracy w.r.t. consumed energy) when the system bandwidth is 40 MHz.
  • Figure 5: Image classification results showing the CDF of the consumed energy to achieve $90\%$ accuracy when the system bandwidth is (a) 400 MHz, (b) 100 MHz, and (c) 40 MHz.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Lemma 1
  • Theorem 1
  • Theorem 2