Decentralised convex optimisation with probability-proportional-to-size quantization

Dmitrii Pasechniuk, Pavel Dvurechensky, César A. Uribe, Alexander Gasnikov

TL;DR

A novel quantization method that transforms a vector into a sample of component indices drawn from a categorical distribution with probabilities proportional to the values of those components, applied to affine-constrained convex optimisation.

Abstract

Communication is one of the bottlenecks of distributed optimisation and learning. To overcome this bottleneck, we propose a novel quantization method that transforms a vector into a sample of component indices drawn from a categorical distribution with probabilities proportional to the values of those components. We then propose primal and primal-dual accelerated stochastic gradient methods that use this quantization, and derive their convergence rates in terms of probabilities of large deviations. We focus on affine-constrained convex optimisation and its application to decentralised distributed optimisation problems. To illustrate how our algorithm works, we apply it to the decentralised computation of semi-discrete entropy-regularized Wasserstein barycenters.
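The quantization idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names `pps_quantize` and `pps_dequantize`, the use of absolute values and transmitted signs to handle negative components, and the reconstruction rule that rescales index counts by the L1 norm are all assumptions made here for illustration.

```python
import random


def pps_quantize(x, num_samples, rng):
    """Draw `num_samples` indices with probability proportional to |x_i|.

    Assumption: negative components are handled by sampling on |x_i|
    and sending the sign alongside each index.
    Returns (message, l1_norm); message is a list of (index, sign) pairs.
    """
    weights = [abs(v) for v in x]
    l1_norm = sum(weights)
    if l1_norm == 0:
        # Zero vector: nothing to sample.
        return [], 0.0
    # random.choices samples with replacement from a categorical
    # distribution defined by the (unnormalised) weights.
    indices = rng.choices(range(len(x)), weights=weights, k=num_samples)
    message = [(i, 1 if x[i] > 0 else -1) for i in indices]
    return message, l1_norm


def pps_dequantize(message, l1_norm, dim):
    """Rebuild an estimate of x from sampled indices (assumed rule).

    Each sampled index contributes sign * l1_norm / num_samples, which
    makes the estimate unbiased: E[estimate_i] = x_i.
    """
    estimate = [0.0] * dim
    if not message:
        return estimate
    scale = l1_norm / len(message)
    for i, sign in message:
        estimate[i] += sign * scale
    return estimate


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    x = [0.5, -1.5, 0.0, 2.0]
    message, l1_norm = pps_quantize(x, num_samples=1000, rng=rng)
    print(pps_dequantize(message, l1_norm, dim=len(x)))
```

In this sketch a sender transmits the sampled indices, their signs, and one scalar norm instead of the full vector, and the number of samples controls the trade-off between message size and the variance of the reconstructed estimate.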

Paper Structure

This paper contains 19 sections, 15 theorems, 38 equations, 2 figures, 1 table, 3 algorithms.

Key Result

Lemma 1

Let $\sigma_{r,M}^2 = 50 \left(\frac{2 (1 - 1/n) B^2}{\mathrm{e} M} + \frac{\sigma^2 }{r}\right)$. Then, it holds that

Figures (2)

  • Figure 1: Convergence curves of Algorithm \ref{alg:devdec} and AGM for different network topologies, numbers of nodes, and sampling schemes
  • Figure 2: Visualised approximate barycenters obtained by Algorithm \ref{alg:devdec} and comparison with AGM

Theorems & Definitions (19)

  • Lemma 1
  • proof
  • Lemma 2 (Theorem 3.1.4 in \cite{dvinskikh2021decentralized})
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • proof
  • Theorem 4
  • proof
  • Corollary 1
  • ...and 9 more