Better Bounds for the Distributed Experts Problem

David P. Woodruff; Samson Zhou

TL;DR

This paper gives a protocol that achieves regret roughly $R\gtrsim\frac{1}{\sqrt{T}\cdot\text{poly}\log(nsT)}$, using $\mathcal{O}\left(\frac{n}{R^2}+\frac{s}{R^2}\right)\cdot\max(s^{1-2/p},1)\cdot\text{poly}\log(nsT)$ bits of communication, which improves on previous work.

Abstract

In this paper, we study the distributed experts problem, where $n$ experts are distributed across $s$ servers for $T$ timesteps. The loss of each expert at each time $t$ is the $\ell_p$ norm of the vector that consists of the losses of the expert at each of the $s$ servers at time $t$. The goal is to minimize the regret $R$, i.e., the loss of the distributed protocol compared to the loss of the best expert, amortized over all $T$ timesteps, while using as little communication as possible. We give a protocol that achieves regret roughly $R\gtrsim\frac{1}{\sqrt{T}\cdot\text{poly}\log(nsT)}$, using $\mathcal{O}\left(\frac{n}{R^2}+\frac{s}{R^2}\right)\cdot\max(s^{1-2/p},1)\cdot\text{poly}\log(nsT)$ bits of communication, which improves on previous work.
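To make the loss and regret definitions concrete, the following is a minimal sketch that computes the $\ell_p$-aggregated loss of each expert and the amortized regret of a given sequence of choices. It is a centralized illustration only and does not model communication or the paper's protocol; the function names `lp_norm` and `average_regret`, and the nested-list layout `losses[t][i][j]`, are assumptions made for this example.

```python
from typing import Sequence


def lp_norm(values: Sequence[float], p: float) -> float:
    """l_p norm of one expert's per-server losses at a single timestep."""
    if p == float("inf"):
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def average_regret(
    losses: Sequence[Sequence[Sequence[float]]],
    choices: Sequence[int],
    p: float,
) -> float:
    """Amortized regret of `choices` against the best fixed expert.

    losses[t][i][j] is the loss of expert i on server j at time t
    (hypothetical layout chosen for this sketch).
    """
    T = len(losses)
    n = len(losses[0])
    # Aggregate each expert's s per-server losses into one l_p loss per timestep.
    agg = [[lp_norm(losses[t][i], p) for i in range(n)] for t in range(T)]
    algorithm_loss = sum(agg[t][choices[t]] for t in range(T))
    best_expert_loss = min(sum(agg[t][i] for t in range(T)) for i in range(n))
    return (algorithm_loss - best_expert_loss) / T


# T = 2 timesteps, n = 2 experts, s = 2 servers.
example_losses = [
    [[3.0, 4.0], [1.0, 1.0]],
    [[0.0, 0.0], [6.0, 8.0]],
]
regret_follow_expert_0 = average_regret(example_losses, [0, 0], p=2)
regret_follow_expert_1 = average_regret(example_losses, [1, 1], p=2)
```

In this toy instance expert 0 is the best fixed expert under $p=2$, so always following it gives zero regret, while always following expert 1 gives positive regret.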

Paper Structure (19 sections, 18 theorems, 39 equations, 2 figures, 4 algorithms)


Introduction
Distributed Online Learning with Experts in the Coordinator Model
Online learning with experts.
Communication complexity.
Our Contributions
Algorithmic and technical novelties.
Preliminaries
Related Works
Classic Learning with Experts
Randomized weighted majority.
Other randomized variants.
Learning with Experts in Big Data Settings
Learning with experts on data streams.
Multi-armed bandits.
Learning in streams.
...and 4 more sections

Key Result

Theorem 1.1

Let $b>a>0$ be fixed constants and suppose $\ell_i(j,t)\in[a,b]$ for all $t\in[T]$, $i\in[n]$, and $j\in[s]$. There exists an algorithm that achieves expected regret at most $\mathcal{O}\left(s^{1/p}\sqrt{\frac{\log n}{T}}\right)$ and, with high probability, uses total communication at most …
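As a rough illustration of how these bounds scale, the sketch below evaluates the regret expression $s^{1/p}\sqrt{\log n / T}$ from the theorem and the factor $\max(s^{1-2/p},1)$ from the communication bound stated in the abstract. The constants hidden by $\mathcal{O}(\cdot)$ and the polylogarithmic factors are omitted, so the outputs indicate scaling only, not actual regret or bit counts; the function names are hypothetical.

```python
import math


def regret_scale(n: int, s: int, T: int, p: float) -> float:
    """s^{1/p} * sqrt(log n / T), with the O(.) constant omitted (assumption)."""
    return s ** (1.0 / p) * math.sqrt(math.log(n) / T)


def communication_factor(s: int, p: float) -> float:
    """The max(s^{1-2/p}, 1) factor from the abstract's communication bound."""
    return max(s ** (1.0 - 2.0 / p), 1.0)


# Quadrupling T halves the regret scale, reflecting the 1/sqrt(T) dependence.
scale_short = regret_scale(n=10, s=4, T=100, p=2)
scale_long = regret_scale(n=10, s=4, T=400, p=2)

# For p <= 2 the factor is 1; for p > 2 it grows polynomially in s.
factor_p1 = communication_factor(s=16, p=1)
factor_p4 = communication_factor(s=16, p=4)
```

The factor equals 1 whenever $p\le 2$, since the exponent $1-2/p$ is then non-positive, and it only starts to grow with $s$ for $p>2$.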

Figures (2)

Figure 1: Our work is the first to study $\ell_p$ loss in the coordinator model; for the special case of $p=1$, we obtain better regret-communication tradeoffs for regret $R$.

Theorems & Definitions (32)

Theorem 1.1
Theorem 1.2
Theorem 1.3
Theorem 1.4: Chernoff Bounds
Definition 1.5: Exponential random variable
Lemma 1.5
proof
Theorem 1.6
Lemma 3.1
proof
...and 22 more
