Communication Bounds for the Distributed Experts Problem

Zhihao Jia, Qi Pang, Trung Tran, David Woodruff, Zhihao Zhang, Wenting Zheng

TL;DR

This work proposes the first communication-efficient protocols that achieve near-optimal regret in the distributed setting, where an expert's cost must be aggregated across multiple servers, even against a strong adversary who can choose the inputs adaptively.

Abstract

In this work, we study the experts problem in the distributed setting where an expert's cost needs to be aggregated across multiple servers. Our study considers various communication models such as the message-passing model and the broadcast model, along with multiple aggregation functions, such as summing and taking the $\ell_p$ norm of an expert's cost across servers. We propose the first communication-efficient protocols that achieve near-optimal regret in these settings, even against a strong adversary who can choose the inputs adaptively. Additionally, we give a conditional lower bound showing that the communication of our protocols is nearly optimal. Finally, we implement our protocols and demonstrate empirical savings on the HPO-B benchmarks.
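
To make the aggregation step concrete, here is a minimal sketch of the two aggregation functions the abstract names: summation and the $\ell_p$ norm of an expert's cost across servers. The function names, the list-of-lists layout, and the sample numbers are illustrative assumptions, not taken from the paper. This naive version assumes a coordinator that sees every server's local costs; it is not one of the paper's communication-efficient protocols.

```python
from typing import List

# Assumption: costs_per_server[j][i] is the local cost that server j
# holds for expert i on a single day. This layout is hypothetical.
Costs = List[List[float]]


def aggregate_sum(costs_per_server: Costs) -> List[float]:
    """Aggregate each expert's cost by summing over all servers."""
    n_experts = len(costs_per_server[0])
    return [sum(server[i] for server in costs_per_server)
            for i in range(n_experts)]


def aggregate_lp(costs_per_server: Costs, p: float) -> List[float]:
    """Aggregate each expert's cost as the l_p norm over all servers."""
    n_experts = len(costs_per_server[0])
    return [sum(abs(server[i]) ** p for server in costs_per_server) ** (1.0 / p)
            for i in range(n_experts)]


# Two servers, three experts (made-up numbers for illustration only).
local_costs = [
    [0.1, 0.4, 0.0],
    [0.3, 0.0, 0.2],
]
sum_costs = aggregate_sum(local_costs)
l2_costs = aggregate_lp(local_costs, p=2)
```

Computing these aggregates exactly requires every server to report a value for every expert on every day, which is the baseline communication cost that a communication-efficient protocol would need to beat.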

Paper Structure (31 sections, 13 theorems, 32 equations, 12 figures, 10 tables, 6 algorithms)

This paper contains 31 sections, 13 theorems, 32 equations, 12 figures, 10 tables, 6 algorithms.

Key Result

Theorem 4.1

For a sampling budget $b_e \in [n]$, with probability $1-\frac{1}{\textrm{poly}(T)}$, the communication cost for DEWA-M is $\tilde{O}(T(b_e+s))$.

Figures (12)

  • Figure 1: Regrets on HPO-B w/ sum aggregation.
  • Figure 2: Regrets on HPO-B w/ max aggregation.
  • Figure 3: Regrets on Gaussian distribution with summation aggregation, non-sparse scenario.
  • Figure 4: Regrets on Gaussian distributions with summation aggregation, sparse scenario.
  • Figure 5: Regret on Gaussian distribution with maximum aggregation, non-sparse scenario.
  • ...and 7 more figures

Theorems & Definitions (24)

  • Theorem 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Theorem 5.1
  • Theorem 5.2
  • Theorem 5.3
  • Theorem 5.4
  • Theorem 5.5
  • Definition A.1
  • Lemma A.2
  • ...and 14 more