Pure Exploration in Asynchronous Federated Bandits

Zichen Wang, Chuanhao Li, Chenyu Song, Lianghui Wang, Quanquan Gu, Huazheng Wang

TL;DR

This work proposes the first federated asynchronous multi-armed bandit and linear bandit algorithms for pure exploration with fixed confidence, and shows the proposed algorithms achieve near-optimal sample complexities and efficient communication costs in a fully asynchronous environment.

Abstract

We study the federated pure exploration problem of multi-armed bandits and linear bandits, where $M$ agents cooperatively identify the best arm by communicating with a central server. To improve robustness against agent latency and unavailability, both of which are common in practice, we propose the first federated asynchronous multi-armed bandit and linear bandit algorithms for pure exploration with fixed confidence. Our theoretical analysis shows that the proposed algorithms achieve near-optimal sample complexities and efficient communication costs in a fully asynchronous environment. Moreover, experimental results on synthetic and real-world data demonstrate the effectiveness and communication efficiency of the proposed algorithms.
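To make the setting concrete, the following is a minimal toy simulation of fixed-confidence best-arm identification with asynchronously active agents and a central server. It is not the paper's FAMABPE algorithm: the LUCB-style sampling rule, the Hoeffding confidence radius, the `trigger` upload rule, and all function and variable names are assumptions chosen for illustration only.

```python
import math
import random


def radius(n, num_arms, delta):
    """Anytime Hoeffding radius for rewards in [0, 1]; infinite before any pull."""
    if n == 0:
        return math.inf
    return math.sqrt(math.log(4 * num_arms * n * n / delta) / (2 * n))


def run_toy_async_bai(means, num_agents=4, delta=0.05, trigger=0.5,
                      max_pulls=200_000, seed=0):
    """Toy asynchronous federated best-arm identification (needs >= 2 arms).

    Returns (estimated best arm, total pulls, number of uploads).
    """
    rng = random.Random(seed)
    K = len(means)
    g_cnt, g_sum = [0] * K, [0.0] * K  # server-side statistics
    # Each agent keeps its last download plus data it has not uploaded yet.
    view_cnt = [[0] * K for _ in range(num_agents)]
    view_sum = [[0.0] * K for _ in range(num_agents)]
    loc_cnt = [[0] * K for _ in range(num_agents)]
    loc_sum = [[0.0] * K for _ in range(num_agents)]
    uploads = 0

    for pulls in range(1, max_pulls + 1):
        i = rng.randrange(num_agents)  # asynchrony: one arbitrary agent acts
        n = [view_cnt[i][k] + loc_cnt[i][k] for k in range(K)]
        if 0 in n:
            arm = n.index(0)  # the agent has no data on this arm yet
        else:
            mu = [(view_sum[i][k] + loc_sum[i][k]) / n[k] for k in range(K)]
            best = max(range(K), key=lambda k: mu[k])
            rival = max((k for k in range(K) if k != best),
                        key=lambda k: mu[k] + radius(n[k], K, delta))
            arm = best if n[best] <= n[rival] else rival
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli reward
        loc_cnt[i][arm] += 1
        loc_sum[i][arm] += reward

        # Hypothetical event trigger: upload only once the unsent data for
        # this arm is large relative to what the agent last downloaded.
        if loc_cnt[i][arm] > trigger * max(1, view_cnt[i][arm]):
            uploads += 1
            for k in range(K):
                g_cnt[k] += loc_cnt[i][k]
                g_sum[k] += loc_sum[i][k]
                loc_cnt[i][k], loc_sum[i][k] = 0, 0.0
            view_cnt[i], view_sum[i] = g_cnt[:], g_sum[:]  # download
            if all(g_cnt):
                # Server-side stopping test on the pooled statistics.
                mu = [g_sum[k] / g_cnt[k] for k in range(K)]
                best = max(range(K), key=lambda k: mu[k])
                lcb = mu[best] - radius(g_cnt[best], K, delta)
                if all(mu[k] + radius(g_cnt[k], K, delta) <= lcb
                       for k in range(K) if k != best):
                    return best, pulls, uploads

    # Pull budget exhausted: fall back to the empirical best arm.
    best = max(range(K), key=lambda k: g_sum[k] / max(1, g_cnt[k]))
    return best, max_pulls, uploads
```

Calling `run_toy_async_bai([0.9, 0.5, 0.4, 0.3])` returns the estimated best arm together with the number of pulls and uploads. Because agents upload only when the trigger fires, the number of uploads can be much smaller than the number of pulls, which is the kind of trade-off between sample complexity and communication cost that the abstract refers to.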

Paper Structure (36 sections, 20 theorems, 148 equations, 2 figures)


Key Result

Theorem 1

With $\gamma = 1/(2MK)$ and exploration bonuses [...], the estimated best arm $\hat{k}^*$ of FAMABPE satisfies condition (1), and with probability at least $1-\delta$ the sample complexity is bounded by [...], where [...] is the problem complexity of the MAB [Gabillon2012BestAI] and [...]. The communication cost satisfies $\mathcal{C}(\tau) = \tilde{O}(KM)$.

Figures (2)

  • Figure 1: Synthetic data: Experimental results for federated MAB and federated linear bandits.
  • Figure 2: Experimental results on MovieLens for federated linear bandits.

Theorems & Definitions (35)

  • Theorem 1
  • Remark 1
  • Theorem 2
  • Remark 2
  • Remark 3: Global and local data in the federated MAB
  • Lemma 1: Communication cost
  • Proof of Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • ...and 25 more