Best-Arm Identification with Noisy Actuation

Merve Karakas, Osama Hanna, Lin F. Yang, Christina Fragouli

Abstract

In this paper, we consider a multi-armed bandit (MAB) instance and study how to identify the best arm when arm commands are conveyed from a central learner to a distributed agent over a discrete memoryless channel (DMC). Depending on the agent capabilities, we provide communication schemes along with their analysis, which interestingly relate to the zero-error capacity of the underlying DMC.

Paper Structure

This paper contains 34 sections, 14 theorems, 49 equations, 1 figure, 1 algorithm.

Key Result

Lemma 1

If there exist $\mu\neq \mu'$ such that $W\mu=W\mu'$ but $\arg\max_i \mu_i\neq \arg\max_i \mu'_i$, then no algorithm can be $\delta$-correct for all instances (for any $\delta<1$).
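
The hypothesis of Lemma 1 can be checked numerically on a small instance. The sketch below is illustrative only: it assumes $W$ is a row-stochastic matrix whose entry $W_{ij}$ is the probability that arm $j$ is pulled when arm $i$ is commanded, so that $(W\mu)_i$ is the mean reward observed for command $i$. The matrix `W`, the vectors `mu` and `mu_prime`, and the helper names are hypothetical and do not come from the paper.

```python
from fractions import Fraction as F

def mat_vec(W, mu):
    """Return the product W mu for a list-of-rows matrix, in exact arithmetic."""
    return [sum(w * m for w, m in zip(row, mu)) for row in W]

def best_arm(mu):
    """Index of the largest mean (first index on ties)."""
    return max(range(len(mu)), key=lambda i: mu[i])

# Hypothetical channel (an assumption for illustration): commands 0 and 1
# are confused uniformly with each other, command 2 is delivered cleanly.
W = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 2), F(1, 2), F(0)],
    [F(0),    F(0),    F(1)],
]

# Two hypothetical instances whose best arms differ.
mu = [F(9, 10), F(1, 10), F(1, 2)]
mu_prime = [F(1, 10), F(9, 10), F(1, 2)]

observed = mat_vec(W, mu)
observed_prime = mat_vec(W, mu_prime)

print(observed == observed_prime)        # True: W mu equals W mu'
print(best_arm(mu), best_arm(mu_prime))  # 0 1: the best arms differ
```

Both instances produce the same observed mean for every command, yet their best arms differ, which is exactly the situation in which the lemma rules out $\delta$-correctness.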

Figures (1)

  • Figure 1: Example of a one-sided typewriter channel over alphabet $\mathcal{X}=\{0,\dots,4\}$ (left) and its confusability graph $C_5$ (right)
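
The link between the channel in Figure 1 and its confusability graph can be reproduced with a short sketch. It assumes the usual one-sided typewriter convention, in which input $x$ is received as either $x$ or $x+1 \pmod 5$, and it joins two inputs by an edge whenever they can produce a common output. The function names are hypothetical, and the direction of the shift is an assumption, since the figure itself is not reproduced here.

```python
from itertools import combinations

def typewriter_outputs(x, q=5):
    """Outputs reachable from input x (assumed convention: x or x+1 mod q)."""
    return {x, (x + 1) % q}

def confusability_edges(q=5):
    """Pairs of distinct inputs that share at least one possible output."""
    return {
        (a, b)
        for a, b in combinations(range(q), 2)
        if typewriter_outputs(a, q) & typewriter_outputs(b, q)
    }

def independence_number(q, edges):
    """Size of the largest set of pairwise non-confusable inputs (brute force)."""
    for size in range(q, 0, -1):
        for subset in combinations(range(q), size):
            if not any(pair in edges for pair in combinations(subset, 2)):
                return size
    return 0

edges = confusability_edges()
alpha = independence_number(5, edges)
print(sorted(edges))  # [(0, 1), (0, 4), (1, 2), (2, 3), (3, 4)], the 5-cycle
print(alpha)          # 2
```

Under this convention the edge set is the cycle $C_5$, and at most two inputs can be sent in a single channel use without any risk of confusion.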

Theorems & Definitions (18)

  • Remark 1
  • Lemma 1: Non-identifiability under non-injective mixing
  • Proposition 1
  • Remark 2
  • Lemma 2
  • Proposition 2
  • Corollary 1
  • Corollary 2
  • Remark 3: Non–zero-error preshared codes
  • Lemma 3
  • ...and 8 more