Adversarial Online Learning with Temporal Feedback Graphs

Khashayar Gatmiry, Jon Schneider

TL;DR

This work extends online learning with expert advice to temporal feedback graphs, where the learner's decision at round $t$ can depend only on the losses from a subset $S_t$ of past rounds. It introduces a novel algorithm that partitions losses across maximal orders (subgraphs) and is tuned via an upper-bound convex program; the dual formulation of this program yields sparsity, so the algorithm can be implemented efficiently using a basis of at most $T$ orders. It also develops two lower-bound schemes, $\mathsf{LB}(\mathcal{S})$ and $\mathsf{ILB}(\mathcal{S})$, and shows that the upper and lower bounds nearly match in many settings. For transitive graphs, the authors provide an efficiently implementable algorithm with regret bound $O\left(\mathsf{UB}(\mathcal{S})\sqrt{\log K}\right)$ and show a matching lower bound up to a constant factor, thereby establishing the optimal regret rate for this important class. The results unify and extend batched and delayed feedback models under a graph-theoretic view, offering practically efficient methods for structured partial information in online learning.

Abstract

We study a variant of prediction with expert advice where the learner's action at round $t$ is only allowed to depend on losses on a specific subset of the rounds (where the structure of which rounds' losses are visible at time $t$ is provided by a directed "feedback graph" known to the learner). We present a novel learning algorithm for this setting based on a strategy of partitioning the losses across sub-cliques of this graph. We complement this with a lower bound that is tight in many practical settings, and which we conjecture to be within a constant factor of optimal. For the important class of transitive feedback graphs, we prove that this algorithm is efficiently implementable and obtains the optimal regret bound (up to a universal constant).
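
As an illustration of the setting, the sketch below encodes a feedback graph as a map from each round $t$ to the set of earlier rounds whose losses are visible at $t$, and builds the delayed and batched feedback models as special cases. All names here (`FeedbackGraph`, `delayed_feedback`, `batched_feedback`, `is_transitive`) are hypothetical, and the transitivity check assumes the usual graph-theoretic meaning: if round $s$ is visible at $t$, then everything visible at $s$ is also visible at $t$.

```python
from typing import Dict, Set

# Assumed encoding: graph[t] is the set of rounds whose losses
# the learner may use when acting at round t (rounds are 1-indexed).
FeedbackGraph = Dict[int, Set[int]]


def delayed_feedback(T: int, delay: int) -> FeedbackGraph:
    """Round t sees the losses of rounds 1, ..., t - delay."""
    return {t: set(range(1, t - delay + 1)) for t in range(1, T + 1)}


def batched_feedback(T: int, batch_size: int) -> FeedbackGraph:
    """Round t sees the losses of all rounds in strictly earlier batches."""
    graph: FeedbackGraph = {}
    for t in range(1, T + 1):
        batch_start = ((t - 1) // batch_size) * batch_size + 1
        graph[t] = set(range(1, batch_start))
    return graph


def is_transitive(graph: FeedbackGraph) -> bool:
    """Check that s in graph[t] implies graph[s] is a subset of graph[t]."""
    return all(graph[s] <= graph[t] for t in graph for s in graph[t])
```

Under this encoding both the delayed and batched graphs pass the transitivity check, whereas a graph in which round $t$ sees only round $t-1$ does not.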

Paper Structure (29 sections, 20 theorems, 50 equations, 1 algorithm)

Key Result

Lemma 1

Let ${\boldsymbol \ell} = (\ell_1, \ell_2, \dots, \ell_T)$ be a sequence of losses such that each $\ell_t \in [0, \lambda_t]^K$. If we let $\mathcal{A}$ be the Hedge algorithm initialized with learning rate $\eta = O\left(\sqrt{(\log K)/\sum_{t=1}^T \lambda_t^2}\right)$, then
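
To make the setting of Lemma 1 concrete, the following is a minimal sketch of Hedge run on losses with round-dependent ranges $\lambda_t$. The function names `hedge_learning_rate` and `run_hedge` are hypothetical, and the constant in the learning rate is set to 1 as an assumption, since the lemma specifies $\eta$ only up to a constant. The textbook analysis of Hedge with this tuning gives regret on the order of $\sqrt{(\log K)\sum_{t=1}^T \lambda_t^2}$, which the sketch lets one check numerically.

```python
import math
from typing import List, Sequence


def hedge_learning_rate(num_experts: int, loss_ranges: Sequence[float]) -> float:
    """eta = sqrt(log K / sum_t lambda_t^2), with the constant assumed to be 1."""
    return math.sqrt(math.log(num_experts) / sum(lam ** 2 for lam in loss_ranges))


def run_hedge(losses: Sequence[Sequence[float]], eta: float) -> float:
    """Run Hedge on a loss sequence and return its regret to the best expert.

    losses[t][i] is the loss of expert i in round t; the learner's loss in a
    round is the expected loss under its current distribution over experts.
    """
    num_experts = len(losses[0])
    cumulative: List[float] = [0.0] * num_experts
    learner_loss = 0.0
    for loss in losses:
        # Weights are proportional to exp(-eta * cumulative loss); subtracting
        # the minimum keeps the exponentials numerically stable.
        shift = min(cumulative)
        weights = [math.exp(-eta * (c - shift)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]
        learner_loss += sum(p * l for p, l in zip(probs, loss))
        cumulative = [c + l for c, l in zip(cumulative, loss)]
    return learner_loss - min(cumulative)
```

For example, with two experts and ranges `[1.0, 2.0, 1.0, 2.0]`, one can build a loss sequence in which the second expert always incurs the maximal loss $\lambda_t$, call `run_hedge` with the rate from `hedge_learning_rate`, and compare the returned regret against $\sqrt{(\log K)\sum_t \lambda_t^2}$.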

Theorems & Definitions (39)

  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • Lemma 3
  • Lemma 4
  • proof
  • ...and 29 more