Linear Causal Bandits: Unknown Graph and Soft Interventions

Zirui Yan, Ali Tajer

TL;DR

This paper presents a novel design for a computationally efficient causal bandit (CB) algorithm, addressing a challenge faced by existing CB algorithms that use soft interventions.

Abstract

Designing causal bandit (CB) algorithms depends on two central categories of assumptions: (i) the extent of information about the underlying causal graphs and (ii) the extent of information about interventional statistical models. There have been extensive recent advances in dispensing with assumptions in either category. These include assuming known graphs but unknown interventional distributions, and the converse setting of assuming unknown graphs but access to restrictive hard/$\operatorname{do}$ interventions, which remove the stochasticity and ancestral dependencies. Nevertheless, the problem in its general form, i.e., unknown graph and unknown stochastic intervention models, remains open. This paper addresses this problem and establishes that in a graph with $N$ nodes, maximum in-degree $d$, and maximum causal path length $L$, after $T$ interaction rounds the regret upper bound scales as $\tilde{\mathcal{O}}((cd)^{L-\frac{1}{2}}\sqrt{T} + d + RN)$, where $c>1$ is a constant and $R$ is a measure of intervention power. A universal minimax lower bound is also established, which scales as $\Omega(d^{L-\frac{3}{2}}\sqrt{T})$. Importantly, the graph size $N$ has a diminishing effect on the regret as $T$ grows. These bounds have matching behavior in $T$, exponential dependence on $L$, and polynomial dependence on $d$ (with a gap of $d$). On the algorithmic side, the paper presents a novel way of designing a computationally efficient CB algorithm, addressing a challenge that existing CB algorithms using soft interventions face.
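To make the stated gap between the two bounds concrete, the following minimal sketch evaluates only the leading terms of the upper and lower bounds quoted in the abstract and forms their ratio. It is an illustration, not part of the paper: the function names are hypothetical, the value `c = 2.0` is an arbitrary placeholder (the abstract only states $c > 1$), and logarithmic factors, hidden constants, and the additive $d + RN$ term are all ignored.

```python
import math


def upper_leading_term(d: int, L: int, T: int, c: float = 2.0) -> float:
    """Leading term (c*d)^(L - 1/2) * sqrt(T) of the upper bound.

    Assumption: c = 2.0 is a placeholder; the abstract only says c > 1.
    Log factors and the additive d + R*N term are dropped.
    """
    return (c * d) ** (L - 0.5) * math.sqrt(T)


def lower_leading_term(d: int, L: int, T: int) -> float:
    """Leading term d^(L - 3/2) * sqrt(T) of the minimax lower bound."""
    return d ** (L - 1.5) * math.sqrt(T)


def gap_ratio(d: int, L: int, T: int, c: float = 2.0) -> float:
    """Ratio of the two leading terms.

    Algebraically this equals c^(L - 1/2) * d: the sqrt(T) factors cancel,
    leaving a factor of d plus a c-dependent constant.
    """
    return upper_leading_term(d, L, T, c) / lower_leading_term(d, L, T)


if __name__ == "__main__":
    # Example parameters chosen purely for illustration.
    for T in (10**2, 10**4, 10**6):
        print(T, gap_ratio(d=3, L=4, T=T))
```

Under these assumptions the ratio does not change with $T$, which is consistent with the abstract's statement that the bounds match in $T$; the remaining factor is $d$ times a power of the unspecified constant $c$.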

Paper Structure

This paper contains 35 sections, 22 theorems, 210 equations, 8 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Under Assumptions assum:unknowngraph--assum:noisemodel, the GA-LCB-SL algorithm ensures that where $\kappa$ is defined as

Figures (8)

  • Figure 1: Example of hierarchical graph.
  • Figure 2: Cumulative regret with $L=2$.
  • Figure 3: Cumulative regret with $L=4$.
  • Figure 4: Cumulative regret with $L=6$.
  • Figure 5: Cumulative regret with different length $L$ under hierarchical graph with $d=2$.
  • ...and 3 more figures

Theorems & Definitions (23)

  • Theorem 1: Achievable Graph Skeleton Learning
  • Theorem 2: Achievable Regret
  • Corollary 1: Achievable Regret -- Known Skeleton
  • Theorem 3: Regret Lower Bound
  • Corollary 2: Achievable Regret -- Graph-Dependent
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Definition 1
  • Lemma 4
  • ...and 13 more