Contextual Causal Bayesian Optimisation

Vahan Arsenyan, Antoine Grosnit, Haitham Bou-Ammar, Arnak Dalalyan

TL;DR

CoCa-BO introduces a unified Contextual-Causal Bayesian Optimisation framework that jointly searches over mixed policy scopes compatible with a given causal graph and contextual information. It combines POMPS selection via MAB-UCB with HEBO-based Gaussian-process BO inside the chosen scope, and provides worst-case and instance-dependent regret bounds leveraging maximum information gain. The approach demonstrates sublinear regret and improved sample efficiency over CaBO and CoBO across diverse, high-dimensional environments, while remaining robust to moderate noise and scalable via parallelizable policy-scope enumeration. The work advances practical policy learning in settings with causal structure and observable context, enabling more efficient intervention design in complex systems. A public, implementation-ready pipeline supports reproducibility and benchmarking across tasks with varying context and intervention richness.

Abstract

We introduce a unified framework for contextual and causal Bayesian optimisation, which aims to design intervention policies maximising the expectation of a target variable. Our approach leverages both observed contextual information and known causal graph structures to guide the search. Within this framework, we propose a novel algorithm that jointly optimises over policies and the sets of variables on which these policies are defined. This extends and unifies two previously distinct approaches, Causal Bayesian Optimisation and Contextual Bayesian Optimisation, while also addressing scenarios in which they yield suboptimal results. We derive worst-case and instance-dependent high-probability regret bounds for our algorithm. We report experimental results across diverse environments, corroborating that our approach achieves sublinear regret and reduces sample complexity in high-dimensional settings.

Paper Structure (36 sections, 9 theorems, 108 equations, 7 figures, 3 tables, 2 algorithms)

Key Result

Theorem 1

Let $\delta\in(0,1/2)$, let $\boldsymbol{\Delta} = (\Delta_1,\ldots,\Delta_m)$ be the vector of sub-optimality gaps defined by $\Delta_i = \mu^* - \mu_i$, and let $I^*=\{i:\Delta_i=0\}$. Let Assumptions ass:1 and ass:2 be fulfilled. There exist constants $\mathsf A_1$, $\mathsf A_2$, $\mathsf A_3$ depending for every $i\in[m]$ and every $n\in\mathbb N^*$, then regret eq:reg_def of UCB-BO alg:coca_bo appli…
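
The gap vector $\boldsymbol{\Delta}$ and the index set $I^*$ from the theorem statement can be illustrated with a small numeric sketch. It assumes that $\mu^*$ is the largest of the values $\mu_1,\ldots,\mu_m$, which the excerpt above does not state explicitly; the function name `suboptimality_gaps` and the example values are hypothetical, and indices are 0-based here whereas the theorem uses $i\in[m]$.

```python
def suboptimality_gaps(mu):
    """Return (gaps, optimal_set) for a list of per-scope values mu_i.

    Assumption (not stated in the excerpt): mu_star = max_i mu_i.
    gaps[i] = mu_star - mu[i]; optimal_set = {i : gaps[i] == 0}.
    """
    mu_star = max(mu)
    gaps = [mu_star - m for m in mu]
    optimal_set = {i for i, g in enumerate(gaps) if g == 0}
    return gaps, optimal_set


# Hypothetical values for three candidate scopes (0-based indices).
gaps, optimal_set = suboptimality_gaps([0.5, 1.0, 1.0])
print(gaps)         # [0.5, 0.0, 0.0]
print(optimal_set)  # {1, 2}
```

The example has two optimal scopes, which shows why $I^*$ is defined as a set rather than a single index.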

Figures (7)

  • Figure 1: An SCM and its DAG $\mathcal{G}$ for which CoBO and CaBO fail to converge to the optimum.
  • Figure 2: At each iteration $t$, we select a POMPS using the running average of the UCB evaluated at the points $\mathbf{x}_t$ and $\mathbf{c}_t$, plus an exploration term $\rho_i(n_i)/\sqrt{n_i}$, where $n_i$ is the number of times POMPS $i$ has been chosen so far. We then implement the intervention $\mathbf{X}_{i_t}$, observe the target $y_t$ under this intervention, and update the parameters of the algorithm. For more details, see \ref{alg:coca_bo}.
  • Figure 3: Top: frequency of selecting the corresponding POMPS or POMIS (omitted when there is only one candidate). Bottom: time-normalised cumulative regret $\bar{R}_T$.
  • Figure 4: Causal graph of PSA level. White nodes: intervenable variables; gray nodes: observable variables; shaded node: target variable.
  • Figure 5: Robustness to noise: normalised regret under varying noise levels.
  • ...and 2 more figures
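
The selection rule described in the Figure 2 caption can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `select_scope` and `rho`, the constant exploration function, and the rule of playing every untried scope once are all assumptions introduced for the example. The per-scope UCB values would come from the Gaussian-process model of each scope, which is not modelled here.

```python
import math


def select_scope(ucb_sums, counts, rho):
    """Pick the scope with the largest running-average UCB plus exploration bonus.

    ucb_sums[i]: sum of UCB values recorded for scope i so far.
    counts[i]:   n_i, the number of times scope i has been chosen.
    rho(i, n):   hypothetical exploration function standing in for rho_i(n_i).
    """
    # Assumption: any scope that has never been chosen is tried first,
    # since its running average and bonus are undefined for n_i = 0.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [
        ucb_sums[i] / counts[i] + rho(i, counts[i]) / math.sqrt(counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def constant_rho(i, n):
    # Hypothetical choice; the caption does not specify the form of rho_i.
    return 1.0


# Scope 0: 2.0/4 + 1/sqrt(4) = 1.0; scope 1: 3.0/9 + 1/sqrt(9) = 0.667 (approx.).
chosen = select_scope([2.0, 3.0], [4, 9], constant_rho)
print(chosen)  # 0
```

Because the bonus shrinks as $n_i$ grows, a scope that has been chosen rarely can still be selected even if its running average is lower.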

Theorems & Definitions (20)

  • Definition 1: Mixed policy scope
  • Definition 2
  • Definition 3: Possibly-Optimal MPS (POMPS)
  • Theorem 1
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • ...and 10 more