Causal Bayesian Optimization with Unknown Graphs

Jean Durand, Yashas Annadani, Stefan Bauer, Sonali Parbhoo

TL;DR

This work introduces CBO-U, a scalable causal Bayesian optimization framework for scenarios where the causal graph is unknown. By learning a Bayesian posterior over the direct parents of the target variable $Y$ and employing interventions on these parents, the method achieves optimization performance equivalent to full-graph methods while scaling to graphs with up to 100 nodes. It provides a closed-form posterior in the linear case and a GP-based random Fourier feature approximation for nonlinear cases, coupled with a prior over parent sets derived from doubly robust causal feature selection. The approach leverages interventional data and do-calculus to refine both the surrogate model and the causal-parent posterior, demonstrating competitive results across synthetic, semi-synthetic, and real-world-like networks. The framework broadens the applicability of CBO in real-world domains where causal structure is incomplete or uncertain, with a clear path for future extensions to broader intervention types and partial discovery.

Abstract

Causal Bayesian Optimization (CBO) is a methodology designed to optimize an outcome variable by leveraging known causal relationships through targeted interventions. Traditional CBO methods require a fully and accurately specified causal graph, which is a limitation in many real-world scenarios where such graphs are unknown. To address this, we propose a new method for the CBO framework that operates without prior knowledge of the causal graph. Consistent with causal bandit theory, we demonstrate through theoretical analysis that focusing on the direct causal parents of the target variable is sufficient for optimization, and provide empirical validation in the context of CBO. Furthermore, we introduce a new method that learns a Bayesian posterior over the direct parents of the target variable. This allows us to optimize the outcome variable while simultaneously learning the causal structure. Our contributions include a derivation of the closed-form posterior distribution for the linear case. In the nonlinear case, where the posterior is not tractable, we present a Gaussian Process (GP) approximation that still enables CBO by inferring the parents of the outcome variable. The proposed method performs competitively with existing benchmarks and scales well to larger graphs, making it a practical tool for real-world applications where causal information is incomplete.
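The TL;DR describes the nonlinear case as a GP-based random Fourier feature approximation. As a rough illustration of that general technique (not the paper's implementation), the sketch below builds random Fourier features for an RBF kernel and compares the resulting feature inner products against the exact kernel. The function names `rff_features` and `rbf_kernel`, the choice of an RBF kernel, and the lengthscale and feature count are all assumptions made for this example.

```python
import numpy as np


def rff_features(X, num_features, lengthscale, rng):
    """Map inputs to random Fourier features for an RBF kernel.

    Assumed construction: frequencies W ~ N(0, 1 / lengthscale^2) and
    phases b ~ Uniform(0, 2*pi), so that z(x) . z(x') approximates
    exp(-||x - x'||^2 / (2 * lengthscale^2)).
    """
    dim = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(dim, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)


def rbf_kernel(X, lengthscale):
    """Exact RBF kernel matrix, used only to check the approximation."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale**2)


# Hypothetical usage: compare the approximate and exact kernel matrices.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
Z_demo = rff_features(X_demo, num_features=20000, lengthscale=1.5, rng=rng)
max_error = np.max(np.abs(Z_demo @ Z_demo.T - rbf_kernel(X_demo, 1.5)))
print(f"max absolute kernel error: {max_error:.4f}")
```

With features of this form, a GP with the corresponding kernel can be approximated by a Bayesian linear model in the feature space, which is one reason such approximations pair naturally with closed-form linear updates.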

Paper Structure

This paper contains 38 sections, 2 theorems, 34 equations, 22 figures, 1 algorithm.

Key Result

Theorem 4.4

The posterior probability of $\boldsymbol{X}_s$ being the set of parents of $Y$ for a sample $\{\boldsymbol{x}, y \}$ is where $\boldsymbol{\theta}_s \mid \boldsymbol{g}_s \sim \mathcal{N}(\boldsymbol{\mu}_{\text{prior}}, \Sigma_{\text{prior}})$, $C = \left( \Sigma_{\text{prior}}^{-1} + \frac{\boldsymbol{x}_s \boldsymbol{x}_s^\top}{\sigma^2_Y}\right)^{-1}$ and $\boldsymbol{b} = \left(\frac{y \bol
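As a rough illustration of the kind of update this theorem describes, the sketch below scores candidate parent sets with the standard conjugate Bayesian linear regression update. The covariance update matches the expression for $C$ above; everything else is an assumption made for this example rather than the paper's stated formula: the posterior mean expression, the zero-mean isotropic Gaussian prior on the weights, the uniform prior over candidate sets, the known noise variance, and the function names `log_predictive`, `update_weights`, and `parent_set_posterior`.

```python
import itertools

import numpy as np


def log_predictive(x_s, y, mu, Sigma, noise_var):
    """log p(y | x_s) with the linear weights integrated out."""
    mean = x_s @ mu
    var = x_s @ Sigma @ x_s + noise_var
    return -0.5 * (np.log(2.0 * np.pi * var) + (y - mean) ** 2 / var)


def update_weights(x_s, y, mu, Sigma, noise_var):
    """Conjugate Gaussian update of the weights for one sample (x_s, y)."""
    prior_precision = np.linalg.inv(Sigma)
    C = np.linalg.inv(prior_precision + np.outer(x_s, x_s) / noise_var)
    new_mu = C @ (prior_precision @ mu + y * x_s / noise_var)
    return new_mu, C


def parent_set_posterior(X, y, candidate_sets, noise_var, prior_var=1.0):
    """Posterior over candidate parent sets, starting from a uniform prior."""
    log_post = np.full(len(candidate_sets), -np.log(len(candidate_sets)))
    for k, parents in enumerate(candidate_sets):
        idx = list(parents)
        mu = np.zeros(len(idx))
        Sigma = prior_var * np.eye(len(idx))
        # Process samples one at a time: score, then update the weights.
        for x_row, y_val in zip(X, y):
            x_s = x_row[idx]
            log_post[k] += log_predictive(x_s, y_val, mu, Sigma, noise_var)
            mu, Sigma = update_weights(x_s, y_val, mu, Sigma, noise_var)
    log_post -= log_post.max()  # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()


# Hypothetical usage: Y depends linearly on X_0 and X_2 only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 2] + 0.3 * rng.normal(size=200)
sets_demo = [s for r in (1, 2, 3) for s in itertools.combinations(range(3), r)]
post_demo = parent_set_posterior(X_demo, y_demo, sets_demo, noise_var=0.09)
for parents, prob in zip(sets_demo, post_demo):
    print(parents, round(float(prob), 4))
```

Enumerating every subset is only feasible for a handful of variables; it is used here to keep the example short, not as a suggestion for how the method scales to larger graphs.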

Figures (22)

  • Figure 1: An overview of the iterative process of the method where the interventions are used to update beliefs about the surrogate model and the direct parents.
  • Figure 2: Causal graph showing assumptions between variables for optimality of hard interventions.
  • Figure 3: Results on the $Y^* \downarrow$ and the $\Bar{Y} \downarrow$ metrics across 10 randomly initialized $\mathcal{D}_{\text{obs}}$ and $\mathcal{D}_{\text{int}}$ for randomly generated nonlinear Erdős-Rényi graphs of sizes 10, 15, 20, 50, and 100. Each algorithm was run for 50 trials. The top row shows the results for the $Y^*$ case and the bottom row shows the results for the $\Bar{Y}$ case.
  • Figure 4: This figure shows the proportion of times each algorithm correctly selected interventions that were direct parents of the target across 10 different iterations of the algorithm for the nonlinear Erdős-Rényi graphs. CBO-U successfully identifies the optimal interventions the vast majority of the time.
  • Figure 5: Results on the $Y^* \downarrow$ and the $\Bar{Y} \downarrow$ metrics across 10 randomly initialized $\mathcal{D}_{\text{obs}}$ and $\mathcal{D}_{\text{int}}$ for benchmark examples. Each algorithm was run for 30 trials. The top row shows the results for the $Y^*$ case and the bottom row shows the results for the $\Bar{Y}$ case. The dashed horizontal line shows the mean of CBO-U, which performs better on average than CEO and BO.
  • ...and 17 more figures

Theorems & Definitions (2)

  • Theorem 4.4: Posterior update rule for the linear SCM
  • Proposition 1.1: Optimality of hard interventions