ABC3: Active Bayesian Causal Inference with Cohn Criteria in Randomized Experiments

Taehun Cha, Donghun Lee

TL;DR

This work tackles the challenge of efficiently designing randomized causal experiments by treating subject and treatment selection as a Bayesian active-learning problem. ABC3 uses Gaussian processes to minimize the integrated posterior variance, aligning with the Cohn criteria, and proves that this variance-minimizing policy also reduces both treatment-control imbalance and the probability of Type I error. Theoretical results connect CATE estimation error to uncertainty measures and MMD bounds, while empirical results on real-world datasets show ABC3 achieves higher sampling efficiency and robustness against hyperparameter choices. The approach offers a principled, uncertainty-aware framework for selective data acquisition in causal inference with practical implications for costly experiments.

Abstract

In causal inference, the randomized experiment is the de facto method for overcoming various theoretical issues in observational studies. However, experiments are expensive to run, so an efficient experimental design is necessary. We propose ABC3, a Bayesian active learning policy for causal inference. We show that a policy minimizing the estimation error of the conditional average treatment effect (CATE) is equivalent to one minimizing an integrated posterior variance, similar to the Cohn criteria \citep{cohn1994active}. We theoretically prove that ABC3 also minimizes the imbalance between the treatment and control groups and the type 1 error probability. The imbalance-minimizing characteristic is especially notable, as several works have emphasized the importance of achieving balance. In extensive experiments on real-world data sets, ABC3 achieves the highest efficiency, and the experiments empirically confirm that the theoretical results hold.
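To make the variance-minimizing idea concrete, the following is a minimal sketch of a greedy, Cohn-style selection rule, not the authors' algorithm. It assumes one independent Gaussian process per treatment arm with a fixed RBF kernel, a finite population `X`, and a known noise variance; the function names, the kernel, and all parameter values are illustrative assumptions. It relies on the standard fact that, for fixed hyperparameters, a GP's posterior variance does not depend on the observed outcomes, so the effect of a candidate (subject, treatment) pair on the total variance can be scored before running the experiment.

```python
import numpy as np

def rbf_kernel(X, Z, length_scale=1.0):
    """Squared-exponential kernel matrix between rows of X and rows of Z."""
    sq_dist = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def select_by_integrated_variance(X, budget, noise_var=0.1, length_scale=1.0):
    """Greedily pick (subject, treatment) pairs that most reduce the summed
    GP posterior variance over the population, across both arms.
    Assumes budget <= number of subjects (hypothetical helper)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, length_scale)
    # One posterior covariance per arm (0 = control, 1 = treatment),
    # both initialized to the prior covariance.
    cov = [K.copy(), K.copy()]
    available = np.ones(n, dtype=bool)  # each subject is assigned at most once
    picks = []
    for _ in range(budget):
        best = None
        for a in (0, 1):
            C = cov[a]
            # Drop in total variance if subject i is observed under arm a:
            # sum_x C(x, i)^2 / (C(i, i) + noise_var)
            gain = (C**2).sum(axis=0) / (np.diag(C) + noise_var)
            gain[~available] = -np.inf
            i = int(np.argmax(gain))
            if best is None or gain[i] > best[0]:
                best = (gain[i], i, a)
        _, i, a = best
        C = cov[a]
        # Rank-one posterior update for one noisy observation at X[i].
        cov[a] = C - np.outer(C[:, i], C[i, :]) / (C[i, i] + noise_var)
        available[i] = False
        picks.append((i, a))
    return picks, cov

# Usage on synthetic covariates (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
picks, cov = select_by_integrated_variance(X, budget=10)
total_variance = np.trace(cov[0]) + np.trace(cov[1])
```

The sum over population points stands in for the integral in the integrated posterior variance. A continuous covariate space would need a quadrature or Monte Carlo approximation instead.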

Paper Structure

This paper contains 17 sections, 4 theorems, 11 equations, 6 figures, 1 table, and 1 algorithm.

Key Result

Theorem 4.1

Assume $|k(x, x')| < \infty$ and $|y^a_i| < \infty$ for all $x, x' \in \mathcal{X}$ and all $a, i$; as a result, $\epsilon^{\Omega}_{PEHE}(\hat{CATE}_t) < \infty, \forall t$. Let our estimator be $\hat{CATE}_t(x) = \hat{y}^1_t(x) - \hat{y}^0_t(x)$, where $\hat{y}^a_t(x)=\mathbb{E}_t\left[ Y^a(x) \right]$ i
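As a rough illustration of the quantities named in the theorem, the sketch below computes a CATE estimate as the difference of two predicted potential outcomes and scores it with an empirical PEHE, taken here as the mean squared error against the true CATE over a finite population. This is the commonly used empirical form and is an assumption; the paper's exact definition of $\epsilon^{\Omega}_{PEHE}$ is not reproduced in this excerpt, and the function names and numbers are hypothetical.

```python
import numpy as np

def cate_estimate(y1_hat, y0_hat):
    """CATE estimate per subject: predicted treated minus predicted control outcome."""
    return np.asarray(y1_hat, dtype=float) - np.asarray(y0_hat, dtype=float)

def empirical_pehe(cate_hat, cate_true):
    """Mean squared error between estimated and true CATE (assumed empirical form)."""
    diff = np.asarray(cate_hat, dtype=float) - np.asarray(cate_true, dtype=float)
    return float(np.mean(diff**2))

# Toy values, for illustration only.
cate_hat = cate_estimate([2.0, 3.0, 1.0], [1.0, 1.0, 1.0])  # -> [1., 2., 0.]
score = empirical_pehe(cate_hat, [1.0, 1.0, 1.0])           # errors are [0, 1, -1]
```

Some works report the square root of this quantity instead, so the scale should be checked against the paper before comparing numbers.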

Figures (6)

  • Figure 1: Mean of $\epsilon_{PEHE}$. $x$-axis represents the observed percentage of the population. We measure $\epsilon_{PEHE}$ for every 10% observation.
  • Figure 2: Mean and standard deviation of MMD between observed treatment and control groups, $\mathbb{P}^1_t$ and $\mathbb{P}^0_t$. The blue line is for ABC3, and the orange line is for Naive policy. $x$-axis is for the sampled ratio and $y$-axis is MMD.
  • Figure 3: Mean of the type 1 error rate.
  • Figure 4: $\epsilon_{PEHE}$ for every kernel and kernel parameter. The numbers in parenthesis are kernel parameters.
  • Figure 5: $\epsilon_{PEHE}$ for different $\sigma^0_{\epsilon}:\sigma^1_{\epsilon}$.
  • ...and 1 more figure
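The balance measure in Figure 2 can be sketched as follows. This is a minimal example of the standard biased (V-statistic) estimator of the maximum mean discrepancy between the covariates of the treatment and control groups, assuming an RBF kernel with a fixed length scale. The kernel choice, the function names, and the synthetic data are assumptions and may differ from the paper's setup.

```python
import numpy as np

def rbf_kernel(X, Z, length_scale=1.0):
    """Squared-exponential kernel matrix between rows of X and rows of Z."""
    sq_dist = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def mmd(X_treat, X_ctrl, length_scale=1.0):
    """Biased MMD estimate between two samples of covariates."""
    k_tt = rbf_kernel(X_treat, X_treat, length_scale).mean()
    k_cc = rbf_kernel(X_ctrl, X_ctrl, length_scale).mean()
    k_tc = rbf_kernel(X_treat, X_ctrl, length_scale).mean()
    # Clip at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(k_tt + k_cc - 2.0 * k_tc, 0.0)))

# Synthetic groups, for illustration only.
rng = np.random.default_rng(0)
treated = rng.normal(loc=0.0, size=(100, 2))
balanced_control = rng.normal(loc=0.0, size=(100, 2))
shifted_control = rng.normal(loc=3.0, size=(100, 2))

mmd_balanced = mmd(treated, balanced_control)
mmd_shifted = mmd(treated, shifted_control)
```

A smaller value indicates that the two groups have more similar covariate distributions, which is the sense of "balance" the figure tracks as the sampled ratio grows.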

Theorems & Definitions (8)

  • Definition 3.1
  • Theorem 4.1
  • Proposition 4.2
  • Definition 4.3
  • Remark 4.4
  • Theorem 4.5
  • Definition 4.6
  • Theorem 4.7