Observationally Informed Adaptive Causal Experimental Design

Erdun Gao, Liang Zhang, Jake Fawkes, Aoqi Zuo, Wenqin Liu, Haoxuan Li, Mingming Gong, Dino Sejdinovic

TL;DR

This work proposes Active Residual Learning, a new paradigm that leverages the observational model as a foundational prior, and introduces the R-Design framework together with R-EPIG, a unified acquisition criterion that directly targets the causal estimand, minimizing residual uncertainty for estimation or clarifying decision boundaries for policy.

Abstract

Randomized Controlled Trials (RCTs) represent the gold standard for causal inference yet remain a scarce resource. While large-scale observational data is often available, it is utilized only for retrospective fusion, and remains discarded in prospective trial design due to bias concerns. We argue this "tabula rasa" data acquisition strategy is fundamentally inefficient. In this work, we propose Active Residual Learning, a new paradigm that leverages the observational model as a foundational prior. This approach shifts the experimental focus from learning target causal quantities from scratch to efficiently estimating the residuals required to correct observational bias. To operationalize this, we introduce the R-Design framework. Theoretically, we establish two key advantages: (1) a structural efficiency gap, proving that estimating smooth residual contrasts admits strictly faster convergence rates than reconstructing full outcomes; and (2) information efficiency, where we quantify the redundancy in standard parameter-based acquisition (e.g., BALD), demonstrating that such baselines waste budget on task-irrelevant nuisance uncertainty. We propose R-EPIG (Residual Expected Predictive Information Gain), a unified criterion that directly targets the causal estimand, minimizing residual uncertainty for estimation or clarifying decision boundaries for policy. Experiments on synthetic and semi-synthetic benchmarks demonstrate that R-Design significantly outperforms baselines, confirming that repairing a biased model is far more efficient than learning one from scratch.
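
The residual idea described in the abstract can be illustrated with a toy sketch. This is not the authors' R-Design implementation and it performs no active acquisition (no R-EPIG). It only shows, under invented assumptions, why correcting a biased observational model can need far fewer trial samples than learning the effect from scratch. The functions `base`, `true_cate`, `obs_bias`, and `mu_obs` are hypothetical stand-ins, and the linear residual learner is an assumption chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth (illustration only, not from the paper).
def base(x):
    return np.cos(3.0 * x)               # outcome surface shared by both arms

def true_cate(x):
    return np.sin(8.0 * x) + 0.5 * x     # high-frequency treatment effect

def obs_bias(x):
    return 0.8 - 0.6 * x                 # smooth bias, e.g. from confounding

def mu_obs(x, t):
    """Stand-in for an outcome model fit on large observational data.
    It captures the fine structure but is biased in the treated arm."""
    return base(x) + t * (true_cate(x) + obs_bias(x))

# A small randomized trial: 40 units, 20 per arm.
n = 40
x = rng.uniform(0.0, 1.0, size=n)
t = rng.permutation(np.repeat([0, 1], n // 2))
y = base(x) + t * true_cate(x) + rng.normal(0.0, 0.1, size=n)

def fit_linear(x_fit, target):
    """Least-squares line; returns a callable predictor."""
    slope, intercept = np.polyfit(x_fit, target, deg=1)
    return lambda q: slope * q + intercept

# Residual learning: treat the observational model as a fixed offset and
# fit only what it gets wrong, separately in each arm.
resid = y - mu_obs(x, t)
delta = {arm: fit_linear(x[t == arm], resid[t == arm]) for arm in (0, 1)}

# Baseline: the same linear learner applied to raw outcomes (from scratch).
scratch = {arm: fit_linear(x[t == arm], y[t == arm]) for arm in (0, 1)}

grid = np.linspace(0.0, 1.0, 200)
tau = true_cate(grid)
tau_obs = mu_obs(grid, 1) - mu_obs(grid, 0)
tau_corrected = tau_obs + delta[1](grid) - delta[0](grid)
tau_scratch = scratch[1](grid) - scratch[0](grid)

def pehe(estimate):
    """Mean squared error of a CATE estimate over the evaluation grid."""
    return float(np.mean((estimate - tau) ** 2))

pehe_obs = pehe(tau_obs)
pehe_corrected = pehe(tau_corrected)
pehe_scratch = pehe(tau_scratch)
print(f"observational only: {pehe_obs:.4f}")
print(f"residual-corrected: {pehe_corrected:.4f}")
print(f"from scratch:       {pehe_scratch:.4f}")
```

In this construction the residual is linear while the effect itself is not, so a learner with the same capacity succeeds on the residual and fails on the raw outcomes. Whether such a complexity gap holds in practice depends on the data.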

Paper Structure

This paper contains 119 sections, 4 theorems, 82 equations, 24 figures, 5 tables, 1 algorithm.

Key Result

Lemma 1

Assume the observational base learner $\hat{\mu}_o$ and the experimental residual learner $\hat{\delta}$ employ priors that are optimally adapted to the smoothness of their respective target functions. Under standard assumptions (overlap and experimental unconfoundedness), the expected PEHE risk ... where $C_1, C_2 > 0$ are domain-dependent constants and $\epsilon_{\textup{approx}}$ denotes the ir...
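
For reference, the PEHE (Precision in Estimation of Heterogeneous Effect) risk named in the lemma is conventionally defined as the expected squared error of the CATE estimate. The notation below ($\tau$, $\hat{\tau}$, $Y(1)$, $Y(0)$, $X$) is the standard one and is assumed here; the paper's own symbols may differ.

$$
\epsilon_{\textup{PEHE}} = \mathbb{E}_{x}\left[\left(\hat{\tau}(x) - \tau(x)\right)^2\right], \qquad \tau(x) = \mathbb{E}\left[Y(1) - Y(0) \mid X = x\right].
$$

Figures that report $\sqrt{\text{PEHE}}$ plot the square root of this quantity.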

Figures (24)

  • Figure 1: R-Design intuition on 1D synthetic data. The observational prior is biased (Left) but captures high-frequency structure (Middle-Left). Treating it as a fixed offset, R-Design learns a simpler residual function (Middle-Right), exploiting the complexity gap to recover the true CATE with few samples (Right), while the naive observational model fails due to uncorrected bias.
  • Figure 2: Causal diagrams of data-generating processes.
  • Figure 3: Comparison of acquisition functions using the average rank metric and relative performance improvements for effect estimation and policy learning tasks over eight combinations of base effect functions, bias functions, and CATE functions.
  • Figure 4: Comparison of $\sqrt{\text{PEHE}}$ on the simulation dataset for two trial estimators (CMGP and NSGP). Panels (1,3): CMGP and NSGP; panels (2,4): CMGP and NSGP with heavy covariate shift.
  • Figure 5: Performance comparison of all methods on the policy learning task: (a) APE, (b) AR.
  • ...and 19 more figures

Theorems & Definitions (11)

  • Remark 1
  • Remark 2: Information Gain vs. Direct Error Reduction
  • Remark 3: The Intuition of Structural Efficiency
  • Lemma 1
  • Proposition 1: Objective Alignment
  • Proposition 2: Information Redundancy
  • Theorem 2: Uncertainty Convergence
  • Definition 1: Maximal Valid Design
  • Definition 2: Feasible Limit
  • proof
  • ...and 1 more