Black Box Causal Inference: Effect Estimation via Meta Prediction

Lucius E. J. Bynum, Aahlad Manas Puli, Diego Herrero-Quevedo, Nhi Nguyen, Carlos Fernandez-Granda, Kyunghyun Cho, Rajesh Ranganath

TL;DR

This work introduces Black Box Causal Inference (BBCI), a meta-learning framework that treats causal effect estimation as a dataset-level prediction problem. By sampling a family of structural causal models and generating dataset–estimand pairs, BBCI trains a neural predictor to map any observed dataset to a target estimand such as the average treatment effect or conditional average treatment effect, bypassing problem-specific estimator design. The authors provide a theoretical error decomposition and demonstrate that BBCI matches or surpasses dedicated estimators across confounding, instrumental variables, and proximal causal inference settings, including mixed or real-unknown identification. The results on semi-synthetic and real data illustrate BBCI’s versatility, robustness to weak instruments and limited data, and potential to scale to new causal-identification regimes with minimal manual derivation. The work points to future directions in higher-dimensional covariates and uncertainty quantification, positioning BBCI as a practical, generalizable tool for causal effect estimation.

Abstract

Causal inference and the estimation of causal effects play a central role in decision-making across many areas, including healthcare and economics. Estimating causal effects typically requires an estimator that is tailored to each problem of interest. But developing estimators can take significant effort for even a single causal inference setting. For example, algorithms for regression-based estimators, propensity score methods, and doubly robust methods were designed across several decades to handle causal estimation with observed confounders. Similarly, several estimators have been developed to exploit instrumental variables (IVs), including two-stage least squares (TSLS), control functions, and the method of moments. In this work, we instead frame causal inference as a dataset-level prediction problem, offloading algorithm design to the learning process. The approach we introduce, called black box causal inference (BBCI), builds estimators in a black-box manner by learning to predict causal effects from sampled dataset-effect pairs. We demonstrate accurate estimation of average treatment effects (ATEs) and conditional average treatment effects (CATEs) with BBCI across several causal inference problems with known identification, including problems with less developed estimators.
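The idea of learning from sampled dataset-effect pairs can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the linear-Gaussian SCM family in `sample_scm`, the hand-picked moment features in `featurize`, and the least-squares predictor, which stands in for the neural predictor described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_scm(rng):
    """Hypothetical SCM-sampler: parameters of a linear-Gaussian SCM with
    observed confounder x, treatment t, and outcome y."""
    return {"a": rng.uniform(-1, 1),    # strength of x -> t
            "b": rng.uniform(-1, 1),    # strength of x -> y
            "tau": rng.uniform(-2, 2)}  # effect of t on y (the target)

def sample_dataset(scm, n, rng):
    """Draw n observational samples (x, t, y) from the given SCM."""
    x = rng.normal(size=n)
    t = scm["a"] * x + rng.normal(size=n)
    y = scm["tau"] * t + scm["b"] * x + rng.normal(size=n)
    return np.column_stack([x, t, y])

def featurize(data):
    """Permutation-invariant dataset summary (an assumption of this sketch):
    the 6 unique (co)variances plus their 21 pairwise products."""
    cov = np.cov(data, rowvar=False)
    moments = cov[np.triu_indices(3)]
    products = np.outer(moments, moments)[np.triu_indices(6)]
    return np.concatenate([[1.0], moments, products])

def make_pairs(num_datasets, n, rng):
    """Generate (dataset features, true effect) training pairs."""
    feats, effects = [], []
    for _ in range(num_datasets):
        scm = sample_scm(rng)
        feats.append(featurize(sample_dataset(scm, n, rng)))
        effects.append(scm["tau"])
    return np.array(feats), np.array(effects)

X_train, y_train = make_pairs(2000, 200, rng)
X_test, y_test = make_pairs(500, 200, rng)

# Fit the dataset-level predictor: features of a dataset -> causal effect.
weights, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
pred = X_test @ weights

mse_model = float(np.mean((pred - y_test) ** 2))
mse_baseline = float(np.mean((y_train.mean() - y_test) ** 2))
print(f"predictor MSE: {mse_model:.4f}, mean-baseline MSE: {mse_baseline:.4f}")
```

The predictor is never told the adjustment formula for this setting; it only sees pairs of datasets and effects. In this toy family the effect happens to be close to a quadratic function of the second moments, which is why a linear fit on moment products is enough here.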

Paper Structure

This paper contains 34 sections, 2 theorems, 16 equations, 5 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

Let $\phi(S)$ be the true target quantity of interest under a given SCM $S \sim \mathcal{F}$. $\tilde{\phi}(S)$ approximates $\phi(S)$ under a finite resource constraint, such as a finite number of samples for Monte Carlo estimation. Assume $\phi(S) = \tilde{\phi}(S) + \epsilon$, where $\mathbb{E}[\
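The gap between a true target $\phi(S)$ and its finite-resource approximation $\tilde{\phi}(S)$ can be illustrated with a small Monte Carlo sketch. The structural equation in `outcome` and the choice of the ATE as the target are assumptions made for illustration; they are picked so that the true value is known in closed form.

```python
import numpy as np

def outcome(t, x, noise):
    # Hypothetical structural equation for y; the effect of t depends on x.
    return t * (1.0 + x ** 2) + np.sin(x) + noise

def phi_tilde(num_samples, rng):
    """Monte Carlo approximation of the ATE, E[y | do(t=1)] - E[y | do(t=0)]."""
    x = rng.normal(size=num_samples)
    noise = rng.normal(size=num_samples)
    y1 = outcome(1.0, x, noise)  # intervene: set t = 1 for every unit
    y0 = outcome(0.0, x, noise)  # intervene: set t = 0 for every unit
    return float(np.mean(y1 - y0))

# For x ~ N(0, 1), the true ATE is E[1 + x^2] = 1 + Var(x) = 2.
phi_true = 2.0

rng = np.random.default_rng(0)
approx_small = phi_tilde(100, rng)      # few samples: noisier approximation
approx_large = phi_tilde(100_000, rng)  # many samples: closer to phi_true
print(approx_small, approx_large, phi_true)
```

With more Monte Carlo samples the approximation error shrinks, which corresponds to the error term $\epsilon$ in the assumption $\phi(S) = \tilde{\phi}(S) + \epsilon$.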

Figures (5)

  • Figure 1: Visual depiction of per-dataset algorithm development (left) compared to BBCI (right).
  • Figure 2: Example DAGs for several settings with known identification of the effect of $\tilde{t}$ on $\tilde{y}$, including: (a) the confounding case with observed confounder $\tilde{x}$; (b) the case with instrument $\tilde{u}_t$, where $\tilde{x}$ is unobserved; and (c) a proximal causal inference case with two proxies, $\tilde{w}_1$ and $\tilde{w}_2$.
  • Figure 3: Predicted values from the two approaches for various strengths of the instrument on datasets of size $100$. Instrument strength $\rho_{\boldsymbol{\epsilon}_t}$ is measured as the absolute correlation between the instrument and the treatment. BBCI performs as well as or better than TSLS regardless of instrument strength. It is strictly better when the IV is weak, where TSLS estimates can vary wildly.
  • Figure 4: Effect estimates for various SCM-samplers with binary treatment and nonlinear outcomes in the confounding case, where the target is normalized by the observed outcome's 95% and 5% quantiles. The combined SCM-sampler samples from all 3 response surfaces with equal probability.
  • Figure 5: BBCI estimates for LaLonde data [Lalonde1984EvaluatingTE], comparing randomized estimates of the treatment effect to observational estimates of the treatment effect across different source dataset sizes: the original study size $N=445$ as well as two smaller subsets, $N=200$ and $N=100$.

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 1
  • proof