Counterfactual Realizability

Arvind Raghavan, Elias Bareinboim

TL;DR

This paper introduces a formal notion of realizability for Layer 3 ($\mathcal{L}_3$) counterfactual distributions and a complete algorithm, CTF-REALIZE, that decides whether a given counterfactual distribution can be physically sampled under fundamental constraints, such as the inability to subject the same unit to two different experimental conditions. By extending Fisherian experimentation with counterfactual randomization and counterfactual mediators, the authors show how certain $\mathcal{L}_3$-quantities can be estimated directly, and they provide a graphical criterion (the counterfactual-ancestor condition) for realizability under the maximal action set $\mathbb{A}^\dag(\mathcal{G})$. They prove the algorithm is correct and complete, and illustrate it with examples from causal fairness and causal RL, where a counterfactual strategy provably dominates observational ($\mathcal{L}_1$) and interventional ($\mathcal{L}_2$) baselines. The work lays a foundation for novel experiment designs that enable direct estimation of otherwise nonidentifiable quantities, with implications for personalized decision-making and interpretability. The framework also clarifies limitations and suggests directions for integrating partial identification and sequential decision-making under known causal structure.

Abstract

It is commonly believed that, in a real-world environment, samples can only be drawn from observational and interventional distributions, corresponding to Layers 1 and 2 of the Pearl Causal Hierarchy. Layer 3, representing counterfactual distributions, is believed to be inaccessible by definition. However, Bareinboim, Forney, and Pearl (2015) introduced a procedure that allows an agent to sample directly from a counterfactual distribution, leaving open the question of what other counterfactual quantities can be estimated directly via physical experimentation. We resolve this by introducing a formal definition of realizability, the ability to draw samples from a distribution, and then developing a complete algorithm to determine whether an arbitrary counterfactual distribution is realizable given fundamental physical constraints, such as the inability to go back in time and subject the same unit to a different experimental condition. We illustrate the implications of this new framework for counterfactual data collection using motivating examples from causal fairness and causal reinforcement learning. While the baseline approach in these motivating settings typically follows an interventional or observational strategy, we show that a counterfactual strategy provably dominates both.

Paper Structure

This paper contains 44 sections, 16 theorems, 30 equations, 23 figures, 2 tables, and 6 algorithms.

Key Result

Theorem 3.5

An $\mathcal{L}_3$-distribution $Q = P(\mathbf{W}_\star)$ is realizable given action set $\mathbb{A}$ and causal diagram $\mathcal{G}$ iff the algorithm CTF-REALIZE($Q, \mathcal{G}, \mathbb{A}$) returns a sample. $\blacksquare$

Figures (23)

  • Figure 1: It is commonly assumed an agent can sample only from $\mathcal{L}_1$ and $\mathcal{L}_2$ distributions in the real world.
  • Figure 2: (Top) Causal diagram with decision variable $X$; (Middle) Fisherian randomization, which overrides the unit's natural decision $X$ and assigns a fixed value; (Bottom) Randomizing the actual decision affecting $Y$ without erasing the unit's natural decision $X$.
  • Figure 3: (a) "Expanded" diagram for Example 1, where $W$ is a counterfactual mediator for $X$; (b) Randomizing the value of $X$ as perceived by $Y$.
  • Figure 4: Testing realizability of $P(Z_x, W_t)$ for $\mathcal{G}_1$ (left) and $\mathcal{G}_2$ (right). $\mathcal{G}_1$ yields conflicting requirements.
  • Figure 5: Pearl Causal Hierarchy (PCH) induced by an unknown SCM $\mathcal{M}$. An $\mathcal{L}_3$-distribution is realizable given a graph $\mathcal{G}$ and the maximal feasible action set $\mathbb{A}^\dag(\mathcal{G})$ iff the ancestor set $An(\mathbf{W}_\star)$ does not contain the same variable under different regimes.
  • ...and 18 more figures
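
The criterion in the Figure 5 caption (no variable appears in $An(\mathbf{W}_\star)$ under two different regimes) can be sketched in a few lines of Python. This is a minimal illustration, not the paper's CTF-REALIZE algorithm. The function names are hypothetical, and the rule used to compute counterfactual ancestors is an assumption based on one reading of the definition attributed to Correa et al. (2021): for $Y_{\mathbf{x}}$, take each non-intervened ancestor $W$ of $Y$ after cutting edges leaving $\mathbf{X}$, and keep only the part of $\mathbf{x}$ that is still an ancestor of $W$ after cutting edges entering $\mathbf{X}$. The sketch also assumes the queried variable is not itself in its own regime and that every action in $\mathbb{A}^\dag(\mathcal{G})$ is available.

```python
from collections import defaultdict

def ancestors(edges, node):
    """Return `node` plus every variable with a directed path into it."""
    parents = defaultdict(set)
    for src, dst in edges:
        parents[dst].add(src)
    seen, stack = {node}, [node]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def ctf_ancestors(edges, var, regime):
    """Counterfactual ancestors of var_regime as (variable, regime) pairs.

    ASSUMPTION: this rule is our reading of the Correa et al. (2021)
    definition, not code taken from the paper.
    """
    xs = set(regime)
    g_under = [(a, b) for a, b in edges if a not in xs]  # cut edges leaving X
    g_over = [(a, b) for a, b in edges if b not in xs]   # cut edges entering X
    result = set()
    for w in ancestors(g_under, var) - xs:
        an_w = ancestors(g_over, w)
        # Keep only the interventions that can still influence w.
        z = frozenset((x, v) for x, v in regime.items() if x in an_w)
        result.add((w, z))
    return result

def passes_ancestor_condition(edges, query):
    """True iff no variable shows up under two different regimes."""
    regimes = defaultdict(set)
    for var, regime in query:
        for w, z in ctf_ancestors(edges, var, regime):
            regimes[w].add(z)
    return all(len(zs) == 1 for zs in regimes.values())

# Toy graph X -> Y (illustrative only).
edges = [("X", "Y")]
# P(Y_{x=0}, Y_{x=1}): Y is needed under two regimes, so the check fails.
same_unit_twice = passes_ancestor_condition(edges, [("Y", {"X": 0}), ("Y", {"X": 1})])
# P(Y_{x=1}, X): Y under x=1 alongside the natural X, so the check passes.
natural_and_forced = passes_ancestor_condition(edges, [("Y", {"X": 1}), ("X", {})])
```

On the toy graph, the first query corresponds to the setting named in Corollary 3.8 (the fundamental problem of causal inference), and the second to the kind of distribution sampled by the Bareinboim, Forney, and Pearl (2015) procedure mentioned in the abstract.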

Theorems & Definitions (48)

  • Definition 2.1: Physical actions
  • Definition 2.2: Counterfactual mediator (informal)
  • Definition 2.3: Counterfactual (ctf-) randomization
  • Remark 3.2
  • Definition 3.3: I.i.d. sample
  • Definition 3.4: Realizability
  • Theorem 3.5: Correctness and Completeness
  • Definition 3.6: Ancestors of a counterfactual (Correa et al., 2021)
  • Corollary 3.7
  • Corollary 3.8: Fundamental problem of causal inference (FPCI) (Holland, 1986)
  • ...and 38 more