Partial Counterfactual Identification of Continuous Outcomes with a Curvature Sensitivity Model

Valentyn Melnychuk; Dennis Frauen; Stefan Feuerriegel

TL;DR

This work tackles partial counterfactual identification for continuous outcomes in Markovian SCMs, where point identification is generally impossible without strong assumptions. It shows that, once these assumptions are mildly relaxed (latent noise of arbitrary dimension and non-monotone functions), the ignorance interval for the expected counterfactual outcome of the untreated (ECOU), $Q^{\mathcal{M}}_{a'\to a}(y')$, has non-informative bounds. To recover informative bounds, the authors introduce the Curvature Sensitivity Model (CSM), which constrains the curvature of the level-set manifolds of the SCM functions via an upper bound $\kappa$ on their principal curvatures; a smaller $\kappa$ yields tighter bounds, and $\kappa=0$ corresponds to bijective generation mechanisms (BGMs), which are point-identifiable in a restricted sense. The authors instantiate the approach with the Augmented Pseudo-Invertible Decoder (APID), a deep generative model built from residual normalizing flows and variational augmentations that supports abduction-action-prediction and penalizes curvature during training. Empirical results on synthetic data and a COVID-19 case study illustrate that APID can yield informative partial counterfactual bounds and suggest practical applicability for decision-making in safety-critical settings. According to the authors, this is the first partial identification framework for continuous outcomes in Markovian SCMs; it uses curvature constraints to connect the theory with scalable inference.

Abstract

Counterfactual inference aims to answer retrospective "what if" questions and thus belongs to the most fine-grained type of inference in Pearl's causality ladder. Existing methods for counterfactual inference with continuous outcomes aim at point identification and thus make strong and unnatural assumptions about the underlying structural causal model. In this paper, we relax these assumptions and aim at partial counterfactual identification of continuous outcomes, i.e., when the counterfactual query resides in an ignorance interval with informative bounds. We prove that, in general, the ignorance interval of the counterfactual queries has non-informative bounds, already when functions of structural causal models are continuously differentiable. As a remedy, we propose a novel sensitivity model called Curvature Sensitivity Model. This allows us to obtain informative bounds by bounding the curvature of level sets of the functions. We further show that existing point counterfactual identification methods are special cases of our Curvature Sensitivity Model when the bound of the curvature is set to zero. We then propose an implementation of our Curvature Sensitivity Model in the form of a novel deep generative model, which we call Augmented Pseudo-Invertible Decoder. Our implementation employs (i) residual normalizing flows with (ii) variational augmentations. We empirically demonstrate the effectiveness of our Augmented Pseudo-Invertible Decoder. To the best of our knowledge, ours is the first partial identification model for Markovian structural causal models with continuous outcomes.
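To make "curvature of level sets" concrete, the following is a minimal sketch for two-dimensional latent noise, where a level set of the outcome function is a curve with a single principal curvature. The functions `flat` and `round_` are hypothetical stand-ins for $f_Y(a, \cdot)$ at a fixed treatment, and the finite-difference estimator is only an illustration of the quantity being bounded by $\kappa$; it is not the curvature penalty used in the paper's implementation.

```python
import math

def level_set_curvature(f, u1, u2, h=1e-4):
    """Curvature of the level curve of f passing through (u1, u2).

    Uses the implicit-curve formula
        |f_xx f_y^2 - 2 f_xy f_x f_y + f_yy f_x^2| / (f_x^2 + f_y^2)^(3/2)
    with central finite differences for all derivatives.
    """
    fx = (f(u1 + h, u2) - f(u1 - h, u2)) / (2 * h)
    fy = (f(u1, u2 + h) - f(u1, u2 - h)) / (2 * h)
    fxx = (f(u1 + h, u2) - 2 * f(u1, u2) + f(u1 - h, u2)) / h**2
    fyy = (f(u1, u2 + h) - 2 * f(u1, u2) + f(u1, u2 - h)) / h**2
    fxy = (f(u1 + h, u2 + h) - f(u1 + h, u2 - h)
           - f(u1 - h, u2 + h) + f(u1 - h, u2 - h)) / (4 * h**2)
    grad_sq = fx**2 + fy**2
    if grad_sq == 0.0:
        raise ValueError("gradient vanishes; level-set curvature is undefined here")
    numerator = abs(fxx * fy**2 - 2 * fxy * fx * fy + fyy * fx**2)
    return numerator / grad_sq**1.5

# Hypothetical outcome functions (assumptions for illustration only).
def flat(u1, u2):
    # Level sets are straight lines, so the curvature is zero everywhere.
    return 2.0 * u1 + 3.0 * u2

def round_(u1, u2):
    # Level sets are circles of radius sqrt(y), so the curvature is 1 / sqrt(y).
    return u1**2 + u2**2

if __name__ == "__main__":
    print(level_set_curvature(flat, 0.3, 0.4))
    print(level_set_curvature(round_, 0.3, 0.4))
```

In this sketch, `flat` would satisfy any curvature bound, including $\kappa = 0$, whereas `round_` at the point $(0.3, 0.4)$ lies on a circle of radius $0.5$ and would only be admitted for $\kappa \geq 2$.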

Paper Structure (31 sections, 15 theorems, 81 equations, 16 figures, 2 tables, 1 algorithm)


Introduction
Related Work
Partial Counterfactual Identification of Continuous Outcomes
Preliminaries
Counterfactual Non-Identifiability
Partial Counterfactual Identification and Non-Informative Bounds
Curvature Sensitivity Model
Augmented Pseudo-Invertible Decoder
Experiments
Discussion
Extended Related Work
Counterfactual inference
Identifiability of latent variable models and disentanglement
Background materials
Examples
...and 16 more sections

Key Result

Lemma 1

Let $\mathcal{M} \in \mathfrak{B}(C^1, d)$. Then, the density of the observational distribution induced by $\mathcal{M}$ is
$$\mathbb{P}^{\mathcal{M}}(Y = y \mid a) = \int_{E(y,a)} \frac{1}{\|\nabla_{u_Y} f_Y(a, u_Y)\|_2} \, \mathrm{d}\mathcal{H}^{d-1}(u_Y),$$
where $E(y,a)$ is the level set (preimage) of $y$, i.e., $E(y,a) = \{u_Y \in [0, 1]^d: f_Y(a, u_Y) = y\}$, and $\mathcal{H}^{d-1}(u_Y)$ is the Hausdorff measure (see the appendix for the definition).
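As a numerical sanity check of a level-set density of this kind, the sketch below assumes uniform latent noise on $[0,1]^2$ and a hypothetical outcome function $f_Y(a, u_Y) = u_1 + u_2$ for a fixed treatment; neither choice comes from the paper. Under the standard coarea formula, the density at $y$ is the integral of $1/\|\nabla_{u_Y} f_Y\|_2$ over the level set, which here reduces to the segment length $\sqrt{2}\,\min(y, 2-y)$ divided by the constant gradient norm $\sqrt{2}$. The code compares this closed form with a Monte Carlo estimate.

```python
import math
import random

def f_Y(u1, u2):
    # Hypothetical outcome function for a fixed treatment (illustrative assumption).
    return u1 + u2

def density_via_level_set(y):
    """Density of Y = U1 + U2 from the level-set (coarea) integral."""
    if not 0.0 <= y <= 2.0:
        return 0.0
    # The level set {u in [0,1]^2 : u1 + u2 = y} is a line segment.
    segment_length = math.sqrt(2.0) * min(y, 2.0 - y)
    # The gradient of f_Y is (1, 1) everywhere, so its norm is sqrt(2).
    grad_norm = math.sqrt(2.0)
    return segment_length / grad_norm

def density_via_monte_carlo(y, n=200_000, half_width=0.02, seed=0):
    """Estimate the density at y as the fraction of samples in a small window."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        if abs(f_Y(rng.random(), rng.random()) - y) < half_width:
            hits += 1
    return hits / (n * 2.0 * half_width)

if __name__ == "__main__":
    for y in (0.5, 1.0, 1.5):
        print(y, density_via_level_set(y), density_via_monte_carlo(y))
```

The two estimates should agree up to Monte Carlo noise and the small bias introduced by the window width near the kink at $y = 1$.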

Figures (16)

Figure 1: Pearl's ladder of causation [halpern2005causes, pearl2009causality, bareinboim2022pearl], comparing observational, interventional, and counterfactual queries corresponding to the SCM $\mathcal{M}$ with two observed variables, i.e., binary treatment $A \in \{0, 1\}$ and continuous outcome $Y \in \mathbb{R}$. We also plot three causal diagrams, $\mathcal{G}(\mathcal{M})$, corresponding to each layer of causation, namely, Bayesian network, causal Bayesian network, and parallel worlds network. Queries with gray background can be simplified, i.e., expressed via lower-layer distributions. The estimation of the queries with yellow background requires additional assumptions or distributions from the same layer. In this paper, we focus on partial identification of the expected counterfactual outcome of [un]treated, $\mathbb{E}(Y_a \mid a', y'), a' \neq a$, shown in orange.
Figure 2: Flow chart of identifiability for a counterfactual query.
Figure 3: Inference of observational and interventional distributions (left) and counterfactual distributions (right) for SCMs $\mathcal{M}_1$ and $\mathcal{M}_2$ from Example 1. Left: the observational query $\mathbb{P}^\mathcal{M}(Y = y\mid a)$ coincides with the interventional query $\mathbb{P}^\mathcal{M}(Y_a = y)$ for each of $\mathcal{M}_1$ and $\mathcal{M}_2$. Right: the counterfactual queries can still differ substantially between $\mathcal{M}_1$ and $\mathcal{M}_2$, giving vastly different counterfactual outcome distributions of the untreated, $\mathbb{P}^\mathcal{M}(Y_{1} \mid A'=0, Y'=0)$. Thus, $\mathcal{L}_3$ queries are non-identifiable from $\mathcal{L}_1$ or $\mathcal{L}_2$ information.
Figure 4: "Bending" the bundle of counterfactual level sets $\{E(y, a): y \in [1, 2]\}$ in blue around the factual level set $E(y', a')$ in orange.
Figure 5: Venn diagram of different SCM classes $\mathfrak{B}(C^k, d)$ arranged by partial vs. point identification. See our CSM with different $\kappa$. We further show the relationships between different classes of SCMs $\mathfrak{B}(C^k, d)$ and statements about them, e.g., the lemmas on observational and counterfactual pushforwards. The referenced examples are in the Examples appendix.
...and 11 more figures
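The non-identifiability point made in the Figure 3 caption can be sketched with a deliberately simple pair of SCMs. The two models below are hypothetical and are not the paper's Example 1: both use $U_Y \sim \mathrm{Uniform}[0, 1]$ and agree under $a = 0$, but one flips the noise under $a = 1$. They induce the same observational (and, in the Markovian case, interventional) outcome distributions, yet abduction-action-prediction gives different counterfactuals.

```python
import random

# Two hypothetical Markovian SCMs (illustrative assumptions only).
def f_Y_m1(a, u):
    # Model 1: the outcome equals the latent noise under both treatments.
    return u

def f_Y_m2(a, u):
    # Model 2: identical under a = 0, but the noise is flipped under a = 1.
    return u if a == 0 else 1.0 - u

def counterfactual(f_Y, y_factual, a_target):
    """Counterfactual outcome Y_{a_target} given factual A' = 0 and Y' = y_factual."""
    # Abduction: under a' = 0 both models have Y = U_Y, so u is recovered exactly.
    u = y_factual
    # Action and prediction: set the treatment and reuse the same noise.
    return f_Y(a_target, u)

def sample_outcomes(f_Y, a, n=100_000, seed=0):
    """Draw outcomes Y | a by sampling the latent noise uniformly on [0, 1]."""
    rng = random.Random(seed)
    return [f_Y(a, rng.random()) for _ in range(n)]

def empirical_cdf(samples, t):
    return sum(s <= t for s in samples) / len(samples)

if __name__ == "__main__":
    y1 = sample_outcomes(f_Y_m1, a=1)
    y2 = sample_outcomes(f_Y_m2, a=1)
    for t in (0.25, 0.5, 0.75):
        print(t, empirical_cdf(y1, t), empirical_cdf(y2, t))
    print(counterfactual(f_Y_m1, 0.2, 1), counterfactual(f_Y_m2, 0.2, 1))
```

Since both models fit the observed data equally well, any value between their counterfactual answers is compatible with the data, which is what motivates reporting an ignorance interval rather than a point estimate.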

Theorems & Definitions (25)

Definition 1: Bivariate Markovian SCMs of class $C^k$ with $d$-dimensional latent noise
Example 1: Counterfactual non-identifiability in Markovian SCMs
Lemma 1: Observational distribution as a pushforward with $f_Y$
Lemma 2
Definition 2: Partial identification of ECOU (ECOT) in class $\mathfrak{B}(C^k, d), k \geq 1$
Theorem 1: Non-informative bounds of ECOU (ECOT)
Theorem 2: Informative bounds with our CSM
Lemma 3: BGMs-EQTDs identification gap of CSM($\kappa = 0$)
Example 2: Box-Müller transformation
Example 3: Connected components of the factual level sets
...and 15 more
