DeCaFlow: A deconfounding causal generative model

Alejandro Almodóvar; Adrián Javaloy; Juan Parras; Santiago Zazo; Isabel Valera

TL;DR

DeCaFlow is a scalable, end-to-end causal generative framework that extends causal normalizing flows to handle hidden confounding via a variational encoder and proxy variables. With a single training run per dataset, it provides correct estimates for a broad class of interventional and counterfactual queries that are identifiable through do-calculus or proximal identifiability. The approach demonstrates strong empirical performance on semi-synthetic Sachs and Ecoli70 graphs and on a real-world fairness use case, outperforming several baselines and matching oracle-like performance on identifiable queries. By integrating structural causal constraints, proxy informativeness, and a principled training objective, DeCaFlow offers a practical tool for causal inference in complex, confounded settings with continuous variables. The work also discusses limitations related to proxy quality and graph correctness, and outlines directions for future extensions to time-varying treatments and broader applicability.

Abstract

We introduce DeCaFlow, a deconfounding causal generative model. Trained once per dataset using only observational data and the underlying causal graph, DeCaFlow enables accurate causal inference on continuous variables in the presence of hidden confounders. Specifically, we extend previous results on causal estimation under hidden confounding to show that a single instance of DeCaFlow provides correct estimates for all causal queries identifiable with do-calculus, leveraging proxy variables to adjust for the causal effects when do-calculus alone is insufficient. Moreover, we show that counterfactual queries are identifiable as long as their interventional counterparts are identifiable, and thus are also correctly estimated by DeCaFlow. Our empirical results on diverse settings (including the Ecoli70 dataset, with 3 independent hidden confounders, tens of observed variables, and hundreds of causal queries) show that DeCaFlow outperforms existing approaches, while demonstrating its out-of-the-box applicability to any given causal graph. An implementation is available at https://github.com/aalmodovares/DeCaFlow
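
To make the problem setting concrete, the following minimal sketch shows why hidden confounding biases estimates drawn from observational data alone. It is not DeCaFlow and does not use its code: the linear structural causal model, the coefficients (2.0 and 3.0), and the helper names `simulate`, `slope`, and `mean` are all assumptions chosen for illustration.

```python
import random


def simulate(n=20000, seed=0, do_t=None):
    """Sample (T, Y) from a toy linear SCM with a hidden confounder Z.

    Hypothetical model (not from the paper):
        Z ~ N(0, 1)              hidden confounder, never returned
        T = Z + N(0, 1)          unless intervened on via do_t
        Y = 2*T + 3*Z + N(0, 1)  true causal effect of T on Y is 2
    """
    rng = random.Random(seed)
    ts, ys = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        if do_t is None:
            t = z + rng.gauss(0.0, 1.0)  # observational regime
        else:
            t = do_t  # intervention do(T = do_t) cuts the Z -> T edge
        y = 2.0 * t + 3.0 * z + rng.gauss(0.0, 1.0)
        ts.append(t)
        ys.append(y)
    return ts, ys


def mean(values):
    return sum(values) / len(values)


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


# Naive estimate from observational data: absorbs the path through Z.
t_obs, y_obs = simulate()
naive_effect = slope(t_obs, y_obs)

# Ground truth from simulated interventions: E[Y | do(T=1)] - E[Y | do(T=0)].
_, y_do1 = simulate(do_t=1.0)
_, y_do0 = simulate(do_t=0.0)
causal_effect = mean(y_do1) - mean(y_do0)

print(f"naive observational slope: {naive_effect:.2f}")
print(f"interventional effect:     {causal_effect:.2f}")
```

In this toy model the naive slope concentrates around 3.5 rather than the true effect of 2, because the regression cannot separate the direct effect of T from the contribution of the unobserved Z. Recovering interventional quantities like the second one from observational data and a causal graph, without access to Z, is the kind of query the abstract describes.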
