A Fixed-Point Approach for Causal Generative Modeling

Meyer Scetbon; Joel Jennings; Agrin Hilmkil; Cheng Zhang; Chao Ma

TL;DR

This work introduces a fixed-point causal modeling framework that removes the reliance on DAGs by formulating SCMs as fixed-point equations on causally ordered variables. It proposes FiP, a two-stage pipeline that first learns a zero-shot topological ordering via amortized leaf-prediction, then learns a fixed-point SCM on the ordered variables using a transformer-based architecture designed to preserve causal structure. The authors establish equivalence and identifiability results for FP-SCMs under additive-noise and monotone variants, and demonstrate strong performance on causal discovery and counterfactual tasks, particularly on out-of-distribution datasets. The combined approach offers a scalable, DAG-free pathway for causal learning and generation, with implications for robust cross-domain causal understanding and potential foundation-model-like extensions.

Abstract

We propose a novel formalism for describing Structural Causal Models (SCMs) as fixed-point problems on causally ordered variables, eliminating the need for Directed Acyclic Graphs (DAGs), and establish the weakest known conditions for their unique recovery given the topological ordering (TO). Based on this, we design a two-stage causal generative model that first infers in a zero-shot manner a valid TO from observations, and then learns the generative SCM on the ordered variables. To infer TOs, we propose to amortize the learning of TOs on synthetically generated datasets by sequentially predicting the leaves of graphs seen during training. To learn SCMs, we design a transformer-based architecture that exploits a new attention mechanism enabling the modeling of causal structures, and show that this parameterization is consistent with our formalism. Finally, we conduct an extensive evaluation of each method individually, and show that when combined, our model outperforms various baselines on generated out-of-distribution problems. The code is available on GitHub: https://github.com/microsoft/causica/tree/main/research_experiments/fip
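
The idea of recovering a TO by sequentially predicting leaves can be illustrated with a minimal sketch. It is not the authors' implementation: the function `leaves_from_adjacency` is a hypothetical stand-in for the learned, amortized leaf predictor and simply reads the leaves off a known adjacency matrix, whereas the paper's model predicts them from observations. Removing all current leaves in one step is also an assumption made here for brevity.

```python
import numpy as np


def leaves_from_adjacency(adj, active):
    """Hypothetical oracle standing in for a learned leaf predictor.

    adj[i, j] != 0 means there is an edge i -> j. A node is a leaf of the
    subgraph induced by `active` if it has no child among the active nodes.
    """
    sub = adj[np.ix_(active, active)]
    return [active[i] for i in range(len(active)) if not sub[i].any()]


def topological_order_by_leaf_peeling(adj):
    """Build a topological ordering by repeatedly removing predicted leaves."""
    active = list(range(adj.shape[0]))
    reversed_order = []
    while active:
        leaves = leaves_from_adjacency(adj, active)
        if not leaves:
            raise ValueError("no leaf found: the graph contains a cycle")
        # Leaves are removed first, so they end up last in the final ordering.
        reversed_order.extend(leaves)
        active = [v for v in active if v not in leaves]
    return reversed_order[::-1]


# Toy graph (an assumption for illustration): 0 -> 1, 0 -> 2, 1 -> 2, 3 -> 1.
adj = np.zeros((4, 4), dtype=int)
for parent, child in [(0, 1), (0, 2), (1, 2), (3, 1)]:
    adj[parent, child] = 1

order = topological_order_by_leaf_peeling(adj)
print(order)
```

In the resulting ordering every parent precedes its children, which is the only property the second stage needs from the first.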

Paper Structure (46 sections, 29 theorems, 104 equations, 3 figures, 25 tables, 5 algorithms)

Introduction
Related Work
Fixed-Point Formulation of SCMs
Standard SCMs
Fixed-Point SCMs Without DAGs
Partial Recovery of Fixed-Point SCMs
Weak Partial Recovery of Fixed-Points SCMs
Amortized Learning of TO
Fixed-Point SCM Learning
Proposed Architecture
Training and Generation
Experiments
Evaluation of $\mathcal{M}$
Evaluation of $\mathcal{T}_{\text{ANM}}$
Full Pipeline Benchmarking
...and 31 more sections

Key Result

Proposition 2.4

Let $\mathcal{S}_{\text{fp}}(P, \mathbb{P}_{\bm{N}}, H)$ be a fixed-point SCM as defined in Definition 2.3. Then the fixed-point problem (eq:dfp_scm) on $\gamma\in \Pi_{2,\mathbb{P}_{\bm{N}}}$ admits a unique solution.
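
A sample-level analogue of this uniqueness statement can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's construction: it assumes the variables are already in causal order, and it uses a hypothetical map `H(x, n) = tanh(L x) + n` with `L` strictly lower-triangular, so that the i-th output depends only on earlier coordinates of `x` and on `n[i]`. Under that assumption, iterating `x <- H(x, n)` fixes one more coordinate per step and reaches the same fixed point after `d` steps from any starting point.

```python
import numpy as np


def make_triangular_map(weights):
    """Return a hypothetical H(x, n) whose i-th output uses only x[:i] and n[i]."""
    strict_lower = np.tril(weights, k=-1)  # zero the diagonal and upper triangle

    def H(x, n):
        return np.tanh(strict_lower @ x) + n

    return H


def solve_fixed_point(H, n, x0=None):
    """Iterate x <- H(x, n) for d steps, where d is the number of variables.

    After k steps the first k coordinates no longer change, so d steps
    suffice regardless of the initial point x0.
    """
    d = n.shape[0]
    x = np.zeros(d) if x0 is None else np.asarray(x0, dtype=float)
    for _ in range(d):
        x = H(x, n)
    return x


rng = np.random.default_rng(0)
d = 6
H = make_triangular_map(rng.normal(size=(d, d)))
noise = rng.normal(size=d)

x_from_zero = solve_fixed_point(H, noise)
x_from_random = solve_fixed_point(H, noise, x0=10.0 * rng.normal(size=d))

print(np.allclose(x_from_zero, H(x_from_zero, noise)))  # is it a fixed point?
print(np.allclose(x_from_zero, x_from_random))          # same from both starts?
```

The proposition itself concerns couplings $\gamma$ whose second marginal is $\mathbb{P}_{\bm{N}}$; the code only mirrors the argument for a single noise sample.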

Figures (3)

Figure 1: We compare the performance of three models $\mathcal{M}$ trained on datasets of $n_{\text{train}}=200$ samples in dimension $d_{\text{train}}=20$, but with different values of $d_{\text{max}}$. We measure their $\text{TOS}$ on the aggregation of both out-of-distribution (O.O.D.) metadatasets $\text{LIN}~\textbf{OUT}$ and $\text{RFF}~\textbf{OUT}$ for $d_{\text{test}}\in\{10,20,50\}$, and also show the standard deviations. Note that when $d_{\text{test}}=50$, the test problems are larger than any seen during training.
Figure 2: For each of the four test metadatasets, we plot the $\text{TOS}$ obtained against the dimension $d_{\text{test}}$. For each curve, each point is obtained by averaging over all the test datasets of a given dimension. We also show the standard deviations.
Figure 3: We compare the F1 scores obtained by learning $\mathcal{T}_{\text{ANM}}$ with either the full graph or the TO in various settings. These scores are obtained by comparing $\hat{\mathcal{G}}(0.1)$ with the ground-truth graph, and we show the score averaged over all instances of a given setting, together with the standard deviations.

Theorems & Definitions (69)

Remark 2.1
Definition 2.3: Fixed-Point SCM
Proposition 2.4
Definition 2.5: Causal Graph of Fixed-Point SCM
Remark 2.6
Proposition 2.7
Proposition 2.8
Theorem 2.9
Remark 2.10
Remark 2.12
...and 59 more
