The Causal Abstraction Network: Theory and Learning

Gabriele D'Acunto, Paolo Di Lorenzo, Sergio Barbarossa

TL;DR

The work introduces the Causal Abstraction Network (CAN), a network sheaf model of causal knowledge (CK) shared across multiple subjective Gaussian SCMs, in which edge restriction maps are transposes of constructive linear causal abstractions (CLCAs) and edge stalks align with node stalks up to rotation. It provides a category-theoretic formalization, derives algebraic invariants and a Laplacian $L$ that characterizes CK flow, and links consistency to the semantic embedding principle (SEP). Learning a CAN is cast as a set of edge-local KL-minimization problems, solved efficiently by a spectral method operating on Stiefel manifolds, with a search based on transitive composition to prune candidate edges; the approach extends to positive semidefinite covariances. Empirical results on synthetic data show that the spectral method is competitive with prior CA learning techniques and can recover CAN topologies even when global sections are provided, supporting the framework's use for collaborative causal AI and multi-agent CK transfer. The work lays groundwork for the diffusion of causal knowledge across a network of agents and points to future directions in topology adjustment, mixture-model CK, identifiability, and real-world validation.
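To make the edge-local objective concrete, here is a minimal sketch of a KL divergence between zero-mean Gaussians evaluated at a candidate linear abstraction. It is an illustration under stated assumptions, not the paper's SPECTRAL method: the names `random_stiefel` and `abstraction_kl` are hypothetical, the abstraction is assumed to act as $V^\top$ on low-level samples with $V^\top V = I_h$, and the direction of the KL divergence is a guess.

```python
import numpy as np


def gaussian_kl_zero_mean(cov_p, cov_q):
    """KL( N(0, cov_p) || N(0, cov_q) ) for positive definite covariances."""
    k = cov_p.shape[0]
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))
    return 0.5 * (trace_term - k + logdet_q - logdet_p)


def random_stiefel(ell, h, rng):
    """Random ell x h matrix with orthonormal columns (a point on the Stiefel manifold)."""
    q, _ = np.linalg.qr(rng.standard_normal((ell, h)))
    return q


def abstraction_kl(V, cov_low, cov_high):
    """Assumed edge-local objective: push the low-level covariance through V^T
    and compare it with the high-level covariance."""
    return gaussian_kl_zero_mean(V.T @ cov_low @ V, cov_high)


rng = np.random.default_rng(0)
ell, h = 6, 3

# Low-level covariance: a random positive definite matrix.
A = rng.standard_normal((ell, ell))
cov_low = A @ A.T + ell * np.eye(ell)

# High-level covariance generated by a ground-truth abstraction V_true.
V_true = random_stiefel(ell, h, rng)
cov_high = V_true.T @ cov_low @ V_true

# An unrelated candidate abstraction for comparison.
V_other = random_stiefel(ell, h, rng)

kl_true = abstraction_kl(V_true, cov_low, cov_high)
kl_other = abstraction_kl(V_other, cov_low, cov_high)
print(f"KL at true abstraction:  {kl_true:.3e}")
print(f"KL at other abstraction: {kl_other:.3e}")
```

The objective vanishes at the abstraction that generated the high-level covariance and is positive at a generic alternative, which is the property a local solver would exploit.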

Abstract

Causal artificial intelligence aims to enhance explainability, trustworthiness, and robustness in AI by leveraging structural causal models (SCMs). In this pursuit, recent advances formalize network sheaves of causal knowledge. Pushing in the same direction, we introduce the causal abstraction network (CAN), a specific instance of such sheaves where (i) SCMs are Gaussian, (ii) restriction maps are transposes of constructive linear causal abstractions (CAs), and (iii) edge stalks correspond -- up to rotation -- to the node stalks of more detailed SCMs. We investigate the theoretical properties of CAN, including algebraic invariants, cohomology, consistency, global sections characterized via the Laplacian kernel, and smoothness. We then tackle the learning of consistent CANs. Our problem formulation separates into edge-specific local Riemannian problems and avoids nonconvex, costly objectives. We propose an efficient search procedure as a solution, solving the local problems with SPECTRAL, our iterative method with closed-form updates and suitable for positive definite and semidefinite covariance matrices. Experiments on synthetic data show competitive performance in the CA learning task, and successful recovery of diverse CAN structures.
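The characterization of global sections via the Laplacian kernel can be illustrated with a generic network sheaf. The sketch below is a toy construction, not the paper's code: the helper names are hypothetical, the rotation on the edge stalks is taken to be the identity, and each edge stalk is assumed to have the dimension of the more detailed node, with the coarser node mapped in by a matrix $V$ with orthonormal columns.

```python
import numpy as np


def random_stiefel(n, k, rng):
    """Random n x k matrix with orthonormal columns."""
    q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    return q


def sheaf_laplacian(node_dims, edges):
    """Build L = delta^T delta for a network sheaf.

    edges: list of (u, v, F_u, F_v), where F_u maps the stalk of node u
    into the edge stalk and F_v does the same for node v.
    """
    offsets = np.concatenate(([0], np.cumsum(node_dims)))
    blocks = []
    for u, v, F_u, F_v in edges:
        row = np.zeros((F_u.shape[0], offsets[-1]))
        row[:, offsets[u]:offsets[u + 1]] = F_u
        row[:, offsets[v]:offsets[v + 1]] = -F_v
        blocks.append(row)
    delta = np.vstack(blocks)  # coboundary operator
    return delta.T @ delta


def global_sections(L, tol=1e-10):
    """Orthonormal basis of ker(L), i.e. the space of global sections."""
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, eigvals < tol]


rng = np.random.default_rng(0)
node_dims = [4, 3, 2]          # chain: fine -> medium -> coarse
V12 = random_stiefel(4, 3, rng)
V23 = random_stiefel(3, 2, rng)

edges = [
    (0, 1, np.eye(4), V12),    # edge stalk has the dimension of node 0
    (1, 2, np.eye(3), V23),    # edge stalk has the dimension of node 1
]

L = sheaf_laplacian(node_dims, edges)
basis = global_sections(L)
print("dimension of the space of global sections:", basis.shape[1])
```

In this toy chain every global section is determined by its value on the coarsest node: the kernel has the dimension of that node's stalk, and the finer components are obtained by applying `V23` and then `V12`.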

Paper Structure

This paper contains 12 sections, 10 theorems, 53 equations, 4 figures.

Key Result

Theorem 2.4

Let $\chi^{\ell} \sim N(\boldsymbol{0}_\ell, \boldsymbol{\Sigma}^{\ell})$, $\chi^{h} \sim N(\boldsymbol{0}_h, \boldsymbol{\Sigma}^{h})$, where $\boldsymbol{\Sigma}^{\ell} \in \mathcal{S}_{++}^\ell$ and $\boldsymbol{\Sigma}^{h} \in \mathc

Figures (4)

  • Figure 1: (a) Causal abstraction network $\mathbb{G}$ made of $4$ nodes and $3$ undirected edges given in black. Blue arcs follow the network orientation, corresponding to the embedding direction, that is, the action of the functor $E$. Purple arcs follow the abstraction direction, that is, the action of the functor $A$. (b) Network sheaf representation corresponding to $\mathbb{G}$. Each edge (co)stalk coincides--up to rotation--with the node (co)stalk of the coarser causal model.
  • Figure 2: Synthetic results for the solution of the local problem across all settings $(\ell, h)$ from d2025causal.
  • Figure 3: CANs used in the empirical evaluation, shown with. Nodes $1$ and $10$ are the finest and coarsest SCMs, respectively. Edges in the transitive reduction are drawn in light blue, whereas edges appearing only in the transitive closure are drawn in dashed dark blue. Panels correspond to (a) chain, (b) star, and (c) tree transitive reductions.
  • Figure 4: False positive (left) and true positive (right) rates for the proposed method on the three CANs in Figure 3.

Theorems & Definitions (33)

  • Definition 2.1: $\boldsymbol{\alpha}$-abstraction [rischel2020category]
  • Definition 2.2: Semantic embedding principle [d2025causal]
  • Definition 2.3: Semantic embedding principle, CLCA [d2025causal]
  • Theorem 2.4: From [d2025causal]
  • Definition 2.5: Network sheaf of CK [d2025relativity]
  • Definition 2.6: Network cosheaf of CK [d2025relativity]
  • Definition 2.7: Global section
  • Lemma 3.1
  • proof
  • Definition 3.2: Abstraction functor
  • ...and 23 more