The Causal Abstraction Network: Theory and Learning
Gabriele D'Acunto, Paolo Di Lorenzo, Sergio Barbarossa
TL;DR
The work introduces the Causal Abstraction Network (CAN), a network sheaf model for causal knowledge (CK) shared across multiple subjective Gaussian SCMs, in which edge restriction maps are transposes of constructive linear causal abstractions (CLCAs) and edge stalks align with node stalks up to rotation. It provides a category-theoretic formalization, derives algebraic invariants and a Laplacian $L$ to characterize CK flow, and links consistency to the semantic embedding principle (SEP). Learning a CAN is cast as a set of edge-local KL-minimization problems, each solved efficiently by a spectral method operating on Stiefel manifolds, with a search based on transitive composition to prune candidate edges; the approach extends to positive semidefinite covariances. Empirical results on synthetic data show that the spectral method performs competitively with prior causal abstraction (CA) learning techniques and can recover CAN topologies even when global sections are provided, supporting the framework's use in collaborative causal AI and multi-agent CK transfer. The work lays the groundwork for diffusion of causal knowledge across a network of agents and points to future directions in topology adjustment, mixture-model CK, identifiability, and real-world validation.
Abstract
Causal artificial intelligence aims to enhance explainability, trustworthiness, and robustness in AI by leveraging structural causal models (SCMs). In this pursuit, recent advances formalize network sheaves of causal knowledge. Pushing in the same direction, we introduce the causal abstraction network (CAN), a specific instance of such sheaves where (i) SCMs are Gaussian, (ii) restriction maps are transposes of constructive linear causal abstractions (CAs), and (iii) edge stalks correspond -- up to rotation -- to the node stalks of more detailed SCMs. We investigate the theoretical properties of CAN, including algebraic invariants, cohomology, consistency, global sections characterized via the Laplacian kernel, and smoothness. We then tackle the learning of consistent CANs. Our problem formulation separates into edge-specific local Riemannian problems and avoids nonconvex, costly objectives. As a solution, we propose an efficient search procedure that solves the local problems with SPECTRAL, our iterative method with closed-form updates, which is suitable for both positive definite and positive semidefinite covariance matrices. Experiments on synthetic data show competitive performance in the CA learning task and successful recovery of diverse CAN structures.
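
The statement that global sections are characterized via the Laplacian kernel can be illustrated with a minimal sketch of the standard sheaf Laplacian on a two-node graph. This is not the paper's implementation. The helper `sheaf_laplacian`, the stalk dimensions (3 and 2), the random matrix `V`, and the choice of the identity as the detailed node's restriction map (the rotation taken to be trivial) are all assumptions made for illustration.

```python
import numpy as np

def sheaf_laplacian(stalk_dims, edges):
    """Assemble L = delta^T delta for a sheaf on a graph (hypothetical helper).

    stalk_dims: list of node stalk dimensions.
    edges: list of (u, v, F_u, F_v), where F_u and F_v are the restriction
           maps from the stalks of nodes u and v into the edge stalk.
    """
    offsets = np.concatenate(([0], np.cumsum(stalk_dims)))
    rows = []
    for u, v, F_u, F_v in edges:
        # One block row of the coboundary: (delta x)_e = F_u x_u - F_v x_v
        block = np.zeros((F_u.shape[0], offsets[-1]))
        block[:, offsets[u]:offsets[u + 1]] = F_u
        block[:, offsets[v]:offsets[v + 1]] = -F_v
        rows.append(block)
    delta = np.vstack(rows)
    return delta.T @ delta

rng = np.random.default_rng(0)

# Assumption: V has orthonormal columns (a point on the Stiefel manifold),
# and the abstraction acts as x_high = V^T x_low.
V, _ = np.linalg.qr(rng.standard_normal((3, 2)))

# Node 0: detailed SCM (stalk R^3). Node 1: abstract SCM (stalk R^2).
# Assumption: the edge stalk is R^3, node 0 maps in by the identity,
# and node 1 maps in by V, the transpose of the abstraction matrix V^T.
L = sheaf_laplacian([3, 2], [(0, 1, np.eye(3), V)])

# Global sections are the kernel of L; count its dimension.
eigvals = np.linalg.eigvalsh(L)
kernel_dim = int(np.sum(eigvals < 1e-10))

# Any abstract-level vector lifts to a consistent assignment x_low = V x_high.
x_high = rng.standard_normal(2)
section = np.concatenate([V @ x_high, x_high])
residual = np.linalg.norm(L @ section)
```

In this toy setting the kernel has the dimension of the abstract stalk, and `residual` vanishes up to floating-point error, because the two nodes agree on the edge exactly when the detailed-level vector is the lift of the abstract-level one.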
