Targeted Reduction of Causal Models
Armin Kekić, Bernhard Schölkopf, Michel Besserve
TL;DR
The paper addresses the challenge of explaining complex, high-dimensional simulations by learning concise, high-level causal explanations for a chosen target variable $Y$. It introduces Targeted Causal Reduction (TCR), which treats a low-level simulator as a causal model and learns a constructive low-to-high-level transformation that yields a small set of high-level causes $\mathbf{Z}$ and a simple mechanism for $Y$. The transformation is optimized with a KL-based consistency loss, $\mathcal{L}_{\mathrm{cons}}$, evaluated over shift interventions. Under linear-Gaussian assumptions, the authors derive analytic identifiability results showing when a TCR with a single cause (and with up to $n_{\max}$ causes) is unique, including explicit formulae for the mappings $\tau$ and $\omega$. They provide a scalable Linear TCR (LTCR) algorithm that uses a Gaussian consistency loss with regularizers to prevent collapse, and they demonstrate the method on toy linear models, a double-well ODE system, and a spring-mass system, where it reveals interpretable, physically meaningful high-level drivers of the target. The work offers a principled, intervention-driven path to interpretability in scientific simulations, enabling domain experts to extract compact, causally meaningful explanations without specifying a full high-level model upfront.
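As a rough illustration of the consistency idea summarized above, the following minimal sketch scores a candidate high-level model by the average KL divergence between the simulated distribution of $Y$ under shift interventions and a Gaussian high-level prediction. It is not the authors' LTCR implementation. The toy simulator, the function names, and the simplification of comparing only the marginal of $Y$ (with the high-level shift taken as `omega @ shift` and no explicit $\tau$) are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, var_p) || N(mu_q, var_q)) for univariate Gaussians."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def simulate_low_level(shift, a, n_samples, rng):
    """Hypothetical toy simulator: X = noise + shift, Y = a . X + small noise."""
    x = rng.normal(size=(n_samples, a.size)) + shift
    return x @ a + 0.1 * rng.normal(size=n_samples)


def consistency_loss(omega, alpha, y_var, a, shifts, rng, n_samples=2000):
    """Average KL between the simulated law of Y and the high-level prediction.

    Assumed high-level model: Y | do(j) ~ N(alpha * j, y_var), with j = omega . shift.
    The simulated law of Y is approximated by a Gaussian fitted to the samples.
    """
    losses = []
    for shift in shifts:
        y = simulate_low_level(shift, a, n_samples, rng)
        mu_high = alpha * (omega @ shift)
        losses.append(gaussian_kl(y.mean(), y.var(), mu_high, y_var))
    return float(np.mean(losses))


rng = np.random.default_rng(0)
a = np.array([1.0, -2.0, 0.5])        # low-level mechanism of the target (assumed)
shifts = rng.normal(size=(20, 3))     # sampled shift interventions
true_var = a @ a + 0.1 ** 2           # variance of Y in the toy simulator

# An intervention map aligned with the true mechanism versus a misaligned one.
loss_aligned = consistency_loss(a, 1.0, true_var, a, shifts, rng)
loss_misaligned = consistency_loss(np.ones(3), 1.0, true_var, a, shifts, rng)
print(f"aligned: {loss_aligned:.4f}, misaligned: {loss_misaligned:.4f}")
```

In this toy setting the aligned map should yield a markedly smaller loss, because the high-level prediction then tracks the shift in the mean of $Y$ for every intervention. A fuller version would presumably also compare the distribution of the high-level causes $\mathbf{Z}$ and learn the maps by gradient descent rather than comparing two fixed candidates.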
Abstract
Why does a phenomenon occur? Addressing this question is central to most scientific inquiries and often relies on simulations of scientific models. As models become more intricate, deciphering the causes behind phenomena in high-dimensional spaces of interconnected variables becomes increasingly challenging. Causal Representation Learning (CRL) offers a promising avenue to uncover interpretable causal patterns within these simulations through an interventional lens. However, developing general CRL frameworks suitable for practical applications remains an open challenge. We introduce Targeted Causal Reduction (TCR), a method for condensing complex intervenable models into a concise set of causal factors that explain a specific target phenomenon. We propose an information-theoretic objective to learn TCR from interventional data of simulations, establish identifiability for continuous variables under shift interventions, and present a practical algorithm for learning TCRs. Its ability to generate interpretable high-level explanations from complex models is demonstrated on toy and mechanical systems, illustrating its potential to assist scientists in the study of complex phenomena in a broad range of disciplines.
