Marrying Causal Representation Learning with Dynamical Systems for Science

Dingling Yao, Caroline Muller, Francesco Locatello

TL;DR

This work bridges causal representation learning (CRL) and dynamical systems to enable provable parameter identifiability in real-world time series. By encoding trajectory data into latent, time-invariant parameters and decoding with scalable mechanistic neural solvers, the approach achieves full identifiability when the functional form of the dynamics is known and partial identifiability when it is unknown, the latter through multiview CRL losses. It demonstrates practical benefits on wind and sea-surface temperature data, enabling downstream causal tasks such as out-of-distribution (OOD) classification and average treatment effect (ATE) estimation. The results highlight the synergy between CRL identifiability theory and mechanistic neural networks for scalable, interpretable scientific modeling in climate science.

Abstract

Causal representation learning promises to extend causal models to hidden causal variables from raw entangled measurements. However, most progress has focused on proving identifiability results in different settings, and we are not aware of any successful real-world application. At the same time, the field of dynamical systems benefited from deep learning and scaled to countless applications but does not allow parameter identification. In this paper, we draw a clear connection between the two and their key assumptions, allowing us to apply identifiable methods developed in causal representation learning to dynamical systems. At the same time, we can leverage scalable differentiable solvers developed for differential equations to build models that are both identifiable and practical. Overall, we learn explicitly controllable models that isolate the trajectory-specific parameters for further downstream tasks such as out-of-distribution classification or treatment effect estimation. We experiment with a wind simulator with partially known factors of variation. We also apply the resulting model to real-world climate data and successfully answer downstream causal questions in line with existing literature on climate change.

Paper Structure (22 sections, 4 theorems, 22 equations, 8 figures, 7 tables)

This paper contains 22 sections, 4 theorems, 22 equations, 8 figures, 7 tables.

Key Result

Corollary 3.1

Consider a trajectory $\mathbf{x} \in \mathcal{X}^T$ generated from an ODE $f_{\bm{\theta}}(\mathbf{x}(t))$ satisfying the existence-and-uniqueness and structural-identifiability assumptions, and let $\hat{\bm{\theta}}$ be an estimator minimizing the corollary's objective (the equation is not reproduced here). Then the parameter $\bm{\theta}$ is fully identified (Definition 3.1) by the estimator $\hat{\bm{\theta}}$.
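
The intuition behind this kind of result can be illustrated with a toy example. The sketch below is not the paper's mechanistic neural solver, and its squared reconstruction loss is an assumption, since the corollary's objective is not shown here. It assumes a hypothetical known functional form, dx/dt = -theta * x, simulates a trajectory with a hand-written RK4 solver, and recovers theta by minimizing the reconstruction error. All function names are illustrative.

```python
import math


def rk4_solve(f, x0, theta, dt, steps):
    """Integrate dx/dt = f(x, theta) with classical RK4; returns steps + 1 states."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        k1 = f(x, theta)
        k2 = f(x + 0.5 * dt * k1, theta)
        k3 = f(x + 0.5 * dt * k2, theta)
        k4 = f(x + dt * k3, theta)
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        xs.append(x)
    return xs


def decay(x, theta):
    # Hypothetical known functional form (an assumption): dx/dt = -theta * x
    return -theta * x


def reconstruction_loss(theta_hat, observed, x0, dt):
    """Squared error between the observed trajectory and the one solved with theta_hat."""
    predicted = rk4_solve(decay, x0, theta_hat, dt, len(observed) - 1)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def fit_theta(observed, x0, dt, lo=0.0, hi=5.0, iters=60):
    """Golden-section search for the loss minimizer on [lo, hi].

    Assumes the loss is unimodal on the interval, which holds for this toy ODE.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if reconstruction_loss(c, observed, x0, dt) < reconstruction_loss(d, observed, x0, dt):
            b = d  # minimizer lies in [a, d]
        else:
            a = c  # minimizer lies in [c, b]
    return 0.5 * (a + b)


# Simulate a trajectory with a ground-truth parameter, then recover it.
true_theta = 1.3
observed = rk4_solve(decay, 2.0, true_theta, 0.1, 50)
theta_hat = fit_theta(observed, 2.0, 0.1)
print(theta_hat)
```

In this toy setting, distinct values of theta produce distinct trajectories, so a zero reconstruction loss pins down the parameter. That is the role the structural-identifiability assumption plays in the statement above.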

Figures (8)

  • Figure 1: Model overview with sea surface temperature inputs: Our mechanistic identifier extracts the underlying time-invariant latitude-related parameters $\bm{\theta}$, providing a versatile neural emulator for downstream causal analysis.
  • Figure 2: Wind simulation: $u, v$ components [m/s] of simulated air motion over the globe.
  • Figure 3: Prediction accuracy on layer thickness parameter on wind simulation data, evaluated on individual encoding partitions $S_1, S_2, S_3$. Results averaged from three random runs.
  • Figure 4: Left: Underlying causal model for SST-V2 data, $\bm{\theta}$: covariates (latitude-related parameters of interest), $\mathbf{X}$: outcome (zonal average temperature), $\mathbf{T}$: treatment (tropical $\mathbf{T}=0$ or polar $\mathbf{T}=1$). Right: Comparison of the ATE change ratio between identified and non-identified parameters, computed as $(\mathrm{ATE}(\text{year}) - \mathrm{ATE}(1990))/\mathrm{ATE}(1990)$, averaged over three runs.
  • Figure 5: Example of wind simulation. Left: longitudinal wind velocity ($u$) [m/s]. Middle: latitudinal wind velocity ($v$) [m/s]. Right: relative vorticity ($vor$) [1/s].
  • ...and 3 more figures
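
The ATE change ratio in the Figure 4 caption is a relative change with respect to the 1990 baseline, read here as (ATE(year) - ATE(1990)) / ATE(1990). A minimal sketch of that arithmetic follows. The function name is hypothetical and the ATE values are made-up placeholders, not results from the paper.

```python
def ate_change_ratio(ate_by_year, base_year=1990):
    """Relative change of each year's ATE with respect to the base year."""
    base = ate_by_year[base_year]
    if base == 0:
        raise ValueError("base-year ATE is zero; the change ratio is undefined")
    return {year: (ate - base) / base for year, ate in ate_by_year.items()}


# Placeholder ATE estimates (illustrative only, not taken from the paper).
example_ate = {1990: 2.0, 2000: 2.5, 2010: 3.0}
ratios = ate_change_ratio(example_ate)
print(ratios)
```

A ratio of 0 at the base year is expected by construction. Positive values indicate that the estimated effect has grown relative to 1990.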

Theorems & Definitions (9)

  • Remark 2.1
  • Definition 3.1: Full identifiability
  • Definition 3.2: ODE solver
  • Corollary 3.1: Full identifiability with known functional form
  • Remark 3.1
  • Definition 3.3: Partial identifiability
  • Corollary 3.2: Identifiability without known functional form
  • Corollary B.1: Full identifiability with known functional form
  • Corollary B.1: Identifiability without known functional form