Towards Identifiability of Interventional Stochastic Differential Equations

Aaron Zweig, Zaikang Lin, Elham Azizi, David Knowles

TL;DR

This work addresses identifiability of interventional stochastic differential equations from stationary distributions, a setting common when trajectory data are scarce. It provides provable bounds for parameter recovery in the linear case (identifiable with exactly $r$ interventions and not with fewer) and extends to nonlinear drift in the small-noise limit, where identifiability of $A$ and $B^T$ can be achieved without knowing the exact activation function, given $r+1$ interventions. The authors also show that allowing learnable activations maintains identifiability while increasing expressiveness, enabling improved recovery and gene regulatory network inference in simulated and semi-synthetic genomics data. This work offers a principled foundation for causal inference in dynamical systems using stationary data, with potential impact on biology and systems modeling where longitudinal measurements are limited but interventional data are available.

Abstract

We study the identifiability of stochastic differential equations (SDEs) under multiple interventions. Our results give the first provable bounds for unique recovery of SDE parameters given samples from their stationary distributions. We give tight bounds on the number of necessary interventions for linear SDEs, and upper bounds for nonlinear SDEs in the small-noise regime. We experimentally validate the recovery of true parameters in synthetic data and, motivated by our theoretical results, demonstrate the advantage of parameterizations with learnable activation functions in application to gene regulatory dynamics.

Paper Structure

This paper contains 30 sections, 13 theorems, 66 equations, 3 figures, 4 tables.

Key Result

Theorem 2.1

Consider the SDE

$$dX_t = (L X_t + c)\,dt + Q\,dW_t.$$

Assume $L$ is Hurwitz, i.e., all its eigenvalues have strictly negative real parts, and $Q$ is full rank. Then the unique stationary distribution is $\mathcal{N}(-L^{-1}c, \omega)$, where $\omega$ is the unique solution to the Lyapunov equation

$$L\omega + \omega L^T + QQ^T = 0.$$
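To make the statement of Theorem 2.1 concrete, the following is a minimal numerical sketch of how the stationary mean and covariance could be computed. The form of the SDE, $dX_t = (LX_t + c)\,dt + Q\,dW_t$, and the resulting Lyapunov equation $L\omega + \omega L^T + QQ^T = 0$ are assumptions of this sketch, chosen to match the stated mean $-L^{-1}c$. The function name `stationary_gaussian` and the example matrices are hypothetical and do not come from the paper.

```python
import numpy as np


def stationary_gaussian(L, c, Q):
    """Return (mean, omega) of the stationary Gaussian of an assumed linear SDE.

    Assumed model (not confirmed by the source): dX = (L X + c) dt + Q dW.
    """
    L = np.asarray(L, dtype=float)
    c = np.asarray(c, dtype=float)
    Q = np.asarray(Q, dtype=float)
    d = L.shape[0]

    # Hurwitz check: every eigenvalue must have strictly negative real part.
    if np.max(np.linalg.eigvals(L).real) >= 0:
        raise ValueError("L is not Hurwitz; no stationary distribution.")

    # Stationary mean solves L m + c = 0.
    mean = -np.linalg.solve(L, c)

    # Solve L w + w L^T = -Q Q^T by vectorizing (row-major flattening):
    # vec(L w) = kron(L, I) vec(w) and vec(w L^T) = kron(I, L) vec(w).
    eye = np.eye(d)
    K = np.kron(L, eye) + np.kron(eye, L)
    rhs = -(Q @ Q.T).reshape(-1)
    omega = np.linalg.solve(K, rhs).reshape(d, d)

    # Symmetrize to remove round-off asymmetry.
    omega = 0.5 * (omega + omega.T)
    return mean, omega


# Hypothetical example: an upper-triangular Hurwitz drift matrix.
L = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
c = np.array([1.0, -1.0])
Q = np.eye(2)

mean, omega = stationary_gaussian(L, c, Q)

# Residual of the assumed Lyapunov equation; should be numerically zero.
residual = L @ omega + omega @ L.T + Q @ Q.T
print("mean:", mean)
print("omega:\n", omega)
print("max |residual|:", np.abs(residual).max())
```

In this sketch, changing `c` moves the stationary mean but leaves `omega` untouched, since `omega` depends only on `L` and `Q`. The Kronecker-product solve is chosen for transparency and is only practical for small dimensions; a dedicated Lyapunov solver would be preferable at scale.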

Figures (3)

  • Figure 1: Contour plots of the stationary distribution of the SDE under different activations and interventions. Activation contractivity enforces a single mode, but the linear distribution is Gaussian with fixed covariance across interventions, while the nonlinear distribution can be more expressive.
  • Figure 2: Normalized Frobenius error of learned drift against true drift in linear SDEs with $k$ independent Gaussian interventions. Error bars are standard deviation over 5 independent runs.
  • Figure 3: Gene regulatory network recovery on three tested SDE models for 5 independent runs.

Theorems & Definitions (22)

  • Theorem 2.1 (Särkkä and Solin, 2019)
  • Proposition 4.1
  • Theorem 4.4
  • Proposition 4.6
  • Theorem 4.7
  • Theorem 4.8
  • Theorem A.1
  • proof
  • Theorem A.2
  • proof
  • ...and 12 more