Towards Identifiability of Interventional Stochastic Differential Equations
Aaron Zweig, Zaikang Lin, Elham Azizi, David Knowles
TL;DR
This work addresses identifiability of interventional stochastic differential equations from stationary distributions, a setting common when trajectory data are scarce. It provides provable bounds for parameter recovery in the linear case (identifiable with exactly $r$ interventions and not with fewer) and extends to nonlinear drift in the small-noise limit, where identifiability of $A$ and $B^T$ can be achieved without knowing the exact activation function, given $r+1$ interventions. The authors also show that allowing learnable activations maintains identifiability while increasing expressiveness, enabling improved recovery and gene regulatory network inference in simulated and semi-synthetic genomics data. This work offers a principled foundation for causal inference in dynamical systems using stationary data, with potential impact on biology and systems modeling where longitudinal measurements are limited but interventional data are available.
Abstract
We study identifiability of stochastic differential equations (SDEs) under multiple interventions. Our results give the first provable bounds for unique recovery of SDE parameters given samples from their stationary distributions. We give tight bounds on the number of necessary interventions for linear SDEs, and upper bounds for nonlinear SDEs in the small-noise regime. We experimentally validate the recovery of true parameters in synthetic data and, motivated by our theoretical results, demonstrate the advantage of parameterizations with learnable activation functions in application to gene regulatory dynamics.
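To make the role of interventions concrete, the following is a minimal numerical sketch and not the authors' parameterization. It assumes a linear SDE of the form $dX = (AX + c)\,dt + \sqrt{2}\,dW$, whose stationary covariance $\Sigma$ solves the Lyapunov equation $A\Sigma + \Sigma A^T + 2I = 0$ and whose stationary mean is $-A^{-1}c$. The names `stationary_cov`, `A1`, `A2`, and the constant drift shift `c` used as a stand-in for an intervention are all hypothetical choices made for illustration.

```python
import numpy as np

def stationary_cov(A, D):
    """Solve A S + S A^T + D = 0 for S (A is assumed to be Hurwitz)."""
    n = A.shape[0]
    I = np.eye(n)
    # Column-major vectorization: vec(A S) = (I kron A) vec(S),
    # and vec(S A^T) = (A kron I) vec(S).
    K = np.kron(I, A) + np.kron(A, I)
    s = np.linalg.solve(K, -D.reshape(-1, order="F"))
    return s.reshape(n, n, order="F")

# Two different drift matrices: A2 adds a skew-symmetric rotation to A1.
A1 = -np.eye(2)
A2 = A1 + np.array([[0.0, 1.0], [-1.0, 0.0]])
D = 2.0 * np.eye(2)  # diffusion term for noise scale sqrt(2)

# Without interventions, both systems share the same stationary covariance.
S1 = stationary_cov(A1, D)
S2 = stationary_cov(A2, D)
same_observational = np.allclose(S1, S2)

# Assumed intervention: a known constant shift c added to the drift.
# The stationary mean solves A m + c = 0, so it depends on A.
c = np.array([1.0, 0.0])
m1 = -np.linalg.solve(A1, c)
m2 = -np.linalg.solve(A2, c)
distinguished = not np.allclose(m1, m2)

print(same_observational, distinguished)
```

In this toy setting the two drift matrices are indistinguishable from the unperturbed stationary distribution alone, while the shifted system yields different stationary means. This is only an illustration of why interventional data can help; the intervention model and the counting results in the paper may differ from this sketch.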
