Neural Structure Learning with Stochastic Differential Equations

Benjie Wang; Joel Jennings; Wenbo Gong

TL;DR

SCOTCH addresses structure learning for continuous-time stochastic processes observed as irregularly sampled time series. It models a latent state Z_t as an Itô diffusion, dZ_t = f_θ(Z_t, G) dt + g_θ(Z_t, G) dW_t, observed with noise as X_t = Z_t + ε_t, and uses variational inference to infer a posterior over graphs G. The graph is read off from the derivatives of the drift f_θ and the diffusion g_θ with respect to each variable, and the diffusion is constrained to be diagonal for identifiability. The authors establish sufficient conditions for structural identifiability and prove that the variational framework is consistent in the infinite-data limit. Empirically, SCOTCH improves structure learning over multiple baselines on synthetic and real-world datasets with irregular sampling. Because the model is continuous in time, it can learn from and predict observations at arbitrary time points, addressing a gap left by methods that assume discrete-time dynamics or regular sampling.
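To make the generative model in the TL;DR concrete, here is a minimal NumPy sketch that simulates a graph-dependent latent diffusion with Euler-Maruyama steps and records noisy observations at irregular times. It is an illustration only, not the authors' implementation: the function name `simulate_latent_sde`, the linear-decay-plus-tanh drift, the constant diagonal diffusion, and all default constants are assumptions chosen for simplicity, whereas SCOTCH parameterizes f_θ and g_θ with neural networks.

```python
import numpy as np

def simulate_latent_sde(G, obs_times, z0, drift_scale=-0.5, diff_scale=0.1,
                        obs_noise=0.05, dt=1e-3, seed=0):
    """Simulate dZ = f(Z, G) dt + g(Z, G) dW and return X = Z + eps at obs_times.

    Hypothetical toy model (not from the paper):
      f_j(z, G) = drift_scale * z_j + sum_i G[i, j] * tanh(z_i)
      g(z, G)   = diff_scale * I   (diagonal, state-independent)
    G[i, j] = 1 means variable i influences variable j.
    """
    rng = np.random.default_rng(seed)
    G = np.asarray(G, dtype=float)
    z = np.asarray(z0, dtype=float).copy()
    t = 0.0
    observations = []
    for t_obs in obs_times:
        # Integrate the latent state from t to t_obs with steps no larger than dt.
        n_steps = max(1, int(np.ceil((t_obs - t) / dt)))
        h = (t_obs - t) / n_steps
        for _ in range(n_steps):
            drift = drift_scale * z + G.T @ np.tanh(z)
            # Diagonal diffusion: each variable gets its own independent noise.
            noise = diff_scale * np.sqrt(h) * rng.standard_normal(z.shape)
            z = z + drift * h + noise
        t = t_obs
        # Observation model X_t = Z_t + eps_t with Gaussian eps_t.
        observations.append(z + obs_noise * rng.standard_normal(z.shape))
    return np.array(observations)

# Chain graph 0 -> 1 -> 2, observed at 20 irregular times in [0, 1].
G_chain = np.array([[0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 0]])
times = np.sort(np.random.default_rng(1).uniform(0.0, 1.0, size=20))
X = simulate_latent_sde(G_chain, times, z0=[1.0, 0.0, 0.0])
```

Nothing in the integration loop requires the observation times to lie on a regular grid, which is the practical point of a continuous-time formulation: the latent state is advanced to whatever time the next observation happens to fall on.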

Abstract

Discovering the underlying relationships among variables from temporal observations has been a longstanding challenge in numerous scientific disciplines, including biology, finance, and climate science. The dynamics of such systems are often best described using continuous-time stochastic processes. Unfortunately, most existing structure learning approaches assume that the underlying process evolves in discrete time and/or that observations occur at regular time intervals. These mismatched assumptions can often lead to incorrect learned structures and models. In this work, we introduce a novel structure learning method, SCOTCH, which combines neural stochastic differential equations (SDEs) with variational inference to infer a posterior distribution over possible structures. This continuous-time approach can naturally handle both learning from and predicting observations at arbitrary time points. Theoretically, we establish sufficient conditions for an SDE and SCOTCH to be structurally identifiable, and prove its consistency in the infinite-data limit. Empirically, we demonstrate that our approach leads to improved structure learning performance on both synthetic and real-world datasets compared to relevant baselines under regular and irregular sampling intervals.

Paper Structure

This paper contains 80 sections, 9 theorems, 50 equations, 6 figures, 6 tables, and 1 algorithm.

Introduction
Preliminaries
Bayesian structure learning
Structural equation models (SEMs)
Itô diffusion
Euler discretization and Euler SEM
SCOTCH: Bayesian Structure Learning for Continuous Time Series
Prior over Graphs
Prior process
Likelihood of time series
Variational Inference
Stochasticity and Continuous-Time Modeling
Stochasticity
Discrete vs Continuous-Time
Theoretical considerations of SCOTCH
...and 65 more sections

Key Result

Theorem 4.1

Given the observational process, let us define another process with $\bar{\bm{X}}_t$, $\bm{G}\neq \bar{\bm{G}}$, $\bar{\bm{f}}_{\bar{G}}$, $\bar{\bm{g}}_{\bar{G}}$ and $\bar{\bm{W}}_t$. Then, under the Global Lipschitz and diagonal diffusion assumptions, and with the same initial ...

Figures (6)

Figure 1: Comparison between NGM and SCOTCH for simple SDE (note vertical axis scale)
Figure 2: Comparison between NGM and SCOTCH for simple SDE (note vertical axis scale)
Figure 3: The AUROC (top left), F1 score (top right), false discovery rate (bottom left) and true positive rate (bottom right) curves of SCOTCH for the Lorenz dataset. The shaded area indicates the $95\%$ confidence intervals. Blue indicates the dataset with missing probability $0.3$ and orange indicates missing probability $0.6$.
Figure 4: The AUROC (top left), F1 score (top right), false discovery rate (bottom left) and true positive rate (bottom right) curves of SCOTCH for the Glycolysis dataset. The shaded area indicates the $95\%$ confidence intervals. Blue indicates the normalized dataset and orange indicates the original dataset.
Figure 5: The AUROC (top left), F1 score (top right), false discovery rate (bottom left) and true positive rate (bottom right) curves of SCOTCH for each DREAM3 sub-dataset. The shaded area indicates the $95\%$ confidence intervals.
...and 1 more figure

Theorems & Definitions (18)

Theorem 4.1: Structure identifiability of the observational process
Theorem 4.2: Structural identifiability with latent formulation
Theorem 4.3: Consistency of variational formulation
Definition 1: Feller process and semi-group
Definition 2: Infinitesimal generator
Theorem A.1: Structure identifiability of the observational process
Lemma A.1: Identifiability of Euler SEM
proof
Lemma A.2: Generator characterises Euler SEM
proof
...and 8 more
