Compositional simulation-based inference for time series

Manuel Gloeckler, Shoji Toyota, Kenji Fukumizu, Jakob H. Macke

TL;DR

This paper tackles Bayesian inference for time-series simulators with intractable likelihoods by exploiting the Markov structure to learn local parameter posteriors from single-step transitions and then composing these local solutions into a global posterior for the entire trajectory. The core method, Factorized Neural Score Estimation (FNSE), trains a local score network on transitions $(\bm\theta, {\bm x}^t: {\bm x}^{t+1})$ and aggregates these scores across time using a diffusion-based sampler, with extensions to FNLE and FNRE for likelihood-based estimation. The approach demonstrates improved simulation efficiency over global, non-factorized SBI baselines across synthetic benchmarks and ecological/epidemiological models, and scales to high-dimensional systems like Kolmogorov flow with around one million data dimensions. Overall, the work shows that leveraging Markovian structure to perform localized inference and compositional aggregation can dramatically reduce simulation costs while maintaining accurate posterior inferences for long time series.

Abstract

Amortized simulation-based inference (SBI) methods train neural networks on simulated data to perform Bayesian inference. While this strategy avoids the need for tractable likelihoods, it often requires a large number of simulations and has been challenging to scale to time series data. Scientific simulators frequently emulate real-world dynamics through thousands of single-state transitions over time. We propose an SBI approach that can exploit such Markovian simulators by locally identifying parameters consistent with individual state transitions. We then compose these local results to obtain a posterior over parameters that align with the entire time series observation. We focus on applying this approach to neural posterior score estimation but also show how it can be applied, e.g., to neural likelihood (ratio) estimation. We demonstrate that our approach is more simulation-efficient than directly estimating the global posterior on several synthetic benchmark tasks and simulators used in ecology and epidemiology. Finally, we validate scalability and simulation efficiency of our approach by applying it to a high-dimensional Kolmogorov flow simulator with around one million data dimensions.
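The composition step can be illustrated on a toy case where every quantity is available in closed form. The sketch below is not the authors' implementation. It assumes a one-dimensional Gaussian random walk $x^{t+1} = x^t + \theta + \epsilon$ with a Gaussian prior on $\theta$, and it uses analytic local posterior scores in place of a trained score network. It also works only at zero diffusion noise and does not reproduce the aggregation across noise levels used by a diffusion sampler. All function and variable names are hypothetical. For a Markov simulator whose proposal over ${\bm x}^t$ does not depend on $\bm\theta$, the posterior over $T$ transitions factorizes as $p(\bm\theta \mid {\bm x}^{0:T}) \propto p(\bm\theta)^{1-T} \prod_t p(\bm\theta \mid {\bm x}^t, {\bm x}^{t+1})$, so the global score is the sum of the local scores plus $(1-T)$ times the prior score.

```python
import numpy as np


def prior_score(theta, prior_std):
    """Score (gradient of the log density) of the prior N(0, prior_std^2)."""
    return -theta / prior_std**2


def local_posterior_score(theta, x_t, x_next, prior_std, noise_std):
    """Analytic score of p(theta | x_t, x_next) for one transition.

    Assumed toy model: x_next = x_t + theta + noise_std * eps.
    In FNSE this role would be played by a learned score network.
    """
    precision = 1.0 / prior_std**2 + 1.0 / noise_std**2
    mean = ((x_next - x_t) / noise_std**2) / precision
    return -(theta - mean) * precision


def composed_score(theta, xs, prior_std, noise_std):
    """Sum the local scores and correct for the prior counted T times."""
    n_transitions = len(xs) - 1
    local_sum = sum(
        local_posterior_score(theta, xs[t], xs[t + 1], prior_std, noise_std)
        for t in range(n_transitions)
    )
    return local_sum + (1 - n_transitions) * prior_score(theta, prior_std)


def global_posterior_score(theta, xs, prior_std, noise_std):
    """Analytic score of the full posterior p(theta | x_0, ..., x_T)."""
    n_transitions = len(xs) - 1
    precision = 1.0 / prior_std**2 + n_transitions / noise_std**2
    mean = ((xs[-1] - xs[0]) / noise_std**2) / precision
    return -(theta - mean) * precision


# Simulate one trajectory with 20 transitions (hypothetical settings).
rng = np.random.default_rng(0)
true_theta, prior_std, noise_std = 0.5, 1.0, 0.3
steps = true_theta + noise_std * rng.standard_normal(20)
xs = np.concatenate([[0.0], np.cumsum(steps)])

# Compare the composed score with the exact global score on a grid.
theta_grid = np.linspace(-2.0, 2.0, 9)
composed = composed_score(theta_grid, xs, prior_std, noise_std)
exact = global_posterior_score(theta_grid, xs, prior_std, noise_std)
print("max abs difference:", np.max(np.abs(composed - exact)))
```

In this toy setting the two scores agree up to floating-point error, which is the property that lets local estimates trained on single transitions stand in for a posterior over the whole trajectory.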

Paper Structure

This paper contains 52 sections, 28 equations, 14 figures, 4 tables, 2 algorithms.

Figures (14)

  • Figure 1: Illustration of Factorized Neural Score Estimation (FNSE). The goal is to perform parameter inference on a full time series model. Training uses a smaller subset of single-state transitions initialized from an arbitrary proposal $\tilde{p} ({\bm{x}}^t)$, with parameters sampled from a prior distribution. During inference, the time series is divided into single-state transitions, and each transition is evaluated by the neural network to estimate local posterior scores. These local estimates are then aggregated to form a global approximation, which is subsequently used to sample from the overall posterior distribution. Here, $a$ denotes the diffusion time, and ${\bm{\theta}}_a$ is the associated noisy parameter.
  • Figure 2: Benchmarks: We validate our method on a Gaussian Random Walk of varying dimension and length (i.e., number of transitions), also tracking sampling times (a). We assess FNSE score accumulation over Gaussian RW and Periodic SDE tasks using a fixed Euler–Maruyama sampler (b). We compare methods across tasks and transition steps (c). Finally, we examine the effect of the proposal on FNSE trained with 10k simulations from a normal proposal of varying standard deviation (d).
  • Figure 3: Lotka-Volterra and SIR experiments: The approximate posterior (and posterior predictive) of the best-performing FNSE model, trained on 100k transition steps, visualized on subsequences of a fixed observation with its associated true parameter (a, b). We then show the quantitative performance in terms of C2ST and sW$_1$ for each task on ten randomly selected observations; each run is repeated five times (c, d).
  • Figure 4: Kolmogorov flow experiment: A single example observation of length one hundred (top row). We visualize the posterior distribution conditioned on the observation up to $t=10$ (a, top left), along with a quantitative comparison of the mean absolute error (MAE) between posterior predictive samples and prior predictive samples on fifty different observations (bottom left), including the one shown above. The vorticity of two selected predictive samples is visualized on the right. This analysis is then repeated for the entire observation (b).
  • Figure 5: Proposal: A set of trajectories within the phase space of the Lotka-Volterra task for different parameterizations sampled from the prior (constructed using a total of 5k simulation steps) (a). The proposal was constructed by randomly sampling noisy points from the state space trajectories (b). Posterior approximation quality, measured in sliced Wasserstein distance, obtained with the improved proposal compared to our "naively" chosen baseline (c).
  • ...and 9 more figures