Compositional simulation-based inference for time series
Manuel Gloeckler, Shoji Toyota, Kenji Fukumizu, Jakob H. Macke
TL;DR
This paper tackles Bayesian inference for time-series simulators with intractable likelihoods by exploiting their Markov structure: it learns local parameter posteriors from single-step transitions and then composes these local solutions into a global posterior for the entire trajectory. The core method, Factorized Neural Score Estimation (FNSE), trains a local score network on transitions $(\bm\theta, {\bm x}^{t}, {\bm x}^{t+1})$ and aggregates these scores across time using a diffusion-based sampler; the analogous variants FNLE and FNRE apply the same factorization to neural likelihood and likelihood-ratio estimation. The approach is more simulation-efficient than global, non-factorized SBI baselines on synthetic benchmarks and on ecological and epidemiological models, and it scales to high-dimensional systems such as Kolmogorov flow with around one million data dimensions. Overall, the work shows that localized inference combined with compositional aggregation can reduce simulation cost while maintaining accurate posterior inference for long time series.
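The aggregation step can be made concrete with a short derivation. This is a sketch under two assumptions that are not spelled out in the summary: the initial state ${\bm x}^{0}$ does not depend on $\bm\theta$, and each local posterior $p(\bm\theta \mid {\bm x}^{t}, {\bm x}^{t+1})$ is defined with a proposal over ${\bm x}^{t}$ that is independent of $\bm\theta$, so that $p({\bm x}^{t+1} \mid {\bm x}^{t}, \bm\theta) \propto p(\bm\theta \mid {\bm x}^{t}, {\bm x}^{t+1}) / p(\bm\theta)$ as a function of $\bm\theta$. For a trajectory ${\bm x}^{0:T}$ with $T$ transitions:

$$
\begin{aligned}
p(\bm\theta \mid {\bm x}^{0:T}) &\propto p(\bm\theta) \prod_{t=0}^{T-1} p({\bm x}^{t+1} \mid {\bm x}^{t}, \bm\theta) \\
&\propto p(\bm\theta)^{1-T} \prod_{t=0}^{T-1} p(\bm\theta \mid {\bm x}^{t}, {\bm x}^{t+1}), \\
\nabla_{\bm\theta} \log p(\bm\theta \mid {\bm x}^{0:T}) &= (1-T)\, \nabla_{\bm\theta} \log p(\bm\theta) + \sum_{t=0}^{T-1} \nabla_{\bm\theta} \log p(\bm\theta \mid {\bm x}^{t}, {\bm x}^{t+1}).
\end{aligned}
$$

The last line holds for the unperturbed posterior. At nonzero diffusion noise levels, summing noised local scores is in general only an approximation to the noised global score, which is why the choice of sampler matters.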
Abstract
Amortized simulation-based inference (SBI) methods train neural networks on simulated data to perform Bayesian inference. While this strategy avoids the need for tractable likelihoods, it often requires a large number of simulations and has been challenging to scale to time series data. Scientific simulators frequently emulate real-world dynamics through thousands of single-state transitions over time. We propose an SBI approach that can exploit such Markovian simulators by locally identifying parameters consistent with individual state transitions. We then compose these local results to obtain a posterior over parameters that is consistent with the entire time series observation. We focus on applying this approach to neural posterior score estimation but also show how it can be applied, e.g., to neural likelihood (ratio) estimation. We demonstrate that our approach is more simulation-efficient than directly estimating the global posterior on several synthetic benchmark tasks and simulators used in ecology and epidemiology. Finally, we validate the scalability and simulation efficiency of our approach by applying it to a high-dimensional Kolmogorov flow simulator with around one million data dimensions.
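To illustrate what composing local results means, the following minimal sketch uses a toy linear-Gaussian Markov simulator whose local and global posteriors are available in closed form. Everything in it is an assumption made for illustration and is not taken from the paper: the simulator $x^{t+1} = \theta x^{t} + \sigma \epsilon$, the Gaussian prior, and the function names. Analytic local scores stand in for a trained score network. The sketch checks numerically that the sum of local posterior scores, plus $(1-T)$ times the prior score, matches the score of the global posterior when the initial state is independent of $\theta$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy simulator (not from the paper):
#   x^{t+1} = theta * x^t + sigma * eps,  eps ~ N(0, 1),  theta ~ N(0, prior_std^2)
theta_true, sigma, prior_std, T = 0.8, 0.5, 1.0, 50
x = np.empty(T + 1)
x[0] = 1.0  # initial state, assumed independent of theta
for t in range(T):
    x[t + 1] = theta_true * x[t] + sigma * rng.normal()


def prior_score(theta):
    """Gradient of log N(theta; 0, prior_std^2) with respect to theta."""
    return -theta / prior_std**2


def local_score(theta, x_t, x_next):
    """Score of the analytic local posterior p(theta | x_t, x_next).

    Stands in for a learned local score network.
    """
    prec = 1.0 / prior_std**2 + x_t**2 / sigma**2
    mean = (x_t * x_next / sigma**2) / prec
    return -(theta - mean) * prec


def composed_score(theta, traj):
    """Compose local scores: (1 - T) * prior score + sum of local scores."""
    n_steps = len(traj) - 1
    local = sum(local_score(theta, traj[t], traj[t + 1]) for t in range(n_steps))
    return (1 - n_steps) * prior_score(theta) + local


def global_score(theta, traj):
    """Score of the exact global posterior p(theta | x^{0:T})."""
    prec = 1.0 / prior_std**2 + np.sum(traj[:-1] ** 2) / sigma**2
    mean = (np.sum(traj[:-1] * traj[1:]) / sigma**2) / prec
    return -(theta - mean) * prec


thetas = np.linspace(-2.0, 2.0, 9)
composed = np.array([composed_score(th, x) for th in thetas])
exact = np.array([global_score(th, x) for th in thetas])
print("max abs difference:", np.max(np.abs(composed - exact)))
```

In this toy setting the two scores agree up to floating-point error. With a learned score network and a diffusion sampler the composition is evaluated at noised parameter values, where agreement is generally only approximate.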
