Diffusion posterior sampling for simulation-based inference in tall data settings

Julia Linhart, Gabriel Victorino Cardoso, Alexandre Gramfort, Sylvain Le Corff, Pedro L. C. Rodrigues

TL;DR

This work tackles inference for complex simulators in tall data settings by marrying score-based diffusion modeling with compositional posterior scores. It introduces exact and second-order approximations to the diffusion of the tall posterior, enabling deterministic score-based samplers (e.g., DDIM) to replace Langevin steps used in prior tall-data SBI methods. The proposed GAUSS and JAC algorithms demonstrate speedups, improved stability, and robust performance across Gaussian toys, SBI benchmarks, and a real neural mass model, highlighting the practical impact of diffusion-based tall-data inference. Overall, the method enhances simulation-based inference by providing scalable, amortized, and compositionally accurate posterior sampling without costly Langevin dynamics.

Abstract

Identifying the parameters of a non-linear model that best explain observed data is a core task across scientific fields. When such models rely on complex simulators, evaluating the likelihood is typically intractable, making traditional inference methods such as MCMC inapplicable. Simulation-based inference (SBI) addresses this by training deep generative models to approximate the posterior distribution over parameters using simulated data. In this work, we consider the tall data setting, where multiple independent observations provide additional information, allowing sharper posteriors and improved parameter identifiability. Building on the flourishing score-based diffusion literature, F-NPSE (Geffner et al., 2023) estimates the tall data posterior by composing individual scores from a neural network trained only for a single context observation. This enables more flexible and simulation-efficient inference than alternative approaches for tall datasets in SBI. However, it relies on costly Langevin dynamics during sampling. We propose a new algorithm that eliminates the need for Langevin steps by explicitly approximating the diffusion process of the tall data posterior. Our method retains the advantages of compositional score-based inference while being significantly faster and more stable than F-NPSE. We demonstrate its improved performance on toy problems and standard SBI benchmarks, and showcase its scalability by applying it to a complex real-world model from computational neuroscience.
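
The compositional idea in the abstract rests on a standard Bayes factorization for conditionally independent observations: $p(\theta \mid x_{1:n}) \propto p(\theta)^{1-n} \prod_{i=1}^{n} p(\theta \mid x_i)$, so the tall posterior score is the sum of the individual posterior scores plus $(1-n)$ times the prior score. The sketch below checks this numerically on a hypothetical 1D conjugate Gaussian model (Gaussian prior, Gaussian simulator); all function names and parameter values are illustrative assumptions, not the authors' code. It covers only the noise-free case: the identity is exact for the undiffused posterior, and the diffused case that the paper addresses is not reproduced here.

```python
import random

def gaussian_score(theta, mean, precision):
    """Gradient of the log-density of N(mean, 1/precision) at theta."""
    return -precision * (theta - mean)

def single_posterior(x, mu0, prior_prec, lik_prec):
    """Posterior mean and precision given one observation x ~ N(theta, 1/lik_prec)."""
    prec = prior_prec + lik_prec
    return (prior_prec * mu0 + lik_prec * x) / prec, prec

def tall_posterior(xs, mu0, prior_prec, lik_prec):
    """Posterior mean and precision given all n observations at once."""
    prec = prior_prec + len(xs) * lik_prec
    return (prior_prec * mu0 + lik_prec * sum(xs)) / prec, prec

def composed_score(theta, xs, mu0, prior_prec, lik_prec):
    """Sum of single-observation posterior scores plus (1 - n) times the prior score."""
    n = len(xs)
    total = (1 - n) * gaussian_score(theta, mu0, prior_prec)
    for x in xs:
        mean_i, prec_i = single_posterior(x, mu0, prior_prec, lik_prec)
        total += gaussian_score(theta, mean_i, prec_i)
    return total

# Illustrative values (assumptions): prior N(0, 4), simulator noise variance 0.5.
random.seed(0)
mu0, prior_prec, lik_prec = 0.0, 0.25, 2.0
theta_star = 1.5
xs = [random.gauss(theta_star, lik_prec ** -0.5) for _ in range(8)]
theta = -0.3  # arbitrary point at which the two scores are compared

mean_n, prec_n = tall_posterior(xs, mu0, prior_prec, lik_prec)
_, prec_1 = single_posterior(xs[0], mu0, prior_prec, lik_prec)
exact = gaussian_score(theta, mean_n, prec_n)
composed = composed_score(theta, xs, mu0, prior_prec, lik_prec)
gap = abs(exact - composed)
```

In this toy model `gap` is zero up to floating-point error, and `prec_n` exceeds `prec_1`, which is the sharpening of the posterior with more observations that the abstract describes.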

Paper Structure

This paper contains 52 sections, 2 theorems, 64 equations, 29 figures, 8 tables, 2 algorithms.

Key Result

Lemma 3.1

Let $\Lambda(\theta) = \sum_{j=1}^{n} \bwcovinv{\theta}[t, j] + (1-n) \bwcovinv{\theta}[t, \lambda]$ and assume it is positive definite. The approximate log-correction term is defined using (eq:tweedie) and can be written as a linear combination of Gaussian log-factors: ... The Gaussian log-factors (deno... where $\zeta_k = \zeta(\boldsymbol{\mu}_{t,k}, \Sigma_{t,k}) = -\frac{1}{2}\left(m\log 2\pi - \log|$...
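
The positive-definiteness assumption in Lemma 3.1 is not automatic, because the second term of $\Lambda(\theta)$ carries the negative weight $(1-n)$. The sketch below assembles $\Lambda$ and tests the assumption with a Cholesky attempt. It assumes that `\bwcovinv{\theta}[t, j]` denotes a symmetric inverse-covariance (precision) matrix attached to observation $j$, and that the `[t, \lambda]` term is the corresponding prior-related precision; this is a reading of the macro name, not something stated in the extract. The function names and the 2x2 matrices are hypothetical.

```python
import math

def mat_add(a, b):
    """Entrywise sum of two equally sized square matrices (lists of lists)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    """Multiply every entry of matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]

def lambda_matrix(precisions, prior_term_precision):
    """Sum of the n per-observation precisions plus (1 - n) times the prior term."""
    n = len(precisions)
    total = mat_scale(1 - n, prior_term_precision)
    for p in precisions:
        total = mat_add(total, p)
    return total

def is_positive_definite(a, tol=1e-12):
    """Attempt a Cholesky factorization of the symmetric matrix a.

    Returns False as soon as a pivot is not strictly positive (beyond tol).
    """
    d = len(a)
    chol = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                chol[i][j] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True

# Hypothetical case 1: observations are informative relative to the prior term.
sharp = [[3.0, 0.5], [0.5, 2.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
lam_ok = lambda_matrix([sharp, sharp, sharp], identity)

# Hypothetical case 2: the prior term dominates and the assumption fails.
lam_bad = lambda_matrix([identity, identity, identity], mat_scale(2.0, identity))

ok_flag = is_positive_definite(lam_ok)
bad_flag = is_positive_definite(lam_bad)
```

In the first case $\Lambda$ equals `[[7.0, 1.5], [1.5, 4.0]]` and passes the check; in the second it equals minus the identity and fails, which illustrates why the lemma states positive definiteness as an assumption.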

Figures (29)

  • Figure 1: The posterior distribution of a model with a Gaussian simulator and Gaussian prior concentrates around the true parameter $\theta^\star$ as the number $n$ of observations ${x^\star_i \sim p(x \mid \theta^\star)}$ increases. The analytic posterior is compared to the posterior estimated with our score-based proposal (the GAUSS algorithm).
  • Figure 2: Sliced Wasserstein (sW) distance as a function of $n$ and for increasing noise levels $\epsilon$. Results are shown for both Gaussian toy examples with $m=10$. Mean and std over 5 different seeds.
  • Figure 3: Sliced Wasserstein (sW) distance as a function of $N_\mathrm{train}$ and for increasing $n$, between samples obtained by GAUSS, JAC and LANGEVIN, and the true tall posterior $p(\theta \mid x^\star_{1:n})$. Mean and std over 25 seeds.
  • Figure 4: Inference on the 3D JRNMM (fixed $g = 0$) with GAUSS. (Left): MMD between the marginals of the approximate posterior and the Dirac of the true parameters $\theta^\star$ (black dashed lines). (Right): Histograms of the 1D marginals of the inferred posterior for $30$ single observations ($n=1$) and sets $x^\star_{1:n}$ of increasing size.
  • Figure 5: Inference on the 4D JRNMM with GAUSS. (Left): MMD between the marginals of the approximate posterior and the Dirac of the true parameters $\theta^\star$ (black dots and dashed lines). (Right): Histograms of the 1D and 2D marginals of the inferred posterior for observation sets $x^\star_{1:n}$ of increasing size.
  • ...and 24 more figures
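
Figures 2 and 3 report a sliced Wasserstein (sW) distance between sample sets. As a rough illustration of what such a metric measures, the sketch below gives a Monte Carlo estimator that projects both sample sets onto random unit directions and averages the squared 1D Wasserstein-2 distances between the sorted projections. The order ($p = 2$), the number of projections, and the function names are assumptions; the extract does not specify the estimator used in the paper.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere via a normalized Gaussian."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def sliced_wasserstein(xs, ys, n_projections=100, seed=0):
    """Monte Carlo sliced 2-Wasserstein distance between two equal-size sample sets."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many samples in xs and ys")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        u = random_unit_vector(dim, rng)
        # Project each sample onto u; sorting solves the 1D optimal transport problem.
        px = sorted(sum(a * b for a, b in zip(x, u)) for x in xs)
        py = sorted(sum(a * b for a, b in zip(y, u)) for y in ys)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return math.sqrt(total / n_projections)

# Illustrative data: a 3D Gaussian cloud and two shifted copies of it.
data_rng = random.Random(1)
base = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
near = [[c + 0.5 for c in point] for point in base]
far = [[c + 1.0 for c in point] for point in base]

d_same = sliced_wasserstein(base, base)
d_near = sliced_wasserstein(base, near)
d_far = sliced_wasserstein(base, far)
```

Identical sample sets give a distance of zero, and the distance grows with the shift, so lower values in the figures indicate samples closer to the reference posterior.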

Theorems & Definitions (2)

  • Lemma 3.1
  • Lemma 3.2