Embed and Emulate: Contrastive representations for simulation-based inference

Ruoxi Jiang, Peter Y. Lu, Rebecca Willett

TL;DR

This paper introduces Embed and Emulate (E&E), a new simulation-based inference (SBI) method based on contrastive learning that efficiently handles high-dimensional data and complex, multimodal parameter posteriors. E&E outperforms existing methods on a realistic, non-identifiable parameter estimation task using the high-dimensional, chaotic Lorenz 96 system.

Abstract

Scientific modeling and engineering applications rely heavily on parameter estimation methods to fit physical models and calibrate numerical simulations using real-world measurements. In the absence of analytic statistical models with tractable likelihoods, modern simulation-based inference (SBI) methods first use a numerical simulator to generate a dataset of parameters and simulated outputs. This dataset is then used to approximate the likelihood and estimate the system parameters given observed data. Several SBI methods employ machine learning emulators to accelerate data generation and parameter estimation. However, applying these approaches to high-dimensional physical systems remains challenging due to the cost and complexity of training high-dimensional emulators. This paper introduces Embed and Emulate (E&E): a new SBI method based on contrastive learning that efficiently handles high-dimensional data and complex, multimodal parameter posteriors. E&E learns a low-dimensional latent embedding of the data (i.e., a summary statistic) and a corresponding fast emulator in the latent space, eliminating the need to run expensive simulations or a high-dimensional emulator during inference. We illustrate the theoretical properties of the learned latent space through a synthetic experiment and demonstrate superior performance over existing methods in a realistic, non-identifiable parameter estimation task using the high-dimensional, chaotic Lorenz 96 system.
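The inference step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random linear maps standing in for the trained encoder and latent emulator, the uniform prior on $[-1, 1]^2$, the temperature `tau`, and the assumed posterior form $\hat{q}(\boldsymbol{\phi}\mid\mathbf{y}) \propto p(\boldsymbol{\phi})\exp(f(\mathbf{y})^\top g(\boldsymbol{\phi})/\tau)$ are all hypothetical choices made for the example. Because both embeddings are unit vectors, the exponent is bounded by $1/\tau$, which makes acceptance-rejection sampling with prior proposals straightforward.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x):
    """Project vectors onto the unit sphere along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Hypothetical stand-ins for trained networks (random weights, illustration only).
W_f = rng.normal(size=(64, 4))  # data dimension 64 -> latent dimension 4
W_g = rng.normal(size=(2, 4))   # parameter dimension 2 -> latent dimension 4

def encoder(y):
    """Stand-in for the data encoder: maps data to a unit-norm embedding."""
    return l2_normalize(y @ W_f)

def emulator(phi):
    """Stand-in for the latent emulator: maps parameters to a unit-norm embedding."""
    return l2_normalize(np.tanh(phi @ W_g))

def rejection_sample(y, n_proposals, tau=0.5):
    """Draw approximate posterior samples without calling a simulator.

    Assumes q(phi | y) is proportional to p(phi) * exp(f(y) . g(phi) / tau).
    """
    z_y = encoder(y)                                  # one encoder pass per observation
    phi = rng.uniform(-1.0, 1.0, size=(n_proposals, 2))  # proposals from the assumed prior
    sim = emulator(phi) @ z_y                         # cosine similarities in [-1, 1]
    accept_prob = np.exp((sim - 1.0) / tau)           # ratio divided by its bound exp(1 / tau)
    keep = rng.uniform(size=n_proposals) < accept_prob
    return phi[keep]

y_obs = rng.normal(size=64)            # placeholder observation
samples = rejection_sample(y_obs, n_proposals=5000)
print(samples.shape)
```

Only the cheap latent emulator is evaluated inside the sampling loop; the encoder runs once per observation, and no simulator call appears anywhere in the sketch.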

Paper Structure (33 sections, 19 theorems, 92 equations, 10 figures, 2 tables, 3 algorithms)

Key Result

Theorem 1

The asymptotic symmetric inter-domain InfoNCE loss bounds the KL divergence between the true posterior $p({\boldsymbol{\phi}}\mid\mathbf{y})$ and the model $\hat{q}_\theta({\boldsymbol{\phi}}\mid\mathbf{y})$: where $I({\boldsymbol{\phi}},\mathbf{y}) := D_\mathrm{KL}(p({\boldsymbol{\phi}},\mathbf{y})\,\|\,p({\boldsymbol{\phi}})\,p(\mathbf{y}))$ is the mutual information between ${\boldsymbol{\phi}}$ and $\mathbf{y}$ …
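For readers unfamiliar with the loss named in the theorem, a finite-batch symmetric InfoNCE objective between data embeddings and parameter embeddings can be sketched as below. This is a generic formulation under stated assumptions: the temperature `tau`, the cosine-similarity critic, and the use of in-batch negatives are illustrative choices, and the exact loss used in the paper is not reproduced here.

```python
import numpy as np

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(z_y, z_phi, tau=0.1):
    """Symmetric InfoNCE over a batch of paired embeddings.

    z_y[i] and z_phi[i] embed a matched (data, parameter) pair; every
    other pairing in the batch serves as a negative example.
    """
    z_y = z_y / np.linalg.norm(z_y, axis=1, keepdims=True)
    z_phi = z_phi / np.linalg.norm(z_phi, axis=1, keepdims=True)
    logits = z_y @ z_phi.T / tau          # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])      # matched pairs lie on the diagonal
    # Classify the correct parameter for each data point (rows) ...
    loss_y_to_phi = -log_softmax(logits, axis=1)[idx, idx].mean()
    # ... and the correct data point for each parameter (columns).
    loss_phi_to_y = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_y_to_phi + loss_phi_to_y)

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 4))
aligned = symmetric_info_nce(z, z)           # matched pairs agree
mismatched = symmetric_info_nce(z, z[::-1])  # pairs deliberately misaligned
print(aligned < mismatched)
```

Averaging the two classification directions is what makes the loss symmetric: neither the data embedding nor the parameter embedding is treated as the sole anchor.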

Figures (10)

  • Figure 1: Diagram of posterior inference using E&E. For each observation $\mathbf{y}$, our approach requires a single forward pass through the encoder $f_\theta$, which processes the high-dimensional data to produce a lower-dimensional embedding. This embedding is then combined with the emulator’s outputs $\hat{g}_\theta({\boldsymbol{\phi}})$ to parameterize the posterior estimator. Posterior samples ${\boldsymbol{\phi}}^{(s)} \sim p({\boldsymbol{\phi}})$ are drawn using a posterior sampling algorithm that repeatedly calls the learned emulator $\hat{g}_\theta$.
  • Figure 2: Diagram of generative process model with latent space $\mathcal{Z}$ reconstructed by the learned E&E embedding space $\mathbb{S}^{n-1}$. The generative process $p(\mathbf{y}\mid{\boldsymbol{\phi}})$ described in Definition \ref{def:toymodel} has a structured latent space $\mathcal{Z}$ defined by a constraint function $f:\mathcal{Y}\to\mathcal{Z}$ and a generator $g:\Phi\to\mathcal{Z}$. After training, the learned E&E embeddings $\hat{f}_\theta: \mathcal{Y}\to\mathbb{S}^{n-1}$, $\hat{g}_\theta: \Phi\to\mathbb{S}^{n-1}$ exactly reconstruct $f, g$ up to a rotation of the latent space (Theorem \ref{thm:rotation}).
  • Figure 3: Visualization of the estimated joint and marginal posterior distributions for one selected test sample for the multimodal task. The results verify that the latent emulator learns to ignore irrelevant information (in the form of a redundant parameter ${\boldsymbol{\phi}}_\mathrm{R}$), as predicted by our theory.
  • Figure 4: Visual comparison of the estimated joint and marginal posterior distributions for one test sample. In each subplot, the heatmap displays the estimated posterior probabilities (with the maximum value clipped for better visualization), and the red dashed circle represents the ground truth reference distribution. The histograms in the upper and right portions of each subplot show marginal densities, plotted using samples drawn from the estimated posterior with acceptance-rejection sampling; the dashed black line shows the histograms of samples from the reference distribution. The results illustrate that E&E captures the full spread of the posterior, whereas NRE-C \cite{miller2022contrastive} and NPE-C \cite{greenberg2019automatic} concentrate on a limited region of the circle, resulting in skewed estimates.
  • Figure 5: Comparison of sample quality using maximum mean discrepancy (MMD) over 50 test instances. Each box plot shows the median and the 25th and 75th percentiles of the error statistics. We compare E&E with NRE-C \cite{miller2022contrastive} and NPE-C \cite{greenberg2019automatic}. The results demonstrate that E&E achieves significantly lower errors, with a substantially reduced variance in error statistics, indicating more consistent and reliable performance.
  • ...and 5 more figures
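
The maximum mean discrepancy used as the sample-quality metric in Figure 5 can be estimated from two sample sets as sketched below. This is a generic biased (V-statistic) estimator with a Gaussian kernel; the kernel choice and the `bandwidth` value are assumptions made for illustration, since the settings used in the paper are not given here.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Pairwise Gaussian kernel values between the rows of a and b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 2))           # stand-in for reference posterior samples
close = rng.normal(size=(200, 2))               # samples from the same distribution
shifted = rng.normal(loc=3.0, size=(200, 2))    # samples from a displaced distribution
print(mmd_squared(reference, close) < mmd_squared(reference, shifted))
```

A lower value indicates that the estimated posterior samples are closer in distribution to the reference samples, which is how the box plots in Figure 5 should be read.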

Theorems & Definitions (32)

  • Theorem 1
  • Corollary 1
  • Theorem 2
  • Corollary 2
  • Corollary 2
  • Definition 3
  • Lemma 3
  • Theorem 4
  • Lemma 5
  • proof
  • ...and 22 more