Bridging Simulators with Conditional Optimal Transport

Justine Zeghal, Benjamin Remy, Yashar Hezaveh, Francois Lanusse, Laurence Perreault Levasseur

TL;DR

Problem: pixel-level cosmological inference is hindered by mismatches between fast approximate simulators and high-fidelity simulations, which limits accurate posterior recovery. Approach: a flow-based emulator built on Conditional Optimal Transport Flow Matching (COT-FM) bridges two simulators by learning a triangular, parameter-conditioned transport that moves $p_0(\theta, x)$ to $p_1(\theta, x)$ with minimal displacement; the velocity field is learned via Flow Matching, and minibatch optimization handles the unpaired data. Contributions: the paper demonstrates LPT-to-PM bridging on weak-lensing convergence maps, enabling implicit full-field inference that recovers the true posterior with calibrated coverage (checked, e.g., via TARP expected coverage tests), while remaining differentiable for gradient-based inference. Significance: the method provides accurate pixel-level emulation without requiring paired simulations, with potential applicability to Stage-IV surveys and to other bridging tasks between complex simulators.

Abstract

We propose a new field-level emulator that bridges two simulators using unpaired simulation datasets. Our method uses a flow-based approach to learn the likelihood transport from one simulator to the other. Since multiple transport maps exist, we employ Conditional Optimal Transport Flow Matching (COT-FM) to ensure that the transformation minimally distorts the underlying structure of the data. We demonstrate the effectiveness of this approach by bridging two weak lensing simulators: from Lagrangian Perturbation Theory (LPT) to an N-body Particle-Mesh (PM) simulator. We show that our emulator captures the full correction between the simulators: it enables full-field inference to accurately recover the true posterior, which validates its accuracy beyond traditional summary statistics.
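
As a rough illustration of the idea in the abstract, the following toy sketch pairs unpaired samples from two simulators within a minibatch and builds Flow Matching regression targets along straight paths. It is not the authors' implementation: the scalar "maps", the cost weight `eps`, the brute-force search over permutations, and all function names are assumptions made for illustration. In practice, an assignment solver such as `scipy.optimize.linear_sum_assignment` would replace the brute-force loop, and a neural network would regress the velocity targets.

```python
import itertools
import random


def cot_cost(pair0, pair1, eps=0.01):
    """Squared distance on x plus a heavily weighted squared distance on theta.

    A small eps (hypothetical value) discourages matching samples whose
    parameters differ, which approximates a parameter-conditioned coupling.
    """
    theta0, x0 = pair0
    theta1, x1 = pair1
    return (x0 - x1) ** 2 + (theta0 - theta1) ** 2 / eps


def minibatch_cot_pairing(batch0, batch1, eps=0.01):
    """Return perm such that batch0[i] is paired with batch1[perm[i]].

    Brute force over all permutations: only viable for tiny batches.
    """
    n = len(batch0)
    best_perm, best_cost = None, float("inf")
    for perm in itertools.permutations(range(n)):
        cost = sum(cot_cost(batch0[i], batch1[j], eps) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm)


def flow_matching_targets(batch0, batch1, perm, rng):
    """Build (t, theta_t, x_t, velocity) tuples on straight-line paths.

    A velocity network v(t, theta_t, x_t) would be regressed onto `velocity`.
    """
    targets = []
    for i, j in enumerate(perm):
        theta0, x0 = batch0[i]
        theta1, x1 = batch1[j]
        t = rng.random()
        theta_t = (1 - t) * theta0 + t * theta1  # nearly constant when eps is small
        x_t = (1 - t) * x0 + t * x1
        targets.append((t, theta_t, x_t, x1 - x0))
    return targets


# Toy usage: scalar "maps" from a cheap simulator (x = theta) and an
# expensive one (x = 2 * theta), drawn in a different order (unpaired).
cheap = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.9)]
expensive = [(0.9, 1.8), (0.1, 0.2), (0.5, 1.0)]
pairing = minibatch_cot_pairing(cheap, expensive)
training_tuples = flow_matching_targets(cheap, expensive, pairing, random.Random(0))
```

In this toy setup the pairing matches samples that share the same parameter value, so the regression targets describe the correction between the two simulators at fixed parameters.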

Paper Structure

This paper contains 13 sections, 5 equations, 6 figures.

Figures (6)

  • Figure 1: From left to right: an example LPT simulation with noise; the corresponding PM simulation sharing the same noise, cosmological parameters, and initial conditions as the LPT simulation; the LPT simulation optimally transported onto the PM space; the residuals, i.e., the difference between the learned PM map and the true PM map without noise.
  • Figure 2: Posterior distributions of the cosmological parameters evaluated on the PM map shown in Figure 1, and learned from LPT simulations (green), PM simulations (purple), and emulated simulations (blue).
  • Figure 3: TARP expected coverage test over $500$ simulations, with error bars from $100$ bootstrap iterations. The dashed white line shows ideal coverage. Posteriors from PM maps analyzed with the correct model (purple) closely match this curve, while using the LPT model (green) leads to bias. COT-FM restores accurate coverage (blue), demonstrating effective emulation of PM maps.
  • Figure 4: Comparison of simulations in summary space. Using the compressor trained on PM maps, we compress PM maps (purple), learned PM maps (blue), and LPT maps (green). The emulated PM maps align with the true PM ones.
  • Figure 5: Additional posterior distributions of the cosmological parameters evaluated on the PM fiducial maps shown in the appendix convergence-map figure. The green contour corresponds to the posterior distribution learned using LPT simulations. The purple one corresponds to the posterior distribution learned using PM simulations. The blue one is the posterior distribution learned from corrected LPT simulations, i.e., PM-like simulations. This highlights the impact of model misspecification on the cosmological constraints and how COT-FM solves this problem.
  • ...and 1 more figure
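
The coverage test described in the Figure 3 caption can be sketched in a few lines. The following is a minimal TARP-style illustration, not the code used in the paper: it assumes parameters rescaled to the unit cube, a uniformly drawn reference point per simulation, and bootstrap resampling over simulations. All function names are hypothetical.

```python
import math
import random


def tarp_fractions(posterior_samples, true_thetas, rng):
    """For each simulation, the fraction of posterior samples that lie closer
    to a random reference point than the true parameters do."""
    dim = len(true_thetas[0])
    fractions = []
    for samples, truth in zip(posterior_samples, true_thetas):
        # Assumption: parameters live in the unit cube, so a uniform draw
        # is a reasonable reference point.
        ref = [rng.random() for _ in range(dim)]
        d_true = math.dist(truth, ref)
        closer = sum(math.dist(s, ref) < d_true for s in samples)
        fractions.append(closer / len(samples))
    return fractions


def expected_coverage(fractions, levels):
    """Share of simulations whose fraction falls below each credibility level.

    For a calibrated posterior this curve follows the diagonal.
    """
    n = len(fractions)
    return [sum(f < level for f in fractions) / n for level in levels]


def bootstrap_coverage(fractions, levels, n_boot, rng):
    """Mean and standard deviation of the coverage curve over bootstrap
    resamples of the simulations."""
    n = len(fractions)
    curves = []
    for _ in range(n_boot):
        resample = [fractions[rng.randrange(n)] for _ in range(n)]
        curves.append(expected_coverage(resample, levels))
    means = [sum(c[k] for c in curves) / n_boot for k in range(len(levels))]
    stds = [
        math.sqrt(sum((c[k] - means[k]) ** 2 for c in curves) / n_boot)
        for k in range(len(levels))
    ]
    return means, stds
```

A posterior that is too narrow or biased, as can happen when a misspecified simulator is used for training, would pull the coverage curve away from the diagonal, while a calibrated posterior stays within the bootstrap error bars around it.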