Optimizing Likelihoods via Mutual Information: Bridging Simulation-Based Inference and Bayesian Optimal Experimental Design

Vincent D. Zaballa, Elliot E. Hui

TL;DR

The paper addresses the challenge of integrating Simulation-Based Inference (SBI) with Bayesian Optimal Experimental Design (BOED) by formulating a mutual information objective, and introduces the InfoNCE-λ bound to jointly optimize experimental designs and amortized SBI surrogates for non-differentiable simulators. It establishes theoretical connections between SBI objectives and MI-based design gains, and presents a practical framework that uses a design distribution to stabilize gradient-based optimization. The authors demonstrate improved calibration and predictive accuracy over baselines on SBI-based epidemiology and Bone Morphogenetic Protein (BMP) models, while detailing trade-offs between MI tightness and likelihood fidelity through the regularization parameter λ. The work broadens the applicability of BOED to SBI tasks and provides concrete guidelines (design distributions, checkpoints, and calibration metrics) to ensure robust, interpretable experimental design decisions in complex scientific simulators.

Abstract

Simulation-based inference (SBI) is a method to perform inference on a variety of complex scientific models with challenging inference (inverse) problems. Bayesian Optimal Experimental Design (BOED) aims to efficiently use experimental resources to make better inferences. Various stochastic gradient-based BOED methods have been proposed as an alternative to Bayesian optimization and other experimental design heuristics to maximize information gain from an experiment. We demonstrate a link via mutual information bounds between SBI and stochastic gradient-based variational inference methods that permits BOED to be used in SBI applications as SBI-BOED. This link allows simultaneous optimization of experimental designs and optimization of amortized inference functions. We evaluate the pitfalls of naive design optimization using this method in a standard SBI task and demonstrate the utility of a well-chosen design distribution in BOED. We compare this approach on SBI-based models in real-world simulators in epidemiology and biology, showing notable improvements in inference.

Paper Structure

This paper contains 28 sections, 1 theorem, 35 equations, 7 figures, 2 tables, 1 algorithm.

Key Result

Theorem 3.1

Maximizing the lower bound of the Mutual Information (MI) between parameters $\theta$ and observations $y$ in a Simulation-Based Inference (SBI) setting is equivalent to minimizing the Kullback-Leibler (KL) divergence, $D_{\text{KL}}(p(y | \boldsymbol{\theta}) || p_\phi(y | \boldsymbol{\theta}))$, b where $I(\theta ; y)$ is the mutual information between $\theta$ and $y$, and $I_{\phi}(\theta ; y
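
The theorem statement above is cut off in this summary, so the following is only a sketch of the standard variational (Barber-Agakov style) argument that yields an equivalence of this kind; it is not necessarily the authors' proof, and the label $I_{\text{BA}}$ is introduced here for illustration. Writing $H(\cdot)$ for entropy and taking expectations over $p(\theta)\,p(y \mid \theta)$:

$$
\begin{aligned}
\mathbb{E}_{p(\theta)p(y \mid \theta)}\left[\log p_\phi(y \mid \theta)\right]
&= -H(y \mid \theta) - \mathbb{E}_{p(\theta)}\left[D_{\text{KL}}\big(p(y \mid \theta) \,\|\, p_\phi(y \mid \theta)\big)\right], \\
I(\theta ; y) &= H(y) - H(y \mid \theta) \\
&= \underbrace{H(y) + \mathbb{E}_{p(\theta)p(y \mid \theta)}\left[\log p_\phi(y \mid \theta)\right]}_{I_{\text{BA}}(\phi)}
+ \mathbb{E}_{p(\theta)}\left[D_{\text{KL}}\big(p(y \mid \theta) \,\|\, p_\phi(y \mid \theta)\big)\right] \\
&\geq I_{\text{BA}}(\phi).
\end{aligned}
$$

Because $H(y)$ and $I(\theta ; y)$ do not depend on $\phi$, raising the lower bound $I_{\text{BA}}(\phi)$ is the same as shrinking the expected KL term, and the bound is tight exactly when $p_\phi(y \mid \theta)$ matches $p(y \mid \theta)$ for almost every $\theta$.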

Figures (7)

  • Figure 1: Comparison on the Two Moons task of the EIG and the validation loss $-\mathbb{E}\left[\log p_\phi(y \mid \theta)\right]$ across varying numbers of contrastive samples ($L = N - 1$) and levels of $\lambda$ regularization. Increasing the number of contrastive samples improves both the information lower bound and the likelihood validation metric. The $\lambda$ parameter improves likelihood accuracy at the expense of MI estimation.
  • Figure 2: Comparison of the EIG across design dimensions, type of BOED, and $\lambda$ regularization for the noisy linear model, shown as the moving average over 10 random seed initializations. For the single design dimension, all SBI-BOED regularization levels perform similarly, with the non-regularized version being the closest to the optimal MI bound. In the higher-dimensional design cases, SBI-BOED increases its EIG with more designs. In the 100-dimensional design case, we see the benefit of using $\lambda$ regularization to stabilize the training of a design-dependent normalizing flow in a high-dimensional input space, at the cost of slightly lower EIG.
  • Figure 3: Training (higher is better) without a design distribution in the first round of optimization for the SIR model fails to find designs with high rewards.
  • Figure 4: Training with design checkpoints saves an optimal design $\xi^*$ that achieves the highest EIG.
  • Figure 5: (Top) Posterior of the Two Moons experiment with mode collapse. (Bottom) Simulation-Based Calibration (SBC) of the posterior distribution for the Two Moons experiment.
  • ...and 2 more figures
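
To make the role of the contrastive samples in Figure 1 concrete, here is a minimal sketch of the plain InfoNCE lower bound on $I(\theta ; y)$ with $L = N - 1$ contrastive samples per pair. It rests on assumptions that are not in the paper: a toy Gaussian simulator ($\theta \sim \mathcal{N}(0, 1)$, $y = \theta + \sigma\epsilon$), the exact likelihood standing in for a learned $p_\phi(y \mid \theta)$, and hypothetical helper names `gaussian_loglik` and `infonce_bound`. The $\lambda$-regularized variant described in the paper is not reproduced here.

```python
import numpy as np

def gaussian_loglik(y, theta, sigma):
    """Pairwise log N(y_i | theta_j, sigma^2); returns an (N, N) matrix."""
    diff = y[:, None] - theta[None, :]
    return -0.5 * (diff / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def infonce_bound(loglik_matrix):
    """Plain InfoNCE lower bound on I(theta; y).

    Entry [i, j] is log p(y_i | theta_j). The diagonal holds the matched
    pairs; the other N - 1 entries in each row act as contrastive samples.
    """
    n = loglik_matrix.shape[0]
    positive = np.diag(loglik_matrix)
    # Numerically stable log of the row-wise mean likelihood.
    row_max = loglik_matrix.max(axis=1, keepdims=True)
    log_sum = row_max[:, 0] + np.log(np.exp(loglik_matrix - row_max).sum(axis=1))
    log_mean = log_sum - np.log(n)
    return float(np.mean(positive - log_mean))

# Toy check (assumed model): the analytic MI is 0.5 * log(1 + 1 / sigma^2).
rng = np.random.default_rng(0)
sigma = 0.5
true_mi = 0.5 * np.log(1.0 + 1.0 / sigma**2)
for n in (8, 64, 512):
    theta = rng.standard_normal(n)
    y = theta + sigma * rng.standard_normal(n)
    estimate = infonce_bound(gaussian_loglik(y, theta, sigma))
    print(f"N={n:4d}  InfoNCE={estimate:.3f}  log N={np.log(n):.3f}  true MI={true_mi:.3f}")
```

The estimate can never exceed $\log N$, because each matched pair also appears in its own normalizer. That ceiling is one reason a larger number of contrastive samples can tighten the bound when the true MI is high.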

Theorems & Definitions (2)

  • Theorem 3.1
  • proof