InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences

Hongkai Zheng, Wenda Chu, Bingliang Zhang, Zihui Wu, Austin Wang, Berthy T. Feng, Caifeng Zou, Yu Sun, Nikola Kovachki, Zachary E. Ross, Katherine L. Bouman, Yisong Yue

TL;DR

InverseBench introduces a modular benchmarking framework for evaluating plug-and-play diffusion priors (PnPDP) on five scientific inverse problems. It groups PnPDP methods into four main categories (guidance-based, variable splitting, variational Bayes, and sequential Monte Carlo) and compares 14 algorithms against strong domain-specific baselines, using open-source datasets and pretrained diffusion priors. Empirical results show that well-trained diffusion priors enable strong performance, but forward-model constraints, initialization, and out-of-distribution sources can limit effectiveness, particularly under high measurement sparsity. The work highlights stability, scalability, and robustness as key directions for future development and provides a resource for researchers tackling physics-based inverse problems.

Abstract

Plug-and-play diffusion priors (PnPDP) have emerged as a promising research direction for solving inverse problems. However, current studies primarily focus on natural image restoration, leaving the performance of these algorithms in scientific inverse problems largely unexplored. To address this gap, we introduce InverseBench, a framework that evaluates diffusion models across five distinct scientific inverse problems. These problems present unique structural challenges that differ from existing benchmarks, arising from critical scientific applications such as optical tomography, medical imaging, black hole imaging, seismology, and fluid dynamics. With InverseBench, we benchmark 14 inverse problem algorithms that use plug-and-play diffusion priors against strong, domain-specific baselines, offering valuable new insights into the strengths and weaknesses of existing algorithms. To facilitate further research and development, we open-source the codebase, along with datasets and pre-trained models, at https://devzhk.github.io/InverseBench/.

Paper Structure

This paper contains 65 sections, 27 equations, 8 figures, 12 tables.

Figures (8)

  • Figure 1: Illustration of the five benchmark problems in InverseBench. $G$ represents the forward model that produces observations from the source. $G^{\dagger}$ represents the inverse map. In the linear inverse scattering problem (left two), the observation is the data recorded by the receivers, and the unknown source we aim to infer is the permittivity map of the object. The bottom panel displays the efficiency and accuracy plots for our benchmarked algorithms. Certain characteristics of each problem cause the efficiency and accuracy trade-offs of each algorithm to vary across tasks. In these plots, a larger point radius indicates more interaction with the forward function $G$, as measured by the number of forward model evaluations.
  • Figure 2: Qualitative comparison showing representative examples of PnPDP methods and domain-specific baselines across five inverse problems. Note that for full waveform inversion, Adam$^*$ and LBFGS$^*$ are initialized with Gaussian-blurred ground truth, serving as references.
  • Figure 3: Illustration of the failures of PnPDP methods (DAPS as an example) on full waveform inversion. With a small learning rate, DAPS is numerically stable but does not solve the inverse problem effectively. With a slightly larger learning rate, DAPS produces a noisy velocity map that breaks the stability condition of the PDE solver, resulting in a complete failure.
  • Figure 4: Relative performance of plug-and-play diffusion prior methods compared with traditional baselines under different levels of measurement sparsity on different tasks. Metrics are averaged over multiple PnPDP methods. In general, the performance difference increases as the measurements become sparser.
  • Figure 5: PnPDP methods on out-of-distribution test samples. (a) Black hole imaging on digit inputs; (b) inverse scattering on sources that contain 9 cells, whereas the prior model is trained on images with 1 to 6 cells.
  • ...and 3 more figures
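
The notation in the Figure 1 caption (a forward model $G$, an inverse map $G^{\dagger}$, and cost measured by the number of forward model evaluations) can be illustrated with a toy linear problem. The sketch below is not taken from the InverseBench codebase: the class `CountingForwardModel`, the function `least_squares_inverse`, the matrix `A`, and the plain gradient-descent solver are all hypothetical stand-ins chosen for illustration, and no diffusion prior is involved.

```python
import numpy as np


class CountingForwardModel:
    """Toy linear forward model G(x) = A @ x that counts its evaluations.

    Hypothetical helper for illustration; not part of InverseBench.
    """

    def __init__(self, A):
        self.A = np.asarray(A, dtype=float)
        self.num_evals = 0

    def __call__(self, x):
        # Every call to G is counted as one forward model evaluation.
        self.num_evals += 1
        return self.A @ x

    def adjoint(self, r):
        # Transpose of the linear operator; not counted as a forward evaluation.
        return self.A.T @ r


def least_squares_inverse(G, y, dim, steps=200):
    """Approximate G^dagger(y) by gradient descent on 0.5 * ||G(x) - y||^2."""
    # Step size 1/L, where L is the squared largest singular value of A.
    step_size = 1.0 / np.linalg.norm(G.A, 2) ** 2
    x = np.zeros(dim)
    for _ in range(steps):
        residual = G(x) - y
        x = x - step_size * G.adjoint(residual)
    return x


# Build a well-conditioned 8x4 operator (assumed values, chosen so the
# iteration converges quickly): orthonormal columns scaled by fixed
# singular values.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 4)))
A = Q @ np.diag([1.0, 0.8, 0.6, 0.5])

G = CountingForwardModel(A)
x_true = rng.standard_normal(4)   # the unknown "source"
y = G(x_true)                     # noiseless observation
G.num_evals = 0                   # do not count the data simulation itself

x_hat = least_squares_inverse(G, y, dim=4, steps=200)
print("forward evaluations:", G.num_evals)
print("max abs error:", np.max(np.abs(x_hat - x_true)))
```

In this toy setting the solver calls $G$ once per iteration, so the evaluation count equals the number of steps. For the PDE-based forward models in the benchmark, each such call would be far more expensive, which is presumably why the plots in Figure 1 report it alongside accuracy.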