Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization

Dongxia Wu, Nikki Lijing Kuang, Ruijia Niu, Yi-An Ma, Rose Yu

TL;DR

Diffusion-BBO tackles online black-box optimization by using a conditional diffusion model as an inverse surrogate, which keeps proposed designs on the data manifold of feasible designs. It introduces Uncertainty-aware Exploration (UaE), an acquisition function that balances high conditioning values against low epistemic uncertainty to drive efficient online querying. The authors provide theoretical results showing that Diffusion-BBO with UaE is near-optimal and demonstrate strong empirical performance across six scientific-discovery tasks, including both continuous and discrete design spaces. The approach leverages classifier-free guidance to enable uncertainty quantification without training a separate classifier, and uses an ensemble to decompose the uncertainty in the sampling process into epistemic and aleatoric parts.
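
To make the ensemble-based split into epistemic and aleatoric uncertainty more concrete, here is a minimal sketch based on the generic law of total variance. It is an illustration under stated assumptions and may differ from the paper's own decomposition: the function name `decompose_uncertainty` and the array layout (members, samples per member, design dimensions) are hypothetical and do not come from the paper.

```python
import numpy as np

def decompose_uncertainty(samples):
    """Split sample variance into aleatoric and epistemic parts.

    samples: array of shape (M, N, D) holding N generated designs of
    dimension D from each of M ensemble members (assumed layout).
    Returns per-dimension (aleatoric, epistemic, total) variances.
    """
    samples = np.asarray(samples, dtype=float)
    member_means = samples.mean(axis=1)   # (M, D): mean design per member
    member_vars = samples.var(axis=1)     # (M, D): spread within each member
    aleatoric = member_vars.mean(axis=0)  # average within-member variance
    epistemic = member_means.var(axis=0)  # disagreement between members
    # Pooled variance over all M * N samples; with equal N per member
    # it equals aleatoric + epistemic (law of total variance).
    total = samples.reshape(-1, samples.shape[-1]).var(axis=0)
    return aleatoric, epistemic, total
```

In this sketch the epistemic term shrinks when ensemble members agree on where the generated designs lie, which is the kind of quantity an uncertainty-aware acquisition function could penalize.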

Abstract

Online black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle in a sample-efficient way. While prior studies focus on forward approaches such as Gaussian Processes (GPs) to learn a surrogate model for the unknown objective function, they struggle with steering clear of out-of-distribution and invalid designs in scientific discovery tasks. Recently, inverse modeling approaches that map the objective space to the design space with conditional diffusion models have demonstrated impressive capability in learning the data manifold. However, these approaches proceed in an offline fashion with pre-collected data. How to design inverse approaches for online BBO to actively query new data and improve the sample efficiency remains an open question. In this work, we propose Diffusion-BBO, a sample-efficient online BBO framework leveraging the conditional diffusion model as the inverse surrogate model. Diffusion-BBO employs a novel acquisition function Uncertainty-aware Exploration (UaE) to propose scores in the objective space for conditional sampling. We theoretically prove that Diffusion-BBO with UaE achieves a near-optimal solution for online BBO. We also empirically demonstrate that Diffusion-BBO with UaE outperforms existing online BBO baselines across 6 scientific discovery tasks.

Paper Structure

This paper contains 36 sections, 14 theorems, 70 equations, 4 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

(Uncertainty propagation) Let $t\in[T]$ be the diffusion step, and let $s_\theta(\boldsymbol{x}, y , t)$ be the score function of the corresponding diffusion model $p_{\theta}(\boldsymbol{x} \mid y)$. For a single conditional diffusion model $p_{\theta}(\boldsymbol{x} \mid y)$, the uncertainty in generati... where $\circ$ is the Hadamard product, and $I$ is the identity matrix. Similarly, in continuous-tim...

Figures (4)

  • Figure 1: Diffusion-BBO framework using the conditional diffusion model as the inverse surrogate model. It includes $4$ stages: 1. Train the conditional diffusion model given the training dataset. 2. Compute the acquisition function and select the optimal $y^*$ to condition on. 3. Generate samples $\{\mathbf{x_0}\}$ conditioned on $y^*$. 4. Query the oracle given generated samples $\{\mathbf{x_0}\}$ and update the training dataset.
  • Figure 2: Comparison of Diffusion-BBO with baselines for online black-box optimization on DesignBench and Molecular Discovery tasks. All plots start at iteration 1 after one round of data queries. We plot the mean values and the confidence interval based on three random runs. Diffusion-BBO exhibits superior performance with respect to sample efficiency.
  • Figure 3: Impact of acquisition function design for black-box optimization on both a discrete task (TFBind10) and a continuous task (D’Kitty). Comparison of Diffusion-BBO with UaE against the fixed conditioning approaches using weights $w \in \{0.6, 0.8, 1.0, 1.2, 1.4, 2.0, 2.5, 3.0\}$. Results are averaged across three random runs.
  • Figure 4: Ablation study to evaluate the effect of batch size on the superconductor task. The mean and standard deviation across three random seeds are plotted. Diffusion-BBO shows robust performance across different batch sizes given the same total number of evaluations.
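
The four stages listed for Figure 1 can be sketched as a simple loop. This is a toy illustration under loud assumptions, not the authors' implementation: `ToyInverseModel` is a hypothetical nearest-neighbour resampler standing in for the conditional diffusion model, `oracle` is a made-up objective, and the acquisition score (candidate value minus a penalty on ensemble disagreement) is only an assumed stand-in for UaE, whose exact form is not given here.

```python
import numpy as np

rng = np.random.default_rng(0)

def oracle(x):
    # Hypothetical black-box objective (toy): maximum value 0 at x = 1.
    return -np.sum((x - 1.0) ** 2, axis=-1)

class ToyInverseModel:
    """Stand-in for p(x | y); NOT a diffusion model."""
    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def sample(self, y_target, n, k=5, noise=0.1):
        # Resample around the k training designs scoring closest to y_target.
        idx = np.argsort(np.abs(self.y - y_target))[:k]
        centers = self.X[rng.choice(idx, size=n)]
        return centers + noise * rng.standard_normal(centers.shape)

def run(n_rounds=5, batch=16, n_members=4, dim=2, lam=1.0):
    X = rng.uniform(-2.0, 2.0, size=(32, dim))
    y = oracle(X)
    best = [y.max()]
    for _ in range(n_rounds):
        # Stage 1: train an ensemble of inverse models on bootstrap resamples.
        members = []
        for _ in range(n_members):
            idx = rng.integers(0, len(X), size=len(X))
            members.append(ToyInverseModel().fit(X[idx], y[idx]))
        # Stage 2: score candidate conditioning values (assumed acquisition).
        candidates = np.quantile(y, [0.5, 0.75, 0.9, 1.0])
        scores = []
        for c in candidates:
            means = np.stack([m.sample(c, 32).mean(axis=0) for m in members])
            epistemic = means.var(axis=0).sum()  # ensemble disagreement
            scores.append(c - lam * epistemic)
        y_star = candidates[int(np.argmax(scores))]
        # Stage 3: generate designs conditioned on the selected y*.
        per_member = batch // n_members
        X_new = np.concatenate([m.sample(y_star, per_member) for m in members])
        # Stage 4: query the oracle and grow the training dataset.
        y_new = oracle(X_new)
        X = np.concatenate([X, X_new])
        y = np.concatenate([y, y_new])
        best.append(y.max())
    return np.array(best)
```

Calling `run()` returns the best objective value seen after each round; because the dataset only grows, that sequence never decreases. Swapping the toy model for a real conditional generator would leave the loop structure unchanged.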

Theorems & Definitions (22)

  • Theorem 1
  • Proposition 1: Uncertainty Decomposition
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma B.1
  • Proof of Lemma B.1
  • Lemma B.2
  • Proof of Lemma B.2
  • Theorem 4
  • ...and 12 more