Diversity By Design: Leveraging Distribution Matching for Offline Model-Based Optimization

Michael S. Yao, James C. Gee, Osbert Bastani

TL;DR

Diversity is critical in offline model-based optimization to capture multiple high-quality design configurations. DynAMO treats diversity as distribution matching between the generator's design distribution and a temperature-weighted reference from the offline dataset, augmented with an adversarial critic constraint to bound out-of-distribution evaluation. By deriving a closed-form dual via Lagrangian optimization, DynAMO remains compatible with a wide range of optimizers and tasks, demonstrated across diverse Design-Bench problems and Molecule design. Empirically, DynAMO yields substantial gains in candidate diversity while maintaining strong quality, though trade-offs emerge with hyperparameters and certain metrics. The approach offers a scalable, task- and optimizer-agnostic path to diversify offline design exploration and supports downstream multi-objective decision-making in scientific domains.

Abstract

The goal of offline model-based optimization (MBO) is to propose new designs that maximize a reward function given only an offline dataset. However, an important desideratum is to also propose a diverse set of final candidates that capture many optimal and near-optimal design configurations. We propose Diversity in Adversarial Model-based Optimization (DynAMO) as a novel method to introduce design diversity as an explicit objective into any MBO problem. Our key insight is to formulate diversity as a distribution matching problem where the distribution of generated designs captures the inherent diversity contained within the offline dataset. Extensive experiments spanning multiple scientific domains show that DynAMO can be used with common optimization methods to significantly improve the diversity of proposed designs while still discovering high-quality candidates.
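As a rough illustration of the distribution-matching idea, the sketch below builds a temperature-weighted reference distribution over a toy set of offline scores and compares two candidate design distributions against it by KL divergence. The softmax form (weights proportional to exp(tau * score)), the helper names `tau_weighted_distribution` and `kl_divergence`, and the toy scores are all assumptions made for illustration; they are not taken from the paper's definitions or code.

```python
import numpy as np


def tau_weighted_distribution(scores, tau=1.0):
    """Hypothetical reference distribution: softmax of tau * score.

    ASSUMPTION: the exponential weighting is an illustrative choice,
    not the paper's stated definition.
    """
    scores = np.asarray(scores, dtype=float)
    logits = tau * scores
    logits -= logits.max()  # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


def kl_divergence(q, p, eps=1e-12):
    """KL(q || p) for discrete distributions over the same support."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum(q * (np.log(q + eps) - np.log(p + eps))))


# Toy offline dataset: four designs, two of which tie for the best score.
scores = np.array([0.1, 0.5, 0.9, 0.9])
p_ref = tau_weighted_distribution(scores, tau=1.0)

# A generator that collapses onto one optimum vs. one that covers both.
collapsed = np.array([0.0, 0.0, 1.0, 0.0])
spread = np.array([0.0, 0.0, 0.5, 0.5])

kl_collapsed = kl_divergence(collapsed, p_ref)
kl_spread = kl_divergence(spread, p_ref)
```

In this toy setting the distribution that covers both tied optima sits closer to the reference than the collapsed one, which is the intuition behind penalizing divergence from a score-weighted reference. Setting `tau=0.0` recovers a uniform reference over the dataset.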

Paper Structure

This paper contains 34 sections, 11 theorems, 88 equations, 7 figures, 15 tables, 1 algorithm.

Key Result

Lemma 3.1

(Diversity Collapse in Reward Optimization) Suppose that there exists a finite set of globally optimal designs $x^*_j$ such that $x^*_j:=\mathop{\mathrm{arg\,max}}\limits_{x\in\mathcal{X}} r(x)$ and $r^*:=r(x^*_j)$ is the optimal reward given a finite, non-uniform reward function $r(x)$. Given any d

Figures (7)

  • Figure 1: Overview of Diversity in Adversarial Model-based Optimization (DynAMO). Traditional model-based optimization (MBO) techniques can generate high-scoring designs, although often at the expense of the diversity of proposed designs. Ideally, the final set of candidates should be of high quality while capturing multiple 'modes of goodness' within the design space. For example, although there are 3 unique global maxima (stars) in the 2D Branin optimization problem, traditional Bayesian optimization (BO-qUCB) proposes designs clustered around only a single optimum (diamonds). In contrast, we show how DynAMO can be used to modify the MBO objective to discover diverse and high-quality designs (circles).
  • Figure A1: Sample $\tau$-Weighted Probability Distributions. We plot ($\tau=1.0$)-weighted distributions $p^\tau_\mathcal{D}(y)$ (blue) versus the original distribution of oracle scores $y$ in the public offline dataset $\mathcal{D}$ (orange) for the 6 offline optimization tasks in our experimental evaluation suite: (1) TFBind8 (top left); (2) UTR (top middle); (3) ChEMBL (top right); (4) Molecule (bottom left); (5) Superconductor (bottom middle); and (6) D'Kitty (bottom right). DynAMO penalizes a model-based optimization objective to encourage sampling policies to match the diversity of (high-scoring) designs in the $\tau$-weighted distribution. The $x$-axis represents the normalized oracle scores.
  • Figure A2: Distribution of Generated Design Quality and Diversity Scores. We plot the distributions of the (top left) oracle score; (top right) minimum novelty; and (bottom) pairwise diversity of the $k=128$ proposed designs from a single representative experimental run using the CMA-ES backbone optimizer with and without DynAMO on the TFBind8 task. Dashed blue (resp., dotted green) lines in the top panels represent the mean score achieved by the Baseline CMA-ES (resp., DynAMO-CMA-ES) method from the experimental run.
  • Figure A3: Sampling Batch Size Ablation. We vary the sampling batch size $b$ in Algorithm 1 between 2 and 512, and report both the (left) Best@128 Oracle Score and (right) Pairwise Diversity score for 128 final designs proposed by a DynAMO-BO-qEI policy on the TFBind8 optimization task. We plot the mean $\pm$ 95% confidence interval over 10 random seeds.
  • Figure A4: $\beta$ Hyperparameter Ablation. We vary the value of the KL-divergence regularization strength hyperparameter $\beta$ in Algorithm 1 between 0.01 and 100, and report both the (left) Best@128 Oracle Score and (right) Pairwise Diversity score for 128 final design candidates proposed by a DynAMO-BO-qEI policy on the TFBind8 optimization task. We plot the mean $\pm$ 95% confidence interval over 10 random seeds in both plots. The dotted horizontal line corresponds to the $\beta=0$ experimental mean score, which could not be plotted as a point on the logarithmic $x$-axis.
  • ...and 2 more figures
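
The captions above report a Pairwise Diversity score over the final batch of proposed designs. A minimal sketch of one way such a score could be computed is given below, as the mean distance over all ordered pairs of distinct designs. The Euclidean distance, the function name `pairwise_diversity`, and the toy batches are assumptions for illustration only; the paper's exact metric and distance function (especially for discrete sequence tasks such as TFBind8) may differ.

```python
import numpy as np


def pairwise_diversity(designs):
    """Mean distance between all ordered pairs of distinct designs.

    ASSUMPTION: Euclidean distance on real-valued design vectors; a
    sequence task would likely need an edit-distance-style metric instead.
    """
    X = np.asarray(designs, dtype=float)
    n = X.shape[0]
    if n < 2:
        return 0.0
    diffs = X[:, None, :] - X[None, :, :]   # shape (n, n, d)
    dists = np.linalg.norm(diffs, axis=-1)  # shape (n, n), zero diagonal
    return float(dists.sum() / (n * (n - 1)))


# A batch collapsed onto one point vs. a batch split across two modes.
clustered = np.zeros((4, 2))
two_modes = np.array([[0.0, 0.0], [3.0, 4.0]])

pd_clustered = pairwise_diversity(clustered)
pd_two_modes = pairwise_diversity(two_modes)
```

A batch that collapses onto a single design scores zero under this sketch, while a batch spread across distinct modes scores higher, which matches the qualitative contrast drawn in Figure 1.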

Theorems & Definitions (25)

  • Lemma 3.1
  • Definition 3.2: $\tau$-Weighted Probability Distribution
  • Lemma 3.3: Entropy-Divergence Formulation
  • Lemma 3.4: Explicit Dual Function of the Constrained Optimization Problem
  • proof
  • proof
  • Remark 1.1: Equivalence of Lemma 3.3 and Canonical State-Matching
  • proof
  • Definition 3.1: $f$-Divergence
  • Definition 3.2: Fenchel Conjugate
  • ...and 15 more