Covering Multiple Objectives with a Small Set of Solutions Using Bayesian Optimization

Natalie Maus, Kyurae Kim, Yimeng Zeng, Haydn Thomas Jones, Fangping Wan, Marcelo Der Torossian Torres, Cesar de la Fuente-Nunez, Jacob R. Gardner

TL;DR

This work introduces MOCOBO, a Bayesian optimization framework for coverage optimization that seeks a small set of $K<T$ solutions that collectively optimize $T$ objectives. It defines a coverage score and develops a novel acquisition function, Expected Coverage Improvement (ECI), along with a greedy approximation algorithm for constructing the best observed covering set, with extensions to batch (q‑ECI) and trust‑region (TuRBO‑M) settings. The method is validated on high‑dimensional structured tasks including peptide and molecule design, rover control, and HDR image tone mapping, showing that MOCOBO’s covering sets achieve objective coverage comparable to that of $T$ independently optimized solutions and yield practically meaningful results, such as antimicrobial peptides that are potent in vitro. The work provides open-source code and demonstrates MOCOBO’s potential for accelerating complex design problems, while acknowledging practical considerations: structured input spaces require pre‑trained generative models, and compute demands are significant.

Abstract

In multi-objective black-box optimization, the goal is typically to find solutions that optimize a set of $T$ black-box objective functions, $f_1, \ldots, f_T$, simultaneously. Traditional approaches often seek a single Pareto-optimal set that balances trade-offs among all objectives. In contrast, we consider a problem setting that departs from this paradigm: finding a small set of $K < T$ solutions that collectively "cover" the $T$ objectives. A set of solutions is defined as "covering" if, for each objective $f_1, \ldots, f_T$, there is at least one good solution. A motivating example for this problem setting arises in drug design: we may have $T$ pathogens and aim to identify a set of $K < T$ antibiotics such that at least one antibiotic can be used to treat each pathogen. This problem, known as coverage optimization, has yet to be tackled with the Bayesian optimization (BO) framework. To fill this void, we develop Multi-Objective Coverage Bayesian Optimization (MOCOBO), a BO algorithm for solving coverage optimization. Our approach is based on a new acquisition function reminiscent of expected improvement in the vanilla BO setup. We demonstrate the performance of our method on high-dimensional black-box optimization tasks, including applications in peptide and molecular design. Results show that the coverage of the $K < T$ solutions found by MOCOBO matches or nearly matches the coverage of $T$ solutions obtained by optimizing each objective individually. Furthermore, in in vitro experiments, the peptides found by MOCOBO exhibited high potency against drug-resistant pathogens, further demonstrating the potential of MOCOBO for drug discovery. All of our code is publicly available at the following link: https://github.com/nataliemaus/mocobo.
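To make the notion of "covering" concrete, the sketch below scores a candidate set by summing, over objectives, the best value that any solution in the set attains. This sum-of-maxima form is an assumption made for illustration (the abstract only states the idea informally), all objectives are assumed to be maximized, and the function name `coverage_score` and the toy numbers are hypothetical rather than taken from the authors' code.

```python
from typing import Sequence


def coverage_score(values: Sequence[Sequence[float]]) -> float:
    """Sum over objectives of the best value attained by any solution in the set.

    values[k][t] is the value of objective t for the k-th solution.
    Assumption: every objective is to be maximized.
    """
    if not values:
        raise ValueError("covering set must contain at least one solution")
    num_objectives = len(values[0])
    return sum(max(row[t] for row in values) for t in range(num_objectives))


# Toy example with T = 3 objectives (numbers are made up).
specialists = [
    [9.0, 8.0, 1.0],  # strong on objectives 0 and 1
    [2.0, 1.0, 7.0],  # strong on objective 2
]
compromise = [[5.0, 5.0, 5.0]]  # a single balanced solution

print(coverage_score(specialists))
print(coverage_score(compromise))
```

In this toy case the two specialists together score higher than the single compromise solution, which is the intuition behind allowing $K < T$ solutions instead of one.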

Paper Structure

This paper contains 78 sections, 7 theorems, 23 equations, 16 figures, 9 tables, 1 algorithm.

Key Result

Lemma 3.1

Let $T, K$ be finite positive integers such that $K < T$. Let $f_1, \ldots, f_T$ be real valued functions. Let $D_s = \left\{ (\mathbf{x}_1, \mathbf{y}_1), \ldots, (\mathbf{x}_n, \mathbf{y}_n) \right\}$ be a dataset of $n$ real valued data points such that for all $\mathbf{x}_i$, $\mathbf{y}_i = (f

Figures (16)

  • Figure 1: Traditional multi-objective optimization for $T$ objectives might select any point along the Pareto frontier, but in some situations, such as the one shown, any Pareto-optimal point performs poorly on at least one objective. When a set of $K < T$ solutions is allowed ($\blacksquare$), we can sometimes optimize all objectives well. Note that this is a simplified schematic meant to illustrate intuition.
  • Algorithm 1: Greedy $(1 - \frac{1}{e})$-Approximation for Finding $S^{*}_{D_s}$ (Incremental Strategy)
  • Figure 2: Coverage optimization results on all tasks considered.
  • Figure 3: In vitro results for the two best "template free" (TF1, TF2) and two best "template constrained" (TC1, TC2) runs of MOCOBO for the peptide design task. Columns are the best (lowest) in vitro MIC among the $K=4$ peptides found by MOCOBO for each target pathogenic bacterium B1, $\ldots$, B11 listed in the paper's table of target bacteria. (-) and (+) indicate Gram negative and Gram positive, respectively. TF1 and TC1 correspond to the single runs of MOCOBO shown in the paper's template-free and template-constrained results tables, respectively. Methods used to obtain in vitro MICs are provided in the paper's in vitro methods section.
  • Figure 4: Ablation study comparing MOCOBO to optimization performance where a known "good" partitioning of the $T$ objectives into $K$ subsets is available in advance. We individually optimize $K$ solutions, one for each partition.
  • ...and 11 more figures

Theorems & Definitions (20)

  • Lemma 3.1: NP-hardness of Optimal Covering Set
  • proof
  • Theorem 3.2
  • proof
  • Corollary 3.3: The greedy approximation algorithm is the best possible approximation of $S^{*}_{D_s}$
  • proof
  • proof
  • Definition I.1: Maximum Coverage Problem (MCP)
  • Proposition I.2: MCP is NP-Hard
  • Definition J.1: Coverage Score
  • ...and 10 more
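
The greedy incremental strategy named in the lists above can be sketched as follows: starting from an empty set, repeatedly add the observed point that raises the coverage score the most, until $K$ points are chosen. This is a minimal illustration, not the authors' implementation. It assumes the coverage score is a sum of per-objective maxima and that all objectives are maximized; the function name `greedy_covering_set` and the data are hypothetical. The classical $(1 - \frac{1}{e})$ guarantee for greedy selection holds for monotone submodular objectives, which this sketch does not check.

```python
from typing import List, Sequence


def greedy_covering_set(observed: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return indices of up to k observed points chosen greedily.

    observed[i][t] is the value of objective t at the i-th evaluated point.
    Assumption: coverage is the sum over objectives of the best value in the set.
    """
    if not observed:
        return []
    num_objectives = len(observed[0])
    chosen: List[int] = []
    # Best value covered so far for each objective (nothing covered yet).
    best = [float("-inf")] * num_objectives

    for _ in range(min(k, len(observed))):
        def score_with(i: int) -> float:
            # Coverage score if point i were added to the current set.
            return sum(max(b, y) for b, y in zip(best, observed[i]))

        candidates = [i for i in range(len(observed)) if i not in chosen]
        pick = max(candidates, key=score_with)  # ties go to the lowest index
        chosen.append(pick)
        best = [max(b, y) for b, y in zip(best, observed[pick])]
    return chosen


# Toy data: 4 evaluated points, T = 3 objectives (numbers are made up).
observed = [
    [9.0, 8.0, 1.0],
    [2.0, 1.0, 7.0],
    [5.0, 5.0, 5.0],
    [8.0, 9.0, 0.0],
]
selected = greedy_covering_set(observed, k=2)
print(selected)
```

Each round costs one pass over the observed points, so selecting $K$ points from $n$ observations with $T$ objectives takes on the order of $nKT$ operations, in contrast to enumerating all size-$K$ subsets.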