BOtied: Multi-objective Bayesian optimization with tied multivariate ranks

Ji Won Park; Nataša Tagasovska; Michael Maser; Stephen Ra; Kyunghyun Cho

TL;DR

BOtied reframes multi-objective Bayesian optimization by treating the Pareto front as the extreme quantile of the joint CDF $F_Y(y)$. It introduces a Pareto-compliant CDF indicator $I_{F_Y}$ and a CDF-based acquisition function, BOtied, implemented with vine copulas to estimate high-dimensional joint distributions efficiently while preserving invariance to monotonic transformations and rescalings. Empirical results on synthetic and real-world problems show that BOtied often achieves superior or competitive hypervolume and CDF indicator values, scales better to many objectives, and requires comparable or lower computational effort than HV-based methods. The approach provides a principled, transform-invariant MOBO framework for domains with heterogeneous objective units and complex dependence structures, such as drug design and engineering optimization.
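For reference, the two standard facts behind this summary can be written out. The notation here ($M$ objectives, marginal CDFs $F_m$, copula $C$, strictly increasing maps $g_m$) is assumed for illustration and may differ from the paper's. The joint CDF factorizes through a copula (Sklar's theorem):

$$F_Y(y) = P(Y_1 \le y_1, \dots, Y_M \le y_M) = C\big(F_1(y_1), \dots, F_M(y_M)\big).$$

If each objective is transformed by a strictly increasing $g_m$, so that $Z_m = g_m(Y_m)$, the events $\{Z_m \le g_m(y_m)\}$ and $\{Y_m \le y_m\}$ coincide, hence

$$F_Z\big(g_1(y_1), \dots, g_M(y_M)\big) = F_Y(y).$$

This is the sense in which a CDF-based score is unchanged by monotonic transformations and rescalings of the objectives.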

Abstract

Many scientific and industrial applications require the joint optimization of multiple, potentially competing objectives. Multi-objective Bayesian optimization (MOBO) is a sample-efficient framework for identifying Pareto-optimal solutions. At the heart of MOBO is the acquisition function, which determines the next candidate to evaluate by navigating the best compromises among the objectives. In this paper, we show a natural connection between non-dominated solutions and the extreme quantile of the joint cumulative distribution function (CDF). Motivated by this link, we propose the Pareto-compliant CDF indicator and the associated acquisition function, BOtied. BOtied inherits desirable invariance properties of the CDF, and an efficient implementation with copulas allows it to scale to many objectives. Our experiments on a variety of synthetic and real-world problems demonstrate that BOtied outperforms state-of-the-art MOBO acquisition functions while being computationally efficient for many objectives.
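The link between non-dominated solutions and the extreme quantile of the joint CDF can be illustrated with a minimal sketch. It assumes minimization of every objective and scores each observed point with the plain empirical joint CDF of the sample, not the vine-copula estimate the paper uses; the function name `empirical_cdf_scores` and the random test data are hypothetical.

```python
import numpy as np


def empirical_cdf_scores(Y):
    """Empirical joint CDF evaluated at each row of Y (shape (n, m)).

    The score of point i is the fraction of sample points j with
    Y[j] <= Y[i] in every objective (point i counts itself).
    """
    Y = np.asarray(Y, dtype=float)
    # leq[i, j] is True when point j is <= point i in all objectives.
    leq = np.all(Y[None, :, :] <= Y[:, None, :], axis=-1)
    return leq.mean(axis=1)


# Hypothetical data: 200 points, 2 objectives, both to be minimized.
rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 2))
n = Y.shape[0]

scores = empirical_cdf_scores(Y)

# With no ties, a point is non-dominated exactly when it is the only
# sample point below itself, i.e. when its score is the minimum 1/n.
front_mask = np.isclose(scores, 1.0 / n)

# Strictly increasing transformations: rescale f_1, squash f_2 with arctan.
Y_transformed = np.column_stack([1000.0 * Y[:, 0] + 5.0, np.arctan(Y[:, 1])])
scores_transformed = empirical_cdf_scores(Y_transformed)

print("non-dominated points:", int(front_mask.sum()))
print("scores unchanged:", bool(np.allclose(scores, scores_transformed)))
```

Because the score depends only on componentwise orderings, the rescaled and arctan-transformed objectives yield the same ranking of points. A hypervolume computation on the same data would generally change after such transformations.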

Paper Structure

This paper contains 38 sections, 4 theorems, 19 equations, 13 figures, 4 tables, and 1 algorithm.

Introduction
Contributions
Related work
Background
Bayesian Optimization
Multi-objective optimization
Hypervolume
Noisy observations
Multi-objective BO with multivariate ranks
CDF indicator
Estimating the CDF with copulas
From copula density to CDF
High-dimensional CDF with vine copulas
CDF-based acquisition function: BOtied
Empirical results
...and 23 more sections

Key Result

Theorem 4.3

For any pair of approximation sets $A \in \mathcal{Y}$ and $B \in \mathcal{Y}$,

Figures (13)

Figure 1: Illustration of the conceptual link between the empirical Pareto front probed by the HV indicator and innermost level line of the CDF probed by the BOtied CDF indicator. The blue set of candidates dominates the orange. The HV indicator is consistent with this ordering; the area of the box dominated by the blue set is greater. The BOtied CDF values and associated multivariate ranks also favor the blue.
Figure 2: Level lines of the CDF (left) and the PDF (right) from kernel density estimation based on 200 observations (gray dots). The zero level line of the CDF closely traces the true Pareto front (solid red curve).
Figure 3: Top: The CDF indicator is invariant to arbitrary monotonic transformations of the objectives (here transforming $f_2$ via arctan). Bottom: The HV indicator is highly sensitive to them. The color gradient corresponds to indicator value at each solution ($q=1$). Gray circles are overlaid on the five solutions with the top indicator scores. CDF chooses the same five solutions, but HV prefers ones with high $f_1$ after $f_2$ becomes squashed.
Figure 4: A recipe for estimating the CDF with copulas, in three simple steps and fewer than 5 lines of Python code. Plots are based on the Caco2+ dataset.
Figure 5: HV vs. iterations for three synthetic test functions. We show the mean and two standard errors over 20 random seeds.
...and 8 more figures

Theorems & Definitions (10)

Definition 4.1: Cumulative distribution function
Definition 4.2: CDF Indicator
Theorem 4.3: Pareto compliance
Remark 4.4
Theorem 4.5
Definition 4.6: Probability integral transform
Corollary 4.7: Scale invariance
Corollary 4.8: Invariance under monotonic transformations
proof
proof
