qPOTS: Efficient batch multiobjective Bayesian optimization via Pareto optimal Thompson sampling
Ashwin Renganathan, Kade E. Carlson
TL;DR
qPOTS tackles expensive multiobjective optimization by replacing costly acquisition-function optimization with Pareto-optimal Thompson sampling over GP posteriors for $K$ objectives defined on $\boldsymbol{x}\in\mathcal{X}\subset\mathbb{R}^d$. It samples posterior paths $Y_i(\cdot)$ and performs a cheap multiobjective optimization on each path (via NSGA-II) to obtain a Pareto set $X^*$, from which batch points are selected by a maximin distance rule, providing an automatic exploration–exploitation balance. The method extends to constrained problems by introducing constraint GPs and feasibility indicators, and scales with Nyström approximations that reduce covariance costs to $\mathcal{O}(m^3+Nm^2)$. Theoretical guarantees show asymptotic consistency of the GP posteriors and of the converged Pareto frontier, and extensive experiments on synthetic benchmarks and real-world problems demonstrate superior sample efficiency and batch performance compared to state-of-the-art MOBO methods.
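The maximin distance rule mentioned above can be illustrated with a minimal NumPy sketch. It assumes Euclidean distance in the input space and a greedy one-point-at-a-time batch construction; the function name `maximin_select` and its arguments are hypothetical and are not taken from the paper or its code.

```python
import numpy as np

def maximin_select(pareto_X, observed_X, q):
    """Greedily pick up to q row indices of pareto_X, each maximizing its
    minimum Euclidean distance to observed_X and to earlier picks.
    (Illustrative assumption: greedy selection, Euclidean metric.)"""
    pareto_X = np.asarray(pareto_X, dtype=float)
    observed_X = np.asarray(observed_X, dtype=float)
    # Distance from every Pareto candidate to its nearest observed point.
    diffs = pareto_X[:, None, :] - observed_X[None, :, :]
    min_dist = np.linalg.norm(diffs, axis=-1).min(axis=1)
    chosen = []
    for _ in range(min(q, len(pareto_X))):
        idx = int(np.argmax(min_dist))
        chosen.append(idx)
        # Treat the new pick as observed, so nearby candidates lose priority.
        new_dist = np.linalg.norm(pareto_X - pareto_X[idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
        min_dist[idx] = -np.inf  # never pick the same candidate twice
    return chosen
```

For example, with candidates `[[0.0], [0.5], [1.0]]`, a single observed point `[[0.0]]`, and `q=2`, the sketch returns indices `[2, 1]`: the farthest candidate first, then the one farthest from both the observation and that first pick.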
Abstract
Classical evolutionary approaches to multiobjective optimization are accurate but require many queries to the objectives, which can be prohibitive when the objectives are expensive oracles. A sample-efficient alternative is Bayesian optimization (BO) with Gaussian process (GP) surrogates. Multiobjective Bayesian optimization (MOBO) constructs an acquisition function that is optimized to acquire new observation candidates sequentially. This ``inner'' optimization can be hard for several reasons: acquisition functions can be nonconvex, nondifferentiable, and/or unavailable in analytical form, and batch sampling usually exacerbates these problems. Because the success of MOBO relies heavily on this inner optimization, these difficulties ultimately degrade its sample efficiency. To overcome these challenges, we propose a Thompson sampling (TS) based approach ($q\texttt{POTS}$). Whereas TS chooses candidates according to the probability that they are optimal, $q\texttt{POTS}$ chooses candidates according to the probability that they are Pareto optimal. Instead of a hard acquisition function optimization, $q\texttt{POTS}$ solves a cheap multiobjective optimization on the GP posteriors with evolutionary approaches. This gives the best of both worlds: the accuracy of evolutionary approaches and the sample efficiency of MOBO. New candidates are chosen on the posterior GP Pareto frontier according to a maximin distance criterion. $q\texttt{POTS}$ is endowed with theoretical guarantees, a natural exploration-exploitation trade-off, and superior empirical performance.
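The idea of choosing candidates by their probability of being Pareto optimal can be sketched in a few lines of NumPy. This is an illustration under several assumptions that are not from the paper: objectives are maximized, each objective has an independent zero-mean GP with a fixed RBF kernel, and the evolutionary solver is replaced by a non-dominated filter over a finite candidate grid `Xc`. All function names are hypothetical.

```python
import numpy as np

def rbf(A, B, ls=0.2):
    """Squared-exponential kernel with an assumed fixed lengthscale."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def posterior_sample(X, y, Xc, rng, noise=1e-6):
    """Draw one joint GP posterior sample of a single objective on Xc."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xc, X)
    mean = Ks @ np.linalg.solve(K, y)
    cov = rbf(Xc, Xc) - Ks @ np.linalg.solve(K, Ks.T)
    # Eigen-decomposition with clipping tolerates tiny negative eigenvalues.
    w, U = np.linalg.eigh(0.5 * (cov + cov.T))
    root = U * np.sqrt(np.clip(w, 0.0, None))
    return mean + root @ rng.standard_normal(len(Xc))

def nondominated_mask(F):
    """True for rows of F that no other row dominates (maximization)."""
    F = np.asarray(F, dtype=float)
    mask = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        dominates_i = np.all(F >= F[i], axis=1) & np.any(F > F[i], axis=1)
        mask[i] = not dominates_i.any()
    return mask

def pareto_thompson_candidates(X, Y, Xc, seed=0):
    """Sample every objective once on Xc and return the sampled Pareto set."""
    rng = np.random.default_rng(seed)
    samples = np.column_stack(
        [posterior_sample(X, Y[:, k], Xc, rng) for k in range(Y.shape[1])]
    )
    return Xc[nondominated_mask(samples)]
```

Repeating the draw with different seeds yields different sampled Pareto sets; points that appear often are those with a high posterior probability of being Pareto optimal. A continuous solver such as NSGA-II would take the place of the grid filter when $\mathcal{X}$ cannot be enumerated.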
