Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation
Yilin Xie, Shiqiang Zhang, Joel A. Paulson, Calvin Tsay
TL;DR
This work tackles the global optimization of Gaussian-process-based acquisition functions in Bayesian optimization (BO) by introducing PK-MIQP, which uses a piecewise-linear kernel approximation to recast acquisition-function optimization as a mixed-integer quadratic program (MIQP). Theoretical results bound the approximation error in the GP posterior mean and variance and establish regret guarantees for the resulting BO procedure. Empirically, PK-MIQP performs better than or comparably to gradient- and sampling-based optimizers on synthetic benchmarks, constrained problems, and hyperparameter-tuning tasks, especially on problems with many local minima or constraints. The approach shows promise for robust global optimization in BO, and future work could improve its efficiency through additive GPs and more advanced MIP techniques.
Abstract
Bayesian optimization relies on iteratively constructing and optimizing an acquisition function. Optimizing the acquisition function is itself a challenging, non-convex problem. Despite the importance of this step, most algorithms employ sampling- or gradient-based methods, which do not provably converge to global optima. This work investigates mixed-integer programming (MIP) as a paradigm for global acquisition function optimization. Specifically, our Piecewise-linear Kernel Mixed Integer Quadratic Programming (PK-MIQP) formulation introduces a piecewise-linear approximation for Gaussian process kernels and admits a corresponding MIQP representation for acquisition functions. The proposed method is applicable to uncertainty-based acquisition functions for any stationary or dot-product kernel. We analyze the theoretical regret bounds of the proposed approximation, and empirically demonstrate the framework on synthetic functions, constrained benchmarks, and a hyperparameter tuning task.
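
To make the idea of a piecewise-linear kernel approximation concrete, the following is a minimal sketch, not the PK-MIQP formulation itself. It assumes a squared-exponential kernel written as a function of a scalar distance r, and it uses uniformly spaced breakpoints on [0, 4]. The function names, the breakpoint grid, and the lengthscale are illustrative assumptions; the breakpoint selection and the MIQP encoding used in the paper are not shown here.

```python
import numpy as np

def rbf_kernel(r, lengthscale=1.0):
    """Squared-exponential kernel as a function of distance r >= 0 (assumed form)."""
    return np.exp(-0.5 * (r / lengthscale) ** 2)

def piecewise_linear_kernel(r, breakpoints, lengthscale=1.0):
    """Linearly interpolate the kernel between the given breakpoints."""
    values = rbf_kernel(breakpoints, lengthscale)
    return np.interp(r, breakpoints, values)

# Hypothetical uniform breakpoints; spacing h = 0.5.
breakpoints = np.linspace(0.0, 4.0, 9)

# Measure the approximation error on a fine grid of distances.
r_grid = np.linspace(0.0, 4.0, 401)
exact = rbf_kernel(r_grid)
approx = piecewise_linear_kernel(r_grid, breakpoints)
max_error = np.max(np.abs(exact - approx))

print(f"max |k(r) - k_PWL(r)| on [0, 4]: {max_error:.4f}")
```

For this kernel the second derivative is bounded by 1 in absolute value, so the standard linear-interpolation bound gives an error of at most h²/8 ≈ 0.031 with h = 0.5. Adding breakpoints tightens the approximation, at the cost of more pieces to encode in a mixed-integer model.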
