Robust Bayesian Optimization via Localized Online Conformal Prediction

Dongwon Kim, Matteo Zecchin, Sangwoo Park, Joonhyuk Kang, Osvaldo Simeone

TL;DR

LOCBO addresses robust Bayesian optimization under GP misspecification: it calibrates the GP likelihood online using localized online conformal prediction, then denoises the calibrated likelihood to obtain a calibrated posterior on the objective. LOCBO extends prior CP-based BO methods by localizing calibration across the input space and by providing guarantees under minimal noise assumptions. Theoretical results establish long-run calibration of LOCBO's predictions and a corresponding utility guarantee for the optimization iterates. Empirical results on synthetic benchmarks and a radio-resource-management task show that LOCBO consistently outperforms state-of-the-art CP-based BO methods, especially under misspecification and heteroscedastic noise, with localization further improving performance. The approach offers a robust, scalable framework for Bayesian optimization in settings with model mismatch and region-specific uncertainty.

Abstract

Bayesian optimization (BO) is a sequential approach for optimizing black-box objective functions using zeroth-order noisy observations. In BO, Gaussian processes (GPs) are employed as probabilistic surrogate models to estimate the objective function based on past observations, guiding the selection of future queries to maximize utility. However, the performance of BO relies heavily on the quality of these probabilistic estimates, which can deteriorate significantly under model misspecification. To address this issue, we introduce localized online conformal prediction-based Bayesian optimization (LOCBO), a BO algorithm that calibrates the GP model through localized online conformal prediction (CP). LOCBO corrects the GP likelihood based on predictive sets produced by localized online CP, and the corrected GP likelihood is then denoised to obtain a calibrated posterior distribution on the objective function. The likelihood calibration step leverages an input-dependent calibration threshold to tailor coverage guarantees to different regions of the input space. Under minimal noise assumptions, we provide theoretical performance guarantees for LOCBO's iterates that hold for the unobserved objective function. These theoretical findings are validated through experiments on synthetic and real-world optimization tasks, demonstrating that LOCBO consistently outperforms state-of-the-art BO algorithms in the presence of model misspecification.
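
The input-dependent calibration threshold described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the class name `LocalizedThreshold`, the RBF kernel, the specific update (a shared offset plus a kernel expansion, both moved by the step `eta_t * (err_t - alpha)`, with shrinkage `1 - eta_t * lam` on the kernel part), and the toy data in which a constant predictor stands in for the GP mean.

```python
import numpy as np


def rbf(x, centers, length_scale):
    """RBF kernel between a scalar x and an array of past inputs."""
    return np.exp(-0.5 * ((x - centers) / length_scale) ** 2)


class LocalizedThreshold:
    """Illustrative threshold g(x) = offset + sum_i coef_i * k(x_i, x)."""

    def __init__(self, alpha=0.1, eta1=0.5, lam=0.1, length_scale=0.2):
        assert eta1 < 1.0 / lam  # keeps the shrinkage factor in (0, 1)
        self.alpha, self.eta1, self.lam = alpha, eta1, lam
        self.length_scale = length_scale
        self.offset = 0.0
        self.centers = np.empty(0)
        self.coefs = np.empty(0)
        self.t = 0

    def __call__(self, x):
        if self.centers.size == 0:
            return self.offset
        k = rbf(x, self.centers, self.length_scale)
        return self.offset + float(self.coefs @ k)

    def update(self, x, score):
        """One online step; returns 1.0 if the observation was miscovered."""
        self.t += 1
        eta = self.eta1 / np.sqrt(self.t)  # decaying step size eta_1 * t^(-1/2)
        err = float(score > self(x))       # score outside the prediction set
        step = eta * (err - self.alpha)    # raise threshold on error, else lower
        self.coefs = np.append((1.0 - eta * self.lam) * self.coefs, step)
        self.centers = np.append(self.centers, x)
        self.offset += step
        return err


# Toy heteroscedastic data: the noise scale grows with x, and the
# (deliberately misspecified) predictor is the constant 0.
rng = np.random.default_rng(0)
g = LocalizedThreshold()
errors = []
for _ in range(2000):
    x = rng.uniform(0.0, 1.0)
    y = rng.normal(0.0, 0.1 + x)
    errors.append(g.update(x, abs(y)))  # nonconformity score |y - 0|

miscoverage = float(np.mean(errors))
print("empirical miscoverage:", miscoverage)
print("threshold at x=0.1 and x=0.9:", g(0.1), g(0.9))
```

The prediction set at a query `x` in this sketch is the interval of observations whose score is at most `g(x)`; because the threshold is a function of `x`, regions with larger noise can end up with wider sets than regions with smaller noise.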

Paper Structure

This paper contains 34 sections, 4 theorems, 58 equations, 8 figures, 1 algorithm.

Key Result

Lemma 1

Fix a user-defined target miscoverage level $\alpha\in[0,1]$. Under Assumption ass:lower_bound_k, for any hyperparameter $\lambda>0$ and any learning rate sequence $\eta_t=\eta_1 t^{-1/2}< 1/\lambda$ with $\eta_1>0$, given any query-observation sequence $\{(\mathbf{x}_t,y_t)\}^T_{t=1}$ with bounded with $\beta=\frac{2}{\eta_1}+\frac{4 \sqrt{\rho\kappa D}}{\eta_1\lambda}+2(2\kappa+
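
The step-size condition in the lemma statement can be made concrete with a short sketch. Since $\eta_t=\eta_1 t^{-1/2}$ is decreasing in $t$, the requirement $\eta_t<1/\lambda$ for every $t$ reduces to $\eta_1<1/\lambda$. The helper name `step_sizes` and the numeric values are hypothetical choices for illustration only.

```python
import math


def step_sizes(eta1, lam, T):
    """Return [eta_1 * t^(-1/2) for t = 1..T], checking the lemma's conditions."""
    if eta1 <= 0 or lam <= 0:
        raise ValueError("eta_1 and lambda must be positive")
    if eta1 >= 1.0 / lam:
        # eta_t is largest at t = 1, so this single check covers all t.
        raise ValueError("need eta_1 < 1/lambda")
    return [eta1 / math.sqrt(t) for t in range(1, T + 1)]


etas = step_sizes(eta1=0.5, lam=1.0, T=4)
print(etas)
```

Calling `step_sizes(eta1=2.0, lam=1.0, T=4)` raises a `ValueError`, because the first step size would already exceed $1/\lambda$.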

Figures (8)

  • Figure 1: (Top) Offline CP-based Bayesian Optimization (CBO) [stanton2023bayesian] leverages past queries to calibrate the likelihood for the noisy observations $y$, which is then "denoised" to obtain a calibrated posterior for the objective function $f(\mathbf{x})$. (Middle) OCBO [deshpande2024online] uses online CP to directly calibrate the posterior distribution of the objective function $f(\mathbf{x})$ based on the feedback of past queries, while providing performance guarantees in the presence of noiseless observations. (Bottom) The proposed localized online CP-based Bayesian Optimization (LOCBO) uses localized online CP [zecchin2024localized] to locally calibrate the likelihood function, which is then "denoised" to compute a calibrated posterior distribution over the objective function $f(\mathbf{x})$. LOCBO provides performance guarantees irrespective of prior misspecification and under minimal assumptions on the observation noise.
  • Figure 2: Comparison between the state-of-the-art calibration-based BO schemes CBO [stanton2023bayesian] and OCBO [deshpande2024online], and the proposed method LOCBO.
  • Figure 3: (Top) Calibrated likelihood in CBO, which assumes a flat density within the prediction set $\Gamma_{\alpha, t+1}^{\text{\tiny{CBO}}}(\mathbf{x}|\mathcal{D}_t)$ returned by WCP. (Bottom) Calibrated likelihood in LOCBO, which assumes a flat density within the prediction set $\Gamma^{\text{LOCBO}}_{t+1}(\mathbf{x}|\mathcal{D}_{t})$ obtained by localized online CP.
  • Figure 4: Prediction intervals obtained at the end of the optimization process for LOCBO with (a) $\kappa = 0$ and (b) $\kappa = 2$ on a synthetic function (black line). With a larger localization, i.e., for a larger $\kappa$, LOCBO can focus the reduction of uncertainty on the portions of the optimization domain that are closest to optimal values.
  • Figure 5: (a) Simple regret (eq:simple_regret) as a function of the optimization horizon $T$ for the Ackley 2D function without observation noise. (b) Simple regret (eq:simple_regret) as a function of the optimization horizon $T$ for the Ackley 2D function with observation noise. All figures show the mean and 70% confidence interval. BO, CBO, OCBO, LOCBO ($l=\infty$), OCBO-L, and LOCBO correspond to the green, yellow, blue dashed, blue solid, red dashed, and red solid lines, respectively.
  • ...and 3 more figures
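
The simple-regret curves referenced in Figure 5 can be sketched as follows. This is a hedged illustration, not the paper's evaluation code: it uses the standard Ackley 2D test function under a minimization convention (global minimum 0 at the origin), whereas the paper's own definition in eq:simple_regret may use a different sign convention. The helper names `ackley_2d` and `simple_regret` and the query points are hypothetical.

```python
import numpy as np


def ackley_2d(x):
    """Standard Ackley function on points of shape (..., 2); minimum 0 at the origin."""
    x = np.asarray(x, dtype=float)
    a, b, c = 20.0, 0.2, 2.0 * np.pi
    term1 = -a * np.exp(-b * np.sqrt(0.5 * np.sum(x ** 2, axis=-1)))
    term2 = -np.exp(0.5 * np.sum(np.cos(c * x), axis=-1))
    return term1 + term2 + a + np.e


def simple_regret(queries, f=ackley_2d, f_opt=0.0):
    """Gap between the best noiseless value found so far and the optimum, per step."""
    values = f(np.asarray(queries, dtype=float))
    best_so_far = np.minimum.accumulate(values)
    return best_so_far - f_opt


# Hypothetical query sequence ending at the global minimizer.
queries = [[2.0, -1.5], [1.0, 1.0], [3.0, 3.0], [0.1, -0.2], [0.0, 0.0]]
regret = simple_regret(queries)
print(regret)
```

Regret is evaluated on the noiseless objective even when the optimizer only sees noisy observations, which is why the curve is non-increasing by construction.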

Theorems & Definitions (4)

  • Lemma 1: Long-run coverage of the noisy observation [zecchin2024localized]
  • Lemma 2: Long-run coverage of the objective function $f(\cdot)$
  • Theorem 1
  • Lemma 3: LOCBO's posterior distribution $p_{\alpha}^{\text{\tiny{LOCBO}}} (f(\mathbf{x})|\mathcal{D}_t)$