Counterfactual Credit Guided Bayesian Optimization

Qiyu Wei, Haowei Wang, Richard Allmendinger, Mauricio A. Álvarez

TL;DR

The paper tackles the inefficiency of conventional Bayesian optimization in quickly locating the global optimum by introducing Counterfactual Credit Guided Bayesian Optimization (CCGBO). CCGBO assigns per-sample credits via counterfactual reasoning grounded in the GP posterior, computing a Monte Carlo proxy $Z_t$ for the current optimum and deriving credits $c_i$ that reflect each observation's downstream impact. These credits are propagated to continuous candidates and used to modulate a credit-weighted acquisition function, yielding $\mathrm{UCB}_{\text{credit}}(x)$ with a controllable influence that decays over time. The authors prove that the method preserves sublinear regret, demonstrate robustness without requiring priors, and empirically show accelerated convergence and improved simple regret across synthetic and real-world benchmarks, while remaining compatible with multiple acquisition strategies.
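
The pipeline summarized above (Monte Carlo proxy, per-sample credits, propagation to candidates, credit-weighted UCB with decaying influence) can be sketched in a few lines of NumPy. This is only an illustrative reading under stated assumptions, not the authors' formulas: the leave-one-out credit, the kernel-weighted propagation, the multiplicative form of the weighted UCB, the decay schedule `lam0 / (1 + t)`, the toy objective, and all function names are hypothetical choices made for this sketch.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise=1e-4):
    """Zero-mean GP posterior mean and covariance on x_grid."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_grid, x_obs)
    mu = Ks @ np.linalg.solve(K, y_obs)
    cov = rbf(x_grid, x_grid) - Ks @ np.linalg.solve(K, Ks.T)
    return mu, 0.5 * (cov + cov.T)  # symmetrize against round-off

def mc_max_proxy(mu, cov, eps):
    """Z_t proxy: average maximum over K posterior sample paths (assumed form)."""
    vals, vecs = np.linalg.eigh(cov)
    root = vecs * np.sqrt(np.clip(vals, 0.0, None))  # cov ~= root @ root.T
    paths = mu[None, :] + eps @ root.T               # shape (K, n_grid)
    return paths.max(axis=1).mean()

def counterfactual_credits(x_obs, y_obs, x_grid, eps):
    """Hypothetical leave-one-out credit: how much Z_t moves without point i."""
    z_full = mc_max_proxy(*gp_posterior(x_obs, y_obs, x_grid), eps)
    raw = np.empty(len(x_obs))
    for i in range(len(x_obs)):
        keep = np.arange(len(x_obs)) != i
        z_wo = mc_max_proxy(*gp_posterior(x_obs[keep], y_obs[keep], x_grid), eps)
        raw[i] = abs(z_full - z_wo)
    return raw / (raw.sum() + 1e-12), z_full  # credits normalized to sum to 1

def credit_weight(x_grid, x_obs, credits):
    """Propagate credits to candidates by kernel-weighted averaging (assumed)."""
    k = rbf(x_grid, x_obs)
    return (k @ credits) / (k.sum(axis=1) + 1e-12)

def ucb_credit(mu, sigma, w, t, beta=2.0, lam0=1.0):
    """Credit-weighted UCB with influence lam_t = lam0 / (1 + t) (assumed form)."""
    lam_t = lam0 / (1.0 + t)
    return mu + beta * (1.0 + lam_t * w) * sigma

# Toy usage on a hypothetical 1-D objective.
rng = np.random.default_rng(0)
x_grid = np.linspace(0.0, 1.0, 100)
x_obs = np.array([0.1, 0.4, 0.6, 0.9])
y_obs = np.sin(6.0 * x_obs) * x_obs
eps = rng.standard_normal((16, len(x_grid)))  # K = 16 shared noise draws

mu, cov = gp_posterior(x_obs, y_obs, x_grid)
sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
credits, z_t = counterfactual_credits(x_obs, y_obs, x_grid, eps)
w = credit_weight(x_grid, x_obs, credits)
x_next = x_grid[np.argmax(ucb_credit(mu, sigma, w, t=1))]
print("credits:", np.round(credits, 3), "Z_t:", round(float(z_t), 3), "next x:", x_next)
```

Reusing the same noise draws `eps` for the full and leave-one-out posteriors keeps the credit differences from being dominated by sampling noise. Because the credit influence `lam_t` shrinks with `t`, the sketch reduces to the standard UCB in the limit, which mirrors the decaying influence described in the summary.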

Abstract

Bayesian optimization has emerged as a prominent methodology for optimizing expensive black-box functions by leveraging Gaussian process surrogates, which focus on capturing the global characteristics of the objective function. However, in numerous practical scenarios, the primary objective is not to construct an exhaustive global surrogate, but rather to quickly pinpoint the global optimum. Due to the aleatoric nature of the sequential optimization problem and its dependence on the quality of the surrogate model and the initial design, it is restrictive to assume that all observed samples contribute equally to the discovery of the optimum in this context. In this paper, we introduce Counterfactual Credit Guided Bayesian Optimization (CCGBO), a novel framework that explicitly quantifies the contribution of individual historical observations through counterfactual credit. By incorporating counterfactual credit into the acquisition function, our approach can selectively allocate resources in areas where optimal solutions are most likely to occur. We prove that CCGBO retains sublinear regret. Empirical evaluations on various synthetic and real-world benchmarks demonstrate that CCGBO consistently reduces simple regret and accelerates convergence to the global optimum.

Paper Structure

This paper contains 25 sections, 4 theorems, 15 equations, 10 figures, 1 table, 1 algorithm.

Key Result

Theorem 5.1

Let $\mathcal{X}\subset\mathbb{R}^d$ be compact, and at iteration $t$ let the GP posterior be $f_t\sim\mathcal{GP}(\mu_t,k_t)$ with $\sigma_t^2(\mathbf{x})=k_t(\mathbf{x},\mathbf{x})$ and $S_t=\sup_{\mathbf{x}\in\mathcal{X}}\sigma_t(\mathbf{x})<\infty$. Draw $K$ i.i.d. posterior sample paths $\{f_t^

Figures (10)

  • Figure 1: Illustration of Credit-Weighted UCB. Our goal is to maximize the objective function. (1) From the GP posterior at iteration $t$, we draw several sample paths; existing observations are shown as dots and the true optimum is marked by $\star$. (2) In this illustration, for $K=3$ posterior samples, we compute each sample's maximizer $x_t^{(j)}$ and maximum $Z_t^{(j)}$, and form the Monte Carlo proxy $Z_t$ of the global maximum. (3) For every observed point $x_i$, we assign a counterfactual credit $\ell_i$, normalize and propagate it, and obtain the weight $w_t(x)$. (4) The right-hand color bar shows the counterfactual credit. By incorporating counterfactual credit into the UCB (red), our method concentrates exploitation on high-contribution regions, yielding a next query that targets the true optimum more closely than the standard UCB (green).
  • Figure 2: Cumulative regret and simple regret versus iteration for eight benchmark functions.
  • Figure 3: Sequential optimization steps for standard BO (left) vs. CCGBO (right). Each row shows iterations 1 through 6 on a one-dimensional toy function. Blue curves and shaded regions: GP posterior mean and $95\%$ credible interval. Black dashed curve: true objective. Green dashed line: standard UCB (left) or credit-weighted UCB (right). Red dots: observed samples. Vertical dashed line: next evaluation location. Simple regret is reported below each subplot.
  • Figure 4: Cumulative regret and simple regret versus iteration for 8 benchmark functions on TS. Each subplot plots cumulative regret over iterations $t$, comparing the eight baselines.
  • Figure 5: Cumulative regret and simple regret versus iteration for 8 benchmark functions on LogEI. Each subplot plots cumulative regret over iterations $t$, comparing the eight baselines.
  • ...and 5 more figures

Theorems & Definitions (7)

  • Theorem 5.1
  • Remark 1
  • Theorem 5.2
  • Theorem B.1
  • proof
  • Theorem B.2
  • proof