Counterfactual Credit Guided Bayesian Optimization
Qiyu Wei, Haowei Wang, Richard Allmendinger, Mauricio A. Álvarez
TL;DR
The paper addresses the slow convergence of conventional Bayesian optimization to the global optimum by introducing Counterfactual Credit Guided Bayesian Optimization (CCGBO). CCGBO assigns per-sample credits via counterfactual reasoning grounded in the GP posterior: it computes a Monte Carlo proxy $Z_t$ for the current optimum and derives credits $c_i$ that reflect each observation's downstream impact. These credits are propagated to continuous candidates and used to weight the acquisition function, yielding $\mathrm{UCB}_{\text{credit}}(x)$, in which the influence of the credit is controllable and decays over time. The authors prove that the method preserves sublinear regret, demonstrate robustness without requiring priors, and show empirically that it accelerates convergence and improves simple regret across synthetic and real-world benchmarks, while remaining compatible with multiple acquisition strategies.
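The pipeline summarized above (Monte Carlo proxy, per-sample credits, propagation to candidates, decaying credit term) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the summary does not give the exact formulas, so the leave-one-out definition of $c_i$, the kernel smoothing used to propagate credits, the additive form of the credit bonus, and the schedule `lam0 / (1 + t)` are all assumptions made for illustration, as are the function names and hyperparameters.

```python
import numpy as np

def rbf(A, B, ls=0.2):
    """Squared-exponential kernel between row-wise point sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """Zero-mean GP posterior mean and covariance at query points Xs."""
    L = np.linalg.cholesky(rbf(X, X) + noise * np.eye(len(X)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf(X, Xs)
    V = np.linalg.solve(L, Ks)
    return Ks.T @ alpha, rbf(Xs, Xs) - V.T @ V

def optimum_proxy(X, y, grid, n_draws=64, seed=0):
    """Monte Carlo proxy Z: average maximum of posterior draws on a grid."""
    mu, cov = gp_posterior(X, y, grid)
    rng = np.random.default_rng(seed)  # fixed seed: common random numbers
    draws = rng.multivariate_normal(
        mu, cov + 1e-8 * np.eye(len(grid)), size=n_draws, check_valid="ignore"
    )
    return draws.max(axis=1).mean()

def counterfactual_credits(X, y, grid):
    """ASSUMPTION: c_i = Z(all data) - Z(data without i), min-max scaled to [0, 1]."""
    z_full = optimum_proxy(X, y, grid)
    keep_masks = ~np.eye(len(X), dtype=bool)  # row i drops observation i
    raw = np.array([z_full - optimum_proxy(X[m], y[m], grid) for m in keep_masks])
    span = raw.max() - raw.min()
    return (raw - raw.min()) / span if span > 0 else np.zeros_like(raw)

def ucb_credit(X, y, grid, credits, t, beta=2.0, lam0=1.0):
    """ASSUMPTION: UCB plus a kernel-smoothed credit bonus that decays with t."""
    mu, cov = gp_posterior(X, y, grid)
    sigma = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    credit_field = rbf(grid, X) @ credits  # propagate credits to candidates
    lam_t = lam0 / (1.0 + t)               # hypothetical decay schedule
    return mu + beta * sigma + lam_t * credit_field

# Toy usage on a 1-D objective with its maximum at x = 0.7.
X = np.array([[0.1], [0.3], [0.5], [0.7], [0.9]])
y = -((X[:, 0] - 0.7) ** 2)
grid = np.linspace(0.0, 1.0, 50)[:, None]
credits = counterfactual_credits(X, y, grid)
next_x = grid[np.argmax(ucb_credit(X, y, grid, credits, t=0))]
```

Because `lam_t` shrinks toward zero, the sketch falls back to plain UCB for large `t`, which mirrors the summary's statement that the credit influence decays over time.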
Abstract
Bayesian optimization has emerged as a prominent methodology for optimizing expensive black-box functions by leveraging Gaussian process surrogates, which focus on capturing the global characteristics of the objective function. However, in numerous practical scenarios, the primary objective is not to construct an exhaustive global surrogate, but rather to quickly pinpoint the global optimum. Due to the aleatoric nature of the sequential optimization problem and its dependence on the quality of the surrogate model and the initial design, it is restrictive to assume that all observed samples contribute equally to the discovery of the optimum in this context. In this paper, we introduce Counterfactual Credit Guided Bayesian Optimization (CCGBO), a novel framework that explicitly quantifies the contribution of individual historical observations through counterfactual credit. By incorporating counterfactual credit into the acquisition function, our approach can selectively allocate resources in areas where optimal solutions are most likely to occur. We prove that CCGBO retains sublinear regret. Empirical evaluations on various synthetic and real-world benchmarks demonstrate that CCGBO consistently reduces simple regret and accelerates convergence to the global optimum.
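For reference, the two regret notions mentioned in the abstract are usually defined as follows. The block uses the standard textbook definitions and assumes a maximization problem with optimum $x^\ast$ and queries $x_1, \dots, x_T$; the symbols $r_T$ and $R_T$ are notation introduced here, and the paper's own conventions may differ.

```latex
% Simple regret: gap between the optimum and the best query so far.
r_T = f(x^\ast) - \max_{1 \le t \le T} f(x_t)

% Cumulative regret: sum of the per-step gaps.
R_T = \sum_{t=1}^{T} \bigl( f(x^\ast) - f(x_t) \bigr)

% Sublinear regret means the average gap vanishes, which also
% forces the simple regret to zero because r_T \le R_T / T.
\lim_{T \to \infty} \frac{R_T}{T} = 0
```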
