Looping in the Human: Collaborative and Explainable Bayesian Optimization

Masaki Adachi; Brady Planden; David A. Howey; Michael A. Osborne; Sebastian Orbell; Natalia Ares; Krikamol Muandet; Siu Lun Chau

TL;DR

CoExBO addresses the challenge of trustworthy, human-centric Bayesian optimization by combining preference learning with explainability. It avoids requiring a fixed, explicit human knowledge model by learning from pairwise preferences and communicates its reasoning through Shapley-based explanations, while preserving a no-harm guarantee that ensures convergence to vanilla BO as data accumulate. The method leverages a product of Gaussian processes to fuse the objective surrogate with a learned human belief, and decays the human contribution over time to maintain robust performance. Empirical results in lithium-ion battery design and synthetic benchmarks show accelerated convergence and improved robustness when human feedback is informative, with explainability further enhancing user trust and decision quality.
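The fusion-and-decay idea in the summary above can be illustrated with a minimal pointwise sketch. It is not the paper's acquisition function: the names `product_of_gaussians`, `fused_ucb`, and `vanilla_ucb` are hypothetical, and so are the decay schedule (inflating the human-belief variance by `gamma * t**2`) and the UCB weight `beta`. The sketch also assumes the human belief has already been rescaled to the scale of the objective. It shows only that a product of two Gaussians weights each source by its precision, so a belief whose variance grows with the iteration count gradually stops influencing the score.

```python
import math


def product_of_gaussians(mu_f, var_f, mu_g, var_g):
    """Mean and variance of the normalized product of two 1-D Gaussian densities."""
    precision = 1.0 / var_f + 1.0 / var_g  # precisions add under a product
    var = 1.0 / precision
    mu = var * (mu_f / var_f + mu_g / var_g)  # precision-weighted mean
    return mu, var


def vanilla_ucb(mu_f, var_f, beta=4.0):
    """Standard upper confidence bound on the objective surrogate alone."""
    return mu_f + math.sqrt(beta * var_f)


def fused_ucb(mu_f, var_f, mu_g, var_g, t, gamma=0.1, beta=4.0):
    """UCB on the fused belief at iteration t.

    Assumption (illustrative only): the human belief is discounted by
    inflating its variance with gamma * t**2, so its precision vanishes
    as t grows and the score approaches vanilla_ucb.
    """
    var_g_t = var_g + gamma * t ** 2
    mu, var = product_of_gaussians(mu_f, var_f, mu_g, var_g_t)
    return mu + math.sqrt(beta * var)


if __name__ == "__main__":
    # Surrogate says N(0, 1); a confident human belief says N(2, 0.5).
    for t in (0, 10, 1000):
        gap = fused_ucb(0.0, 1.0, 2.0, 0.5, t) - vanilla_ucb(0.0, 1.0)
        print(f"t={t:5d}  fused - vanilla = {gap:.5f}")
```

Running the script prints a gap that shrinks toward zero as `t` increases. This mirrors the no-harm behaviour described in the summary: a misleading belief can bias early suggestions but not the asymptotic ones.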

Abstract

Like many optimizers, Bayesian optimization often falls short of gaining user trust due to opacity. While attempts have been made to develop human-centric optimizers, they typically assume user knowledge is well-specified and error-free, employing users mainly as supervisors of the optimization process. We relax these assumptions and propose a more balanced human-AI partnership with our Collaborative and Explainable Bayesian Optimization (CoExBO) framework. Instead of explicitly requiring a user to provide a knowledge model, CoExBO employs preference learning to seamlessly integrate human insights into the optimization, resulting in algorithmic suggestions that resonate with user preference. CoExBO explains its candidate selection every iteration to foster trust, empowering users with a clearer grasp of the optimization. Furthermore, CoExBO offers a no-harm guarantee, allowing users to make mistakes; even with extreme adversarial interventions, the algorithm converges asymptotically to a vanilla Bayesian optimization. We validate CoExBO's efficacy through human-AI teaming experiments in lithium-ion battery design, highlighting substantial improvements over conventional methods. Code is available at https://github.com/ma921/CoExBO.

Paper Structure (42 sections, 7 theorems, 42 equations, 11 figures, 1 table)

Introduction
Bayesian optimization and existing human-in-the-loop extensions
Collaborative and Explainable BO
Model human knowledge through preference learning
Candidate generation with no-harm guarantee
Explaining candidate generation through Shapley values
Experiments
Synthetic Functions with Synthetic Human Selection
Real-World Tasks with Human Experts
Discussions and Limitations
Appendix
Proof of theorem
Regret analysis of normal UCB policy
Proof of Regrets
Proof of good user belief regrets
...and 27 more sections

Key Result

Proposition 1

Given $f_t(x)\sim {\mathcal{N}}(\mu_{f_t}(x), \kappa_{f_t}(x,x))$, $\hat{\pi}_{g_t}(x)\sim {\mathcal{N}}(\mu_{g_t}(x), \sigma_{g_t}^2(x))$, a scaling function $\rho$ that maps $\hat{\pi}_{g_t}$ to the scale of $f_t$, and $\gamma > 0$, our new acquisition function $\alpha_{f, \pi}$ takes the following form, where ...

Figures (11)

Figure 1: In Collaborative and Explainable Bayesian Optimization (CoExBO), a human expert collaborates with BO to refine electrolyte materials. While experts excel in discerning material differences rather than identifying the best one, pairwise comparisons and explanations boost their feedback accuracy and trust. This guides the BO to produce better candidates, ensuring quicker convergence.
Figure 2: Explanation flow: Spatial relation: BO visualizes the surrogate model's predictive distribution and estimated human preference models for the two primary dimensions determined by Shapley values. Feature importance: Users' values are provided for both candidates' predictive mean, standard deviation, and acquisition function. Selection accuracy feedback: After observing the function value, a post-hoc evaluation of the correct selection probability is given.
Figure 3: Preference learning concepts: we aim to model the ordinal relationship $f(x) < f(x^\prime)$ and its inverse with a GP, utilizing the dataset $D_\text{pref}^{t_0}$, represented by white dots. The soft-Copeland score is used as a proxy for the true function estimate.
Figure 4: The CoExBO AF synthesizes GPs, utilizing one GP to represent the true function (red) and another to reflect user belief (blue), leading to the product GP (green). This product GP effectively assimilates the uncertainty inherent in user belief, adjusting the level of user belief integration during the acquisition process. Whereas the CoExBO AF is designed to adaptively manage the integration of uncertain user beliefs, the $\pi$BO approach tends to excessively depend on user belief, overlooking the uncertainty.
Figure 5: Convergence plot of simple regret for 5 synthetic functions with the synthetic selection accuracy ($\epsilon_\text{pref} := \mathcal{N}(0, 0.1^2)$). Lines and shaded area denote mean $\pm$ 1 standard error. CoExBO consistently outperforms all six baselines except for the Rosenbrock function. The dark red region is the global maximum.
...and 6 more figures

Theorems & Definitions (7)

Proposition 1
Theorem 2
Lemma 3
Proposition 4
Lemma 5
Lemma 6
Lemma 7
