Consecutive Preferential Bayesian Optimization

Aras Erarslan, Carlos Sevilla Salcedo, Ville Tanskanen, Anni Nisov, Eero Päiväkumpu, Heikki Aisala, Kaisu Honkapää, Arto Klami, Petrus Mikkola

TL;DR

Consecutive Preferential Bayesian Optimization (CPBO) extends preferential Bayesian optimization to settings where producing and evaluating candidates incur costs, and where human feedback may be indeterminate due to perceptual limits. CPBO models preferences with a Random Utility Model that incorporates a learnable Just-Noticeable Difference (JND) and uses an MES-based acquisition adapted to consecutive, reference-based comparisons, all on a GP surrogate with variational inference. The method explicitly analyzes production versus evaluation costs, demonstrates improved performance under cost-balanced regimes, and shows robustness to indifference in both synthetic benchmarks and a real-world high-moisture extrusion optimization task. This approach enables efficient, human-in-the-loop optimization in practical settings where sample production is expensive and subtle differences are hard to discern, with broad applicability to R&D and product design tasks.
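To make the role of the JND threshold concrete, the following is a minimal sketch of a three-outcome preference likelihood. It assumes a probit-style Random Utility Model with independent Gaussian noise on each utility; the function names, the `noise_std` parameter, and the exact functional form are illustrative assumptions, not taken from the paper.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def preference_probs(u_new, u_ref, jnd, noise_std=1.0):
    """Return (P[new preferred], P[reference preferred], P[indifferent]).

    Assumed model (hypothetical, for illustration only): each utility is
    perturbed by independent Gaussian noise with standard deviation
    noise_std, and the expert reports a preference only when the noisy
    utility difference exceeds the JND threshold `jnd`.
    """
    if jnd < 0:
        raise ValueError("jnd must be non-negative")
    scale = math.sqrt(2.0) * noise_std  # std of the difference of two noises
    diff = u_new - u_ref
    p_new = normal_cdf((diff - jnd) / scale)
    p_ref = normal_cdf((-diff - jnd) / scale)
    p_indiff = 1.0 - p_new - p_ref  # remaining mass: difference within the JND
    return p_new, p_ref, p_indiff


if __name__ == "__main__":
    for jnd in (0.0, 0.5, 1.0):
        p_new, p_ref, p_indiff = preference_probs(0.3, 0.0, jnd)
        print(f"jnd={jnd:.1f}  new={p_new:.3f}  ref={p_ref:.3f}  indiff={p_indiff:.3f}")
```

Under this assumed form, setting `jnd` to zero recovers an ordinary two-outcome probit comparison, and increasing it shifts probability mass toward the indifferent outcome.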

Abstract

Preferential Bayesian optimization allows optimization of objectives that are either expensive or difficult to measure directly, by relying on a minimal number of comparative evaluations done by a human expert. Generating candidate solutions for evaluation is also often expensive, but this cost is ignored by existing methods. We generalize preference-based optimization to explicitly account for production and evaluation costs with Consecutive Preferential Bayesian Optimization, reducing production cost by constraining comparisons to involve previously generated candidates. We also account for the perceptual ambiguity of the oracle providing the feedback by incorporating a Just-Noticeable Difference threshold into a probabilistic preference model to capture indifference to small utility differences. We adapt an information-theoretic acquisition strategy to this setting, selecting new configurations that are most informative about the unknown optimum under a preference model accounting for the perceptual ambiguity. We empirically demonstrate a notable increase in accuracy in setups with high production costs or with indifference feedback.
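The cost argument can be illustrated with simple budget arithmetic. The sketch below assumes an additive cost model in which every newly produced candidate costs `production_cost` and every comparison costs `evaluation_cost`, with a standard pairwise method producing two new candidates per comparison and a consecutive method producing one (after an initial reference). The function and this accounting are illustrative assumptions, not the paper's exact formulation.

```python
def comparisons_within_budget(budget, production_cost, evaluation_cost, consecutive):
    """Count the pairwise comparisons affordable under a fixed budget.

    Assumed cost model (hypothetical): each comparison costs evaluation_cost
    plus production_cost per newly produced candidate. A consecutive scheme
    reuses the previous candidate as the reference, so after the first
    comparison it produces one new candidate per step instead of two.
    """
    if production_cost < 0 or evaluation_cost < 0:
        raise ValueError("costs must be non-negative")
    if production_cost + evaluation_cost <= 0:
        raise ValueError("at least one cost must be positive")

    n_comparisons = 0
    spent = 0.0
    while True:
        # The very first comparison always needs two fresh candidates.
        new_candidates = 1 if (consecutive and n_comparisons > 0) else 2
        step_cost = new_candidates * production_cost + evaluation_cost
        if spent + step_cost > budget:
            return n_comparisons
        spent += step_cost
        n_comparisons += 1


if __name__ == "__main__":
    for c_prod, c_eval in [(0.0, 1.0), (1.0, 1.0), (4.0, 1.0)]:
        standard = comparisons_within_budget(100, c_prod, c_eval, consecutive=False)
        consec = comparisons_within_budget(100, c_prod, c_eval, consecutive=True)
        print(f"production={c_prod}, evaluation={c_eval}: "
              f"standard={standard}, consecutive={consec}")
```

Under these assumptions, the two schemes afford the same number of comparisons when production is free, and the consecutive scheme affords progressively more as the production cost grows relative to the evaluation cost.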

Paper Structure

This paper contains 55 sections, 15 equations, 11 figures, 12 tables.

Figures (11)

  • Figure 1: Consecutive Preferential BO (CPBO) operates with consecutive comparisons (A), in contrast to standard PBO, where two (or more) new candidates are proposed at each iteration (B). For each comparison we model how likely it is that the expert cannot tell the candidates apart; (C) shows this probability as a heatmap when a candidate is compared to candidate “3”, with the contour marking the Just-Noticeable Difference (JND) threshold. Under non-zero production costs and JND, CPBO improves both optimization speed (D) and the quality of the learned utility proxy (E) over standard PBO, here EUBO (Lin et al., 2022).
  • Figure 2: Left three panels: Convergence of PBO variants on Branin across three cost-balance scenarios. Right: The advantage of Consecutive over Standard under a fixed budget ($B=100$) for different cost ratios. Solid lines show the mean; shaded regions show the standard error of the mean.
  • Figure 3: Inference regret on Branin and Levy13 at iteration 30 for varying degrees of JND $\gamma_{\text{true}}$. Solid lines show the mean; shaded regions show the standard error of the mean.
  • Figure 4: (A): High-moisture extruder, producing a candidate $y$ based on current configuration $x$, for sensory comparison against the previous candidate. (B): Inference regret gap for three operators optimizing the configuration. (C): 2D slice of the operator's (Oracle 2) learned utility: points on or inside the red contour are predicted to be indifferent to the posterior maximizer ("x").
  • Figure S1: Convergence of PBO variants on Branin, for three cost-balance scenarios.
  • ...and 6 more figures