Consecutive Preferential Bayesian Optimization
Aras Erarslan, Carlos Sevilla Salcedo, Ville Tanskanen, Anni Nisov, Eero Päiväkumpu, Heikki Aisala, Kaisu Honkapää, Arto Klami, Petrus Mikkola
TL;DR
Consecutive Preferential Bayesian Optimization (CPBO) extends preferential Bayesian optimization to settings where both producing and evaluating candidates incur costs, and where human feedback may be indeterminate because of perceptual limits. CPBO models preferences with a Random Utility Model that incorporates a learnable Just-Noticeable Difference (JND) threshold, and it uses an acquisition function based on max-value entropy search (MES), adapted to consecutive, reference-based comparisons. Both components are built on a Gaussian process (GP) surrogate fitted with variational inference. The method explicitly analyzes production versus evaluation costs, demonstrates improved performance under cost-balanced regimes, and shows robustness to indifference feedback on both synthetic benchmarks and a real-world high-moisture extrusion optimization task. This approach enables efficient human-in-the-loop optimization in practical settings where sample production is expensive and subtle differences are hard to discern, with broad applicability to R&D and product design tasks.
Abstract
Preferential Bayesian optimization allows optimization of objectives that are either expensive or difficult to measure directly, by relying on a minimal number of comparative evaluations done by a human expert. Generating candidate solutions for evaluation is also often expensive, but this cost is ignored by existing methods. We generalize preference-based optimization to explicitly account for production and evaluation costs with Consecutive Preferential Bayesian Optimization, reducing production cost by constraining comparisons to involve previously generated candidates. We also account for the perceptual ambiguity of the oracle providing the feedback by incorporating a Just-Noticeable Difference threshold into a probabilistic preference model to capture indifference to small utility differences. We adapt an information-theoretic acquisition strategy to this setting, selecting new configurations that are most informative about the unknown optimum under a preference model accounting for the perceptual ambiguity. We empirically demonstrate a notable increase in accuracy in setups with high production costs or with indifference feedback.
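To make the role of the Just-Noticeable Difference threshold concrete, the following is a minimal sketch of a three-outcome preference likelihood with an indifference band. It is an illustration under stated assumptions, not the model used in the paper: it assumes independent Gaussian noise on each latent utility (a probit-style Random Utility Model) and a fixed threshold, and the names preference_probs, u_new, u_ref, jnd, and noise_std are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def preference_probs(u_new: float, u_ref: float, jnd: float,
                     noise_std: float = 1.0) -> tuple[float, float, float]:
    """Return (P(new preferred), P(indifferent), P(reference preferred)).

    Assumption (not taken from the source): each utility is observed with
    independent Gaussian noise of standard deviation noise_std, and the
    oracle reports a strict preference only when the noisy utility
    difference exceeds the threshold jnd in absolute value.
    """
    if jnd < 0.0 or noise_std <= 0.0:
        raise ValueError("jnd must be >= 0 and noise_std must be > 0")
    # The difference of two independent noise terms has std sqrt(2) * noise_std.
    scale = math.sqrt(2.0) * noise_std
    diff = u_new - u_ref
    p_new = normal_cdf((diff - jnd) / scale)
    p_ref = normal_cdf((-diff - jnd) / scale)
    # Remaining mass is the indifference outcome; clamp tiny negative round-off.
    p_tie = max(0.0, 1.0 - p_new - p_ref)
    return p_new, p_tie, p_ref


if __name__ == "__main__":
    # Compare a new candidate against the previously produced reference.
    for jnd in (0.0, 0.5, 1.0):
        print(jnd, preference_probs(u_new=0.3, u_ref=0.0, jnd=jnd))
```

With jnd set to zero the indifference probability vanishes and the sketch reduces to an ordinary probit preference model. Increasing jnd moves probability mass from both strict-preference outcomes into the indifference outcome, which is the behavior a learnable threshold would have to capture.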
