Mixture-Model Preference Learning for Many-Objective Bayesian Optimization

Manisha Dubey, Sebastiaan De Peuter, Wanrong Wang, Samuel Kaski

Abstract

Preference-based many-objective optimization faces two obstacles: an expanding space of trade-offs and heterogeneous, context-dependent human value structures. To address both, we propose a Bayesian framework that learns a small set of latent preference archetypes rather than assuming a single fixed utility function, modelling them as components of a Dirichlet-process mixture with uncertainty over both the archetypes and their weights. To query efficiently, we design hybrid queries that target information about (i) mode identity and (ii) within-mode trade-offs. Under mild assumptions, we provide a simple regret guarantee for the resulting mixture-aware Bayesian optimization procedure. Empirically, our method outperforms standard baselines on synthetic and real-world many-objective benchmarks, and mixture-aware diagnostics reveal structure that regret alone fails to capture.
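
To make the modelling idea concrete, the following minimal Python sketch draws archetype weights from a truncated stick-breaking construction of the Dirichlet process and combines Chebyshev archetypes into a single mixture utility. This is not the authors' code: `K_MAX`, `ALPHA`, and the exact Chebyshev form are illustrative assumptions.

```python
# Minimal sketch of the mixture-utility idea from the abstract (illustrative
# assumptions throughout; not the authors' implementation).
import numpy as np

K_MAX = 10    # truncation level for the stick-breaking construction (assumed)
ALPHA = 1.0   # Dirichlet-process concentration parameter (assumed)

rng = np.random.default_rng(0)

def stick_breaking_weights(alpha: float, k_max: int) -> np.ndarray:
    """Draw truncated stick-breaking weights w ~ GEM(alpha)."""
    betas = rng.beta(1.0, alpha, size=k_max)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    w = betas * remaining
    return w / w.sum()  # renormalize the truncated weights

def chebyshev_utility(f_x: np.ndarray, eta: np.ndarray) -> float:
    """Chebyshev utility for minimization: higher is better."""
    return -float(np.max(eta * f_x))

def mixture_utility(f_x: np.ndarray, etas: np.ndarray, w: np.ndarray) -> float:
    """Utility of a point under the mixture of preference archetypes."""
    return sum(w_k * chebyshev_utility(f_x, eta_k) for w_k, eta_k in zip(w, etas))

# Toy example: L = 3 objectives, archetypes drawn uniformly on the simplex.
L = 3
w = stick_breaking_weights(ALPHA, K_MAX)
etas = rng.dirichlet(np.ones(L), size=K_MAX)
f_x = np.array([0.2, 0.5, 0.3])  # objective values f(x) at some design x
print(mixture_utility(f_x, etas, w))
```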

Paper Structure

This paper contains 21 sections, 1 theorem, 20 equations, 5 figures, and 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{X}$ be compact, let $\mathbf f = (f_1,\dots,f_L):\mathcal{X}\to\mathbb R^L$ be an $L$-objective function, and let the paper's mild assumptions hold. Define the Chebyshev utility for minimization

$$u(x;\eta) = -\max_{\ell\in[L]} \eta_\ell\, f_\ell(x),$$

and the mixture utility

$$U(x) = \sum_{k} w_k^\star\, u(x;\eta_k^\star),$$

where $(\eta_k^\star, w_k^\star)$ denote the true mixture parameters. After $T$ evaluations $x_1,\dots,x_T$, define the simple regret

$$r_T = \max_{x\in\mathcal X} U(x) - \max_{1\le t\le T} U(x_t).$$

Then, with probability at least $1-\delta$, …
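
The bound itself is truncated in this extract, but the quantities it is stated in terms of are concrete. The sketch below (toy values and assumed forms, not the paper's code) evaluates the mixture utility $U$ over a finite candidate set standing in for the compact $\mathcal X$ and computes the simple regret $r_T$ after $T$ evaluations; the archetypes `etas` and weights `w` are made up.

```python
# Numerical illustration of Theorem 1's quantities (toy values; assumed forms).
import numpy as np

rng = np.random.default_rng(1)

def mixture_utility(f_x, etas, w):
    """U(x) = sum_k w*_k * (-max_l eta*_{k,l} f_l(x))."""
    return float(np.sum(w * -np.max(etas * f_x, axis=1)))

# Finite candidate grid standing in for the compact domain, with L = 2 objectives.
F = rng.random((500, 2))                   # objective vectors f(x) per candidate
etas = np.array([[0.8, 0.2], [0.3, 0.7]])  # true archetypes eta*_k (made up)
w = np.array([0.6, 0.4])                   # true weights w*_k (made up)

utilities = np.array([mixture_utility(f_x, etas, w) for f_x in F])
best = utilities.max()                                  # max_x U(x)
evaluated = rng.choice(len(F), size=20, replace=False)  # x_1, ..., x_T with T = 20
r_T = best - utilities[evaluated].max()                 # simple regret after T steps
print(f"simple regret r_T = {r_T:.4f}")
```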

Figures (5)

  • Figure 1: (Top) Mixture-aware policies reduce regret faster and achieve lower final regret than unimodal and random-scalarization baselines, with Hybrid performing best overall. This is visible from the steeper early decline and lower terminal curves of Inter and Hybrid across DTLZ2, WFG, and PET, while Clusterless and scalarization methods plateau higher with greater variability. Curves show mean simple regret over three independent runs; vertical bars denote $\pm$ one standard error. (Bottom) Hybrid most accurately and stably recovers the true archetypes, showing the largest early drop and lowest final error with the narrowest band; Inter drops quickly but plateaus, Intra refines slowly with higher variance, and Clusterless remains flat. We report mean aligned $L_1$ error per outer iteration (band = per-true-archetype min–max; lower is better). Dataset: DTLZ2, persistent context.
  • Figure 2: Mixture weight trajectory for the PET production process. Clusterless (left) collapses to a dominant component, while Hybrid (right) quickly identifies and stabilizes near the correct mode proportions.
  • Figure 3: $L_1$ error trajectories per inferred component for a persistent user on DTLZ. After Hungarian alignment (lower is better; see the alignment sketch after this list), Inter queries prevent mode collapse and rapidly reduce error for at least one archetype, while Clusterless largely collapses, with only one component improving and others remaining high.
  • Figure 4: Inferred preference errors under an i.i.d. user (DTLZ2); intra-queries refine the dominant archetype with small fluctuations from mode switches, while inter-queries rapidly reduce error for the dominant cluster but learn minority modes more slowly.
  • Figure 5: Pareto coverage in objective space. Grey points: reference Pareto front; colored points: evaluated solutions; stars: final best-utility solution under the true mixture preference. Efficient trade-offs lie along the grey frontier. Hybrid achieves broader frontier coverage and identifies final solutions closest to the mixture-preferred Pareto region.
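
The aligned $L_1$ diagnostic referenced in Figures 1 and 3 can be reconstructed with a Hungarian matching between inferred and true archetypes. The Python sketch below is an assumed reading of that metric (via `scipy.optimize.linear_sum_assignment`), not the authors' implementation.

```python
# Sketch of the "Hungarian alignment" diagnostic (assumed details): match each
# inferred archetype to a true one by minimizing total pairwise L1 distance,
# then report the per-true-archetype L1 error under that matching.
import numpy as np
from scipy.optimize import linear_sum_assignment

def aligned_l1_errors(true_etas: np.ndarray, inferred_etas: np.ndarray) -> np.ndarray:
    """Per-true-archetype L1 error after optimal one-to-one matching."""
    # cost[i, j] = ||true_i - inferred_j||_1
    cost = np.abs(true_etas[:, None, :] - inferred_etas[None, :, :]).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)  # Hungarian matching
    return cost[rows, cols]

# Toy check: inferred archetypes are a permuted, noisy copy of the true ones.
true_etas = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
inferred  = np.array([[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]])
print(aligned_l1_errors(true_etas, inferred))  # small errors once aligned
```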

Theorems & Definitions (1)

  • Theorem 1: Simple regret decomposition