Preference-Optimized Pareto Set Learning for Blackbox Optimization

Zhang Haishan; Diptesh Das; Koji Tsuda

TL;DR

This work addresses the challenge of learning the full Pareto set and Pareto front for blackbox multi-objective optimization by introducing PO-PSL, a bilevel framework that jointly learns a Pareto-set mapping and optimizes the input preference vectors so that they cover the Pareto front uniformly. It uses a differentiable optimization layer (DCEM) to estimate the gradient through the inner optimization and employs a penalty-enhanced loss to encourage diverse front coverage. Empirical results on synthetic benchmarks (e.g., ZDT3, DTLZ5) and a real-world rocket injector problem show that PO-PSL achieves faster convergence, better Pareto-front approximation, and greater sampling efficiency than state-of-the-art PSL methods while enabling flexible, real-time exploration of the Pareto front. The approach is practically relevant for design tasks where user preferences may vary and full-front exploration provides valuable insight, particularly in expensive BO settings where analytic objective forms are unavailable.

Abstract

Multi-Objective Optimization (MOO) is an important problem in real-world applications. However, for a non-trivial problem, no single solution exists that can optimize all the objectives simultaneously. In a typical MOO problem, the goal is to find a set of optimal solutions (the Pareto set) that trades off the preferences among objectives. Scalarization in MOO is a well-established method for finding a finite-set approximation of the whole Pareto set (PS). However, in real-world experimental design scenarios, it is beneficial to obtain the whole PS for flexible exploration of the design space. Recently, Pareto set learning (PSL) has been introduced to approximate the whole PS. PSL involves creating a manifold representing the Pareto front of a multi-objective optimization problem. A naive approach is to find discrete points on the Pareto front through randomly generated preference vectors and connect them by regression. However, this approach is computationally expensive and leads to a poor PS approximation. We propose to optimize the preference points so that they are distributed evenly on the Pareto front. Our formulation leads to a bilevel optimization problem that can be solved by, e.g., differentiable cross-entropy methods. We demonstrate the efficacy of our method for complex and difficult black-box MOO problems using both synthetic and real-world benchmark data.
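
To make the scalarization step mentioned in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It assumes a hypothetical toy bi-objective problem (`objectives`), a weighted Tchebycheff scalarization with an assumed ideal point at the origin, and a brute-force grid search standing in for a real blackbox optimizer. Each preference vector yields one point, so a handful of preferences gives only a finite-set approximation of the Pareto set.

```python
def objectives(x):
    """Hypothetical toy problem: minimize both f1 and f2 for x in [0, 2]."""
    return (x ** 2, (x - 2.0) ** 2)


def tchebycheff(f_values, preference, ideal=(0.0, 0.0)):
    """Weighted Tchebycheff scalarization: max_i lambda_i * |f_i - z_i|."""
    return max(lam * abs(f - z) for lam, f, z in zip(preference, f_values, ideal))


def solve_for_preference(preference, n_grid=2001):
    """Grid search over [0, 2]; an assumed stand-in for a real optimizer."""
    candidates = [2.0 * i / (n_grid - 1) for i in range(n_grid)]
    return min(candidates, key=lambda x: tchebycheff(objectives(x), preference))


def finite_pareto_approximation(n_prefs=5):
    """Solve one scalarized problem per evenly spaced preference vector."""
    prefs = [(k / (n_prefs - 1), 1.0 - k / (n_prefs - 1)) for k in range(n_prefs)]
    return [(p, solve_for_preference(p)) for p in prefs]


if __name__ == "__main__":
    for pref, x_star in finite_pareto_approximation():
        print(pref, x_star, objectives(x_star))
```

In this toy setting, evenly spaced preference vectors do not map to evenly spaced objective vectors, which is the kind of mismatch that motivates optimizing the preference vectors rather than sampling them at random.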

Paper Structure (31 sections, 2 theorems, 21 equations, 8 figures, 1 table, 1 algorithm)

Introduction
Code.
Problem Statement
Blackbox optimization (BO).
Multi-objective optimization (MOO).
Pareto estimation
Learning-based MOO.
Advantages of learning-based MOO.
Scalarization.
Pareto set learning (PSL).
Limitations of existing PSL methods.
Optimization-based modeling.
Proposed Method
Preference-optimized Pareto set learning (PO-PSL).
Zeroth-order gradient estimation.
...and 16 more sections

Key Result

Theorem 1

Define $w^\ast$ as in eq:w_estimate and $\Omega(\theta) = \Psi(w^\ast (\theta))$. Then, when all the derivatives exist,

Figures (8)

Figure 1: An intuitive illustration of the proposed PO-PSL. A uniformly distributed set of preference vectors (left) and the corresponding set model parameters (right) are dynamically selected in alternation. The dots on the orthogonal axes represent the coordinates of a set of reference points.
Figure 2: An intuitive illustration of the penalty term used in our loss function. This constructs a cone with $\mathbf{f}(w^\ast)$ as the vertex and a half-apex angle of $\frac{\pi}{4}$. We aim for the $w$ in the neighborhood to fall within this cone as much as possible.
Figure 3: RE5 (continuous PF). First row: PF approximation using different methods. Second row: Sampling efficiency using HVD and IGD. We select 10 random samples as the initial set, and add 5 new samples at each iteration.
Figure 4: ZDT3 (irregular PF). First row: PF approximation using different methods. Second row: Sampling efficiency using HVD and IGD. We select 20 random samples as the initial set, and add 10 new samples at each iteration.
Figure 5: DTLZ5 (degenerate PF). First row: PF approximation using different methods. Second row: Sampling efficiency using HVD and IGD. We select 20 random samples as the initial set, and add 10 new samples at each iteration.
...and 3 more figures
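
The captions above report sampling efficiency using HVD and IGD without defining them. As a rough sketch, assuming IGD is the usual inverted generational distance and HVD is the hypervolume difference between a reference front and an approximation (both for minimization), the two metrics can be computed as follows. The function names, the two-objective restriction of the hypervolume routine, and the reference point are illustrative assumptions, not taken from the paper.

```python
import math


def igd(reference_front, approximation):
    """Mean distance from each reference point to its nearest approximation point."""
    total = sum(min(math.dist(r, a) for a in approximation) for r in reference_front)
    return total / len(reference_front)


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (two objectives, minimization)."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(points):
        # Skip points outside the reference box or dominated by an earlier point.
        if f1 < ref[0] and f2 < prev_f2:
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv


def hvd_2d(reference_front, approximation, ref):
    """Hypervolume difference: how much dominated area the approximation misses."""
    return hypervolume_2d(reference_front, ref) - hypervolume_2d(approximation, ref)


if __name__ == "__main__":
    true_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
    found = [(2.0, 2.0)]
    print(igd(true_front, found), hvd_2d(true_front, found, ref=(4.0, 4.0)))
```

Lower values are better for both metrics under these definitions, and both reach zero when the approximation coincides with the reference front.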

Theorems & Definitions (6)

Definition 1: Pareto Dominance
Definition 2: Pareto Optimality
Definition 3: Pareto Set and Pareto Front
Theorem 1
proof
Theorem 2
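
Only the names of the definitions are listed here, so the following sketch uses the standard textbook notion of Pareto dominance, assuming all objectives are minimized. The paper's exact statement may differ in notation. The helper names `dominates` and `pareto_front` are illustrative.

```python
def dominates(fa, fb):
    """True if fa is no worse than fb in every objective and strictly better in one."""
    no_worse = all(a <= b for a, b in zip(fa, fb))
    strictly_better = any(a < b for a, b in zip(fa, fb))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the objective vectors that no other vector in `points` dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    evaluated = [(1, 4), (2, 2), (4, 1), (3, 3), (2, 5)]
    print(pareto_front(evaluated))
```

The Pareto set is then the collection of inputs whose objective vectors survive this filter, and the Pareto front is the image of that set in objective space.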
