Pareto Front Shape-Agnostic Pareto Set Learning in Multi-Objective Optimization

Rongguang Ye, Longcan Chen, Wei-Bin Kou, Jinyuan Zhang, Hisao Ishibuchi

TL;DR

This work addresses a limitation of preference-based Pareto Set Learning (PSL): sampling preference vectors effectively presupposes knowledge of the Pareto front shape. It reframes PSL as a distribution transformation problem and introduces Pareto front shape-agnostic PSL (GPSL), which learns the Pareto set by transforming an arbitrary input distribution π0 into the Pareto-set distribution π1, trained by maximizing the hypervolume Hv with respect to a reference point. An R2-based hypervolume approximation keeps the computation tractable, and two sampling schemes are explored: GPSL-G (Gaussian) and GPSL-L (Latin hypercube). On synthetic and real-world problems, GPSL-G and GPSL-L outperform existing PSL methods, particularly on irregular fronts, indicating robustness and faster convergence without front-shape priors.
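The two input distributions named above can be illustrated with a minimal NumPy sketch. The function names, the latent dimension, and the use of a standard normal and the unit hypercube are assumptions made here for illustration; the summary does not state the exact settings used in the paper.

```python
import numpy as np

def sample_gaussian(n, d, rng):
    """Draw n points from a standard d-dimensional Gaussian (GPSL-G style input)."""
    return rng.standard_normal((n, d))

def sample_latin_hypercube(n, d, rng):
    """Draw n points in [0, 1)^d with one point per stratum in every dimension
    (GPSL-L style input)."""
    # Row k starts in stratum [k/n, (k+1)/n) in every column.
    pts = (np.arange(n)[:, None] + rng.random((n, d))) / n
    # Shuffle each column independently so strata are paired at random.
    for j in range(d):
        pts[:, j] = pts[rng.permutation(n), j]
    return pts

rng = np.random.default_rng(0)
z_gauss = sample_gaussian(8, 3, rng)
z_lhs = sample_latin_hypercube(8, 3, rng)
```

Either batch would then be fed to the model in place of preference vectors; Latin hypercube sampling covers the input space more evenly for a small batch, which may be one reason to compare the two.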

Abstract

Pareto set learning (PSL) is an emerging approach for acquiring the complete Pareto set of a multi-objective optimization problem. Existing methods primarily rely on mapping preference vectors in the objective space to Pareto optimal solutions in the decision space. However, sampling preference vectors theoretically requires prior knowledge of the Pareto front shape to ensure high performance of the PSL methods. Designing a sampling strategy for preference vectors is difficult since the Pareto front shape cannot be known in advance. To make Pareto set learning work effectively for any Pareto front shape, we propose Pareto front shape-agnostic Pareto Set Learning (GPSL), which does not require prior information about the Pareto front. The fundamental concept behind GPSL is to treat the learning of the Pareto set as a distribution transformation problem. Specifically, GPSL can transform an arbitrary distribution into the Pareto set distribution. We demonstrate that training a neural network by maximizing hypervolume enables this distribution transformation. Our proposed method can handle any shape of the Pareto front and learn the Pareto set without requiring prior knowledge. Experimental results show the high performance of our proposed method on diverse test problems compared with recent Pareto set learning algorithms.
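One way to write the training goal described in the abstract is sketched below. The notation is an assumption made here, not taken from the paper's equations: $\bm{f}$ denotes the vector of objectives, $\phi_{\bm{\theta}}$ the neural network, $\bm{r}$ the reference point, and $\bm{z}_1, \dots, \bm{z}_N$ a batch drawn from the input distribution $\pi_0$.

$$
\max_{\bm{\theta}} \; \mathrm{Hv}_{\bm{r}}\Big( \big\{ \bm{f}\big(\phi_{\bm{\theta}}(\bm{z}_i)\big) \big\}_{i=1}^{N} \Big), \qquad \bm{z}_i \sim \pi_0 .
$$

Read this way, no preference vector appears in the objective: the batch as a whole is rewarded for covering the Pareto front, which is what would let $\phi_{\bm{\theta}}$ push $\pi_0$ toward the Pareto set distribution $\pi_1$.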

Paper Structure (15 sections, 2 theorems, 14 equations, 7 figures)

Key Result

Lemma 1

A solution $\bm{x} \in \mathcal{X}$ is weakly Pareto optimal if and only if there is a preference vector $\bm{p}>0$ such that $\bm{x}$ is an optimal solution of the minimization problem of the weighted Tchebycheff function.
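Lemma 1 can be checked numerically on a toy problem. The sketch below assumes the common form $g(\bm{x} \mid \bm{p}) = \max_i p_i \,(f_i(\bm{x}) - z_i^*)$ for the weighted Tchebycheff function, with $\bm{z}^*$ the ideal point; the bi-objective problem, the grid, and all function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def objectives(x):
    """Toy bi-objective problem (hypothetical): f1 = x^2, f2 = (x - 1)^2."""
    return np.stack([x ** 2, (x - 1.0) ** 2], axis=-1)

def weighted_tchebycheff(F, p, ideal):
    """Assumed form: g(x | p) = max_i p_i * (f_i(x) - z_i*)."""
    return np.max(p * (F - ideal), axis=-1)

def is_weakly_pareto_optimal(F, idx):
    """True if no candidate is strictly better than F[idx] in every objective."""
    return not np.any(np.all(F < F[idx], axis=1))

# Candidate solutions on a grid; only x in [0, 1] is Pareto optimal here.
xs = np.linspace(-1.0, 2.0, 3001)
F = objectives(xs)
ideal = np.zeros(2)  # both objectives attain 0 on the grid

minimizers = {}
for p in ([0.5, 0.5], [0.9, 0.1], [0.2, 0.8]):
    g = weighted_tchebycheff(F, np.asarray(p), ideal)
    minimizers[tuple(p)] = int(np.argmin(g))
```

Each positive preference vector selects a different grid point inside $[0, 1]$, and every selected point passes the weak-optimality check, which is the "if" direction of the lemma on this example.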

Figures (7)

  • Figure 1: Overview of preference-based PSL methods. Orange lines are contours of the weighted Tchebycheff loss function (introduced in the paper's section on preference-based PSL). The optimal solution obtained by preference-based PSL methods is the intersection of the Pareto front and the given preference vector.
  • Figure 2: Relation between the sampling distribution and the Pareto front shape. Fig. 2(a) shows that the preference sampling space is triangular. Preference-based PSL methods work well when the Pareto front is triangular (Fig. 2(b)) but do not match the Pareto front in Fig. 2(c). The proposed method does not require prior knowledge of the Pareto front shape and can approximate the Pareto set from any distribution, e.g., the 3D Gaussian distribution in Fig. 2(d).
  • Figure 3: The learning goal of Pareto front shape-agnostic Pareto set learning (GPSL). GPSL transforms an initial distribution (e.g., Gaussian distribution) to the Pareto set distribution. During training, the model $\phi_{\bm{\theta}}$ tries to find the mapping from each sampling point in the initial distribution to the Pareto optimal solution.
  • Figure 4: The process of model training. The yellow area is the exact hypervolume value. The sum of the lengths of the line segments is used to approximate the exact hypervolume of the yellow area.
  • Figure 5: The log HV difference on 12 different problems. The solid line and shaded area are the mean and standard deviation of 11 independent runs of each algorithm, respectively.
  • ...and 2 more figures
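The line-segment idea in the Figure 4 caption can be sketched as follows. This is a minimal NumPy version of the R2-based hypervolume approximation as it is commonly stated (average of the $m$-th power of segment lengths from the reference point along unit directions, scaled by a dimension-dependent constant), assuming minimization; the function names and the direction sampler are illustrative and may differ from the paper's implementation.

```python
import math
import numpy as np

def sample_directions(n, m, rng):
    """Unit vectors drawn uniformly from the positive orthant of the sphere in R^m."""
    v = np.abs(rng.standard_normal((n, m)))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def r2_hv_approx(F, ref, directions):
    """Approximate the hypervolume of objective vectors F (minimization) w.r.t. ref."""
    F = np.asarray(F, dtype=float)
    ref = np.asarray(ref, dtype=float)
    m = F.shape[1]
    # Distance from each point to the reference point per objective;
    # points that do not dominate ref are clamped and contribute nothing.
    gap = np.maximum(ref - F, 0.0)
    # ratios[k, i, j] = gap[i, j] / directions[k, j]
    ratios = gap[None, :, :] / np.maximum(directions[:, None, :], 1e-12)
    # Segment length along each direction: farthest reach over all points.
    seg = ratios.min(axis=2).max(axis=1)
    c_m = math.pi ** (m / 2) / (2 ** m * math.gamma(m / 2 + 1))
    return c_m * float(np.mean(seg ** m))

# 2-D check with evenly spaced directions (midpoint angles in (0, pi/2)).
angles = (np.arange(2000) + 0.5) * (math.pi / 2) / 2000
dirs_2d = np.stack([np.cos(angles), np.sin(angles)], axis=1)
hv_one = r2_hv_approx([[0.0, 0.0]], [1.0, 1.0], dirs_2d)
hv_two = r2_hv_approx([[0.0, 0.5], [0.5, 0.0]], [1.0, 1.0], dirs_2d)
```

Because the estimate is built from `min`, `max`, and a mean, it is differentiable almost everywhere with respect to the objective values, which is what makes this kind of approximation usable as a training loss in an autodiff framework.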

Theorems & Definitions (8)

  • Definition 1: Pareto Domination
  • Definition 2: Pareto optimal solution
  • Definition 3: Pareto Set and Pareto Front
  • Lemma 1
  • Definition 4: Distribution Transformation
  • Definition 5: Hypervolume
  • Proposition 2: Hypervolume maximum $\Longrightarrow$ Pareto set
  • Definition 6: R2-based hypervolume approximation (Shang et al., 2018)