A Hyper-Transformer model for Controllable Pareto Front Learning with Split Feasibility Constraints

Tran Anh Tuan, Nguyen Viet Dung, Tran Ngoc Thang

TL;DR

A hyper-transformer (Hyper-Trans) model is developed for CPFL with SFC. Universal approximation theory for sequence-to-sequence functions is used to show that Hyper-Trans achieves smaller MED errors than the Hyper-MLP model in computational experiments.

Abstract

Controllable Pareto front learning (CPFL) approximates the Pareto solution set and then locates a Pareto optimal solution with respect to a given reference vector. In practice, however, the decision-maker's objectives are limited to a constraint region, so instead of training on the entire decision space, we train only on the constraint region. Controllable Pareto front learning with Split Feasibility Constraints (SFC) finds the best Pareto solutions to a split multi-objective optimization problem subject to given constraints. In a previous study, CPFL used a hypernetwork model comprising multi-layer perceptron (Hyper-MLP) blocks. Given the substantial advances in transformer architectures, which can outperform other architectures in various tasks, we develop a hyper-transformer (Hyper-Trans) model for CPFL with SFC. We use universal approximation theory for sequence-to-sequence functions to show that the Hyper-Trans model achieves smaller MED errors than the Hyper-MLP model in computational experiments.

Paper Structure

This paper contains 29 sections, 11 theorems, 49 equations, 10 figures, 8 tables, and 2 algorithms.

Key Result

Proposition 2.1

$\mathbf{x}^*$ is a Pareto optimal solution to Problem MOP $\Leftrightarrow$ $\mathbf{x}^*$ is a Pareto stationary point.

Figures (10)

  • Figure 1: Left: Pareto Front Learning by Hypernetwork, which is used to approximate the entire Pareto front, including non-dominated solutions. Middle: Controllable Pareto Front Learning with Completed Scalarization Function uses a single Hypernetwork model, mapping any given preference vector to its corresponding solution on the Pareto front; these solutions may not be unique. Right: Controllable Disconnected Pareto Front Learning with Split Feasibility Constraints by a Robust Hypernetwork helps avoid non-dominated solutions.
  • Figure 2: Hyper-MLP (left) receives an input reference vector, while Hyper-Trans (right) receives each coordinate of the input reference vector and outputs the corresponding Pareto optimal solution.
  • Figure 3: Proposed Transformer-based Hypernetwork. Left: The Joint Input model takes reference vectors and the objective functions' lower bounds corresponding to each Pareto front component. Right: The integrated Mixture of Experts model, which takes reference vectors as input.
  • Figure 4: Comparison of multi-objective trajectories between Hyper-Trans and Hyper-MLP. The top panel shows the evolution of $\mathcal{F}(\mathbf{x})$ obtained by Hyper-Trans with $\mathbf{r}= [0.5,0.5]$ for 2 objectives and $\mathbf{r}= [0.4,0.3,0.3]$ for 3 objectives on the CVX1, CVX2, CVX3, ZDT1, ZDT2, and DTLZ2 problems (from left to right). The bottom panel shows the evolution of $\mathcal{F}(\mathbf{x})$ obtained by Hyper-MLP.
  • Figure 5: Left: The Pareto front approximated by the Joint Input model. Right: The Pareto front approximated by the Mixture of Experts model on the ZDT3 example (top), the ZDT3 variant example (middle), and the DTLZ7 example (bottom).
  • ...and 5 more figures
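
The input difference described in the Figure 2 caption can be illustrated with a rough sketch. Everything below is an assumption made for illustration, not the authors' architecture: the function names `hyper_mlp` and `hyper_trans`, the single attention head, the mean pooling, the random weights, and the dimensions `m`, `n`, `d` are all hypothetical. The only point taken from the caption is that the MLP variant consumes the reference vector as one input, while the transformer variant treats each coordinate as its own token.

```python
import numpy as np

rng = np.random.default_rng(0)

def hyper_mlp(r, W1, W2):
    """Hypothetical Hyper-MLP style map: r (shape (m,)) enters as one vector."""
    h = np.maximum(W1 @ r, 0.0)   # ReLU hidden layer, shape (d,)
    return W2 @ h                 # candidate solution x, shape (n,)

def hyper_trans(r, E, Wq, Wk, Wv, Wo):
    """Hypothetical Hyper-Trans style map: each coordinate r_i is one token."""
    tokens = r[:, None] * E       # (m, d): coordinate i scales embedding row i
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(q.shape[1])            # (m, m) attention logits
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)           # row-wise softmax
    mixed = attn @ v              # (m, d): tokens after self-attention
    return Wo @ mixed.mean(axis=0)  # pool over tokens, map to x of shape (n,)

# Illustrative sizes: m objectives, n decision variables, d hidden width.
m, n, d = 3, 5, 8
r = np.array([0.4, 0.3, 0.3])     # reference vector used in Figure 4 (3 objectives)

x_mlp = hyper_mlp(r, rng.normal(size=(d, m)), rng.normal(size=(n, d)))
x_trans = hyper_trans(
    r,
    rng.normal(size=(m, d)),
    *(rng.normal(size=(d, d)) for _ in range(3)),
    rng.normal(size=(n, d)),
)
print(x_mlp.shape, x_trans.shape)
```

Both maps return a vector of the same size; the weights are untrained, so the outputs here only demonstrate shapes and data flow.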

Theorems & Definitions (24)

  • Definition 2.1: Dominance
  • Definition 2.2: Pareto optimal solution
  • Definition 2.3: Weakly Pareto optimal solution
  • Definition 2.4: Pareto stationary
  • Definition 2.5: Pareto set and Pareto front
  • Proposition 2.1
  • Definition 2.6
  • Definition 2.7
  • Proposition 3.1
  • proof
  • ...and 14 more
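
The listing above gives Definitions 2.1, 2.2, and 2.5 by name only. As a reminder of the standard notions, here is a minimal sketch assuming every objective is minimized: a point dominates another if it is no worse in all objectives and strictly better in at least one, and the Pareto front of a finite sample is its set of non-dominated points. The helper names `dominates` and `pareto_front` and the sample data are illustrative and are not taken from the paper.

```python
import numpy as np

def dominates(fa, fb):
    """True if objective vector fa dominates fb (minimization assumed)."""
    fa, fb = np.asarray(fa), np.asarray(fb)
    return bool(np.all(fa <= fb) and np.any(fa < fb))

def pareto_front(F):
    """Return the non-dominated rows of F (one objective vector per row)."""
    F = np.asarray(F, dtype=float)
    keep = [
        i for i in range(len(F))
        if not any(dominates(F[j], F[i]) for j in range(len(F)) if j != i)
    ]
    return F[keep]

# Four sample objective vectors; [3, 3] is dominated by [2, 2].
F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
front = pareto_front(F)
print(front)
```

The pairwise check is quadratic in the number of points, which is fine for small samples like this one.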