A Hyper-Transformer model for Controllable Pareto Front Learning with Split Feasibility Constraints

Tran Anh Tuan; Nguyen Viet Dung; Tran Ngoc Thang

TL;DR

The authors develop a hyper-transformer (Hyper-Trans) model for controllable Pareto front learning (CPFL) with split feasibility constraints (SFC). Using universal approximation theory for sequence-to-sequence functions, they show that the Hyper-Trans model achieves smaller MED errors than the Hyper-MLP model in computational experiments.

Abstract

Controllable Pareto front learning (CPFL) approximates the Pareto solution set and then locates a Pareto optimal solution with respect to a given reference vector. In practice, however, decision-maker objectives are limited to a constraint region, so instead of training on the entire decision space, we train only on the constraint region. Controllable Pareto front learning with Split Feasibility Constraints (SFC) finds the best Pareto solutions to a split multi-objective optimization problem that satisfy given constraints. In the previous study, CPFL used a Hypernetwork model comprising multi-layer perceptron (Hyper-MLP) blocks. Given the substantial advances of the transformer architecture in deep learning, transformers can outperform other architectures in various tasks. We therefore develop a hyper-transformer (Hyper-Trans) model for CPFL with SFC. Using universal approximation theory for sequence-to-sequence functions, we show that the Hyper-Trans model achieves smaller MED errors than the Hyper-MLP model in computational experiments.
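To make the idea of mapping a reference vector to a Pareto optimal solution concrete, the following is a minimal sketch on a toy problem. Everything in it is an assumption for illustration and not taken from the paper: the two objectives, the weighted-sum scalarization (a stand-in for whatever scalarization the paper uses), the grid search that replaces the trained hypernetwork, and the interval `[lo, hi]` that plays the role of a constraint region on the decision variable.

```python
from typing import List, Sequence


def objectives(x: float) -> List[float]:
    """Toy bi-objective problem (hypothetical): f1 = x^2, f2 = (x - 1)^2."""
    return [x ** 2, (x - 1.0) ** 2]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    strictly_better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and strictly_better


def solve_for_reference(r: Sequence[float], lo: float = 0.0, hi: float = 1.0,
                        steps: int = 1001) -> float:
    """Grid search for the x in [lo, hi] minimizing sum_i r_i * f_i(x).

    The interval [lo, hi] stands in for a constraint region; a learned
    model would replace this search with a single forward pass.
    """
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return min(grid, key=lambda x: sum(ri * fi for ri, fi in zip(r, objectives(x))))


# Different reference vectors select different trade-offs.
x_balanced = solve_for_reference([0.5, 0.5])
x_favor_f1 = solve_for_reference([0.8, 0.2])
# Restricting the search region changes the selected solution.
x_restricted = solve_for_reference([0.5, 0.5], lo=0.6, hi=0.9)
```

In this toy setting the weighted sum has the closed-form minimizer `r[1] / (r[0] + r[1])` on the unrestricted interval, which gives a quick sanity check on the search; restricting the interval pushes the solution to the nearest feasible point.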

Paper Structure

This paper contains 29 sections, 11 theorems, 49 equations, 10 figures, 8 tables, 2 algorithms.

Introduction
Preliminaries
Multi-objective Optimization problem with Split Feasibility Constraints
Split Multi-objective Optimization Problem
Optimizing over the solution set of Problem (SMOP)
Controllable Pareto Front Learning with Split Feasibility Constraints
Hypernetwork-Based Multilayer Perceptron
Hypernetwork-Based Transformer block
Solution Constraint layer
Learning Disconnected Pareto Front with Hyper-Transformer network
Hyper-Transformer with Joint Input
Hyper-Transformer with Mixture of Experts
Computational experiments
Evaluation metrics
Synthesis experiments
...and 14 more sections

Key Result

Proposition 2.1

$\mathbf{x}^*$ is a Pareto optimal solution to Problem (MOP) $\Leftrightarrow$ $\mathbf{x}^*$ is a Pareto stationary point.
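As an illustration of what Pareto stationarity means in the simplest case, the sketch below checks it for two objectives. It assumes the standard definition (some convex combination of the objective gradients vanishes), which may differ in detail from the paper's Definition 2.4. The toy objectives $f_1(x) = x^2$ and $f_2(x) = (x-1)^2$, the function names, and the tolerance are hypothetical choices made here.

```python
import math
from typing import List, Sequence, Tuple


def min_norm_two(g1: Sequence[float], g2: Sequence[float]) -> Tuple[float, float]:
    """Minimize ||lam * g1 + (1 - lam) * g2|| over lam in [0, 1].

    Returns the minimizing lam and the attained norm.
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        lam = 0.5  # identical gradients: any lam gives the same vector
    else:
        # Unconstrained minimizer of ||g2 + lam * (g1 - g2)||^2, then clipped.
        lam = -sum(b * d for b, d in zip(g2, diff)) / denom
        lam = min(1.0, max(0.0, lam))
    combo = [lam * a + (1.0 - lam) * b for a, b in zip(g1, g2)]
    return lam, math.sqrt(sum(c * c for c in combo))


def toy_gradients(x: float) -> Tuple[List[float], List[float]]:
    """Gradients of the hypothetical objectives f1 = x^2 and f2 = (x - 1)^2."""
    return [2.0 * x], [2.0 * (x - 1.0)]


def is_pareto_stationary(x: float, tol: float = 1e-9) -> bool:
    """True if some convex combination of the two gradients is (nearly) zero."""
    g1, g2 = toy_gradients(x)
    _, norm = min_norm_two(g1, g2)
    return norm <= tol


inside = is_pareto_stationary(0.3)   # between the two individual minimizers
outside = is_pareto_stationary(1.5)  # both gradients point the same way
```

For these convex toy objectives the stationary points are exactly the interval between the two individual minimizers, which coincides with the Pareto set of the toy problem.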

Figures (10)

Figure 1: Left: Pareto Front Learning by Hypernetwork, which is used to approximate the entire Pareto front, including non-dominated solutions. Middle: Controllable Pareto Front Learning with Completed Scalarization Function uses a single Hypernetwork model, mapping any given preference vector to its corresponding solution on the Pareto front; these solutions may not be unique. Right: Controllable Disconnected Pareto Front Learning with Split Feasibility Constraints by a Robust Hypernetwork helps avoid non-dominated solutions.
Figure 2: Hyper-MLP (left) receives an input reference vector; Hyper-Trans (right) receives each coordinate of the input reference vector and outputs the corresponding Pareto optimal solution.
Figure 3: Proposed Transformer-based Hypernetwork. Left: The Joint Input model takes as input the reference vectors and the lower bounds of the objective functions corresponding to each Pareto front component. Right: The Mixture of Experts model, which takes reference vectors as input.
Figure 4: Comparison of multi-objective trajectories between Hyper-Trans and Hyper-MLP. The top panel shows the evolution of $\mathcal{F}(\mathbf{x})$ obtained by Hyper-Trans with $\mathbf{r}= [0.5,0.5]$ for 2 objectives and $\mathbf{r}= [0.4,0.3,0.3]$ for 3 objectives on the CVX1, CVX2, CVX3, ZDT1, ZDT2, and DTLZ2 problems (from left to right). The bottom panel shows the evolution of $\mathcal{F}(\mathbf{x})$ obtained by Hyper-MLP.
Figure 5: Left: Pareto Front approximated by the Joint Input model. Right: Pareto Front approximated by the Mixture of Experts model in example ZDT3 (top), example ZDT3_variant (middle), and example DTLZ7 (bottom).
...and 5 more figures

Theorems & Definitions (24)

Definition 2.1: Dominance
Definition 2.2: Pareto optimal solution
Definition 2.3: Weakly Pareto optimal solution
Definition 2.4: Pareto stationary
Definition 2.5: Pareto set and Pareto front
Proposition 2.1
Definition 2.6
Definition 2.7
Proposition 3.1
proof
...and 14 more
