
Enhancing Generalization and Scalability for Multi-Objective Optimization with Population Pre-Training

Haokai Hong, Liang Feng, Min Jiang, Kay Chen Tan

TL;DR

This paper develops a population transformer architecture that embeds decision spaces of varying scales into a common latent space, enabling knowledge transfer across diverse problems. The architecture also integrates objective-space features through objective fusion to improve population prediction accuracy for complex MOPs.

Abstract

Multi-objective optimization problems (MOPs) require the simultaneous optimization of conflicting objectives. Real-world MOPs often exhibit complex characteristics, including high-dimensional decision spaces, many objectives, or computationally expensive evaluations. While population-based evolutionary computation has shown promise in addressing diverse MOPs through problem-specific adaptations, existing approaches frequently lack generalizability across distinct problem classes. Inspired by pre-training paradigms in machine learning, we propose a Population Pre-trained Model (PPM) that leverages historical optimization knowledge to efficiently solve complex MOPs within a unified framework. PPM models evolutionary patterns via population modeling, addressing two key challenges: (1) handling diverse decision spaces across problems and (2) capturing the interdependency between objective and decision spaces during evolution. To this end, we develop a population transformer architecture that embeds decision spaces of varying scales into a common latent space, enabling knowledge transfer across diverse problems. Furthermore, our architecture integrates objective-space features through objective fusion to enhance population prediction accuracy for complex MOPs. Our approach achieves robust generalization to downstream optimization tasks with up to 5,000 dimensions, which is five times the training scale and 200 times greater than prior work. Extensive evaluations on standardized benchmarks and out-of-training real-world applications demonstrate the consistent superiority of our method over state-of-the-art algorithms tailored to specific problem classes, improving the performance and generalization of evolutionary computation in solving MOPs.
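
The idea of embedding decision spaces of different sizes into a common latent space can be illustrated with a minimal sketch. It assumes, purely for illustration, that each decision variable becomes one token of fixed width, formed from its value plus a sinusoidal code of its relative position; the function name `embed_population`, the `width` parameter, and the encoding itself are hypothetical and are not taken from the paper, whose actual embedding may differ.

```python
import math

def embed_population(population, width=8):
    """Map each solution (a list of floats of any length n) to n tokens of `width` floats.

    Illustrative assumption: token j = variable value + sinusoidal code of the
    relative position j / n. This is NOT the paper's embedding equation.
    """
    embedded = []
    for x in population:
        n = len(x)
        tokens = []
        for j, value in enumerate(x):
            pos = j / n  # relative index in [0, 1), comparable across problem sizes
            token = []
            for k in range(width):
                freq = 2.0 ** (k // 2)
                angle = freq * math.pi * pos
                code = math.sin(angle) if k % 2 == 0 else math.cos(angle)
                token.append(value + code)
            tokens.append(token)
        embedded.append(tokens)
    return embedded

# A 3-variable and a 10-variable solution yield tokens of the same width,
# so a single sequence model could in principle consume both.
small, large = embed_population([[0.1, 0.2, 0.3], [0.5] * 10])
print(len(small), len(small[0]))
print(len(large), len(large[0]))
```

Under this assumption only the number of tokens depends on the problem dimension, while the token width stays fixed, which is what lets one model be applied to problems of different scales.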

Paper Structure

This paper contains 46 sections, 5 equations, 13 figures, 10 tables, and 1 algorithm.

Figures (13)

  • Figure 1: Illustration of the proposed population pre-trained model (PPM) and its use in solving complex MOPs (Initialization - Evaluation - Reproducing - Selection - Fine-tuning). PPM can be regarded as a replacement for the process of generating a new population, which generally involves SBX, mutation, particle updates, etc.
  • Figure 2: Illustration of the proposed population pre-trained model: dataset, pre-training, and design rationale. (a) The pre-training dataset is constructed from solution pairs generated by existing MOEAs across diverse MOPs. (b) The training objective of the PPM is to predict, from the current population, a next-generation population with better convergence and diversity. (c) The self-attention mechanism inherent in PPM, shown by simulating the attention outcomes during the generation of the solution $\boldsymbol{x}^{g+1}_{1}$. The colors of the circles and lines correspond to the magnitude of the attention score, with darker shades indicating larger scores.
  • Figure 3: (a) The framework of the proposed Population Transformer, an instantiation of PPM, operates as follows: 1. Collect the population $\mathbf{X}^{g} = \{\boldsymbol{x}^{g}\}$ at generation $g$. 2. Embed the decision variable dimensions of $\boldsymbol{x}^g$ using Eq. (\ref{equ: dim-embed}) to obtain $\boldsymbol{d}_0$. 3. Fuse objective values into $\boldsymbol{d}_0$ via Eqs. (\ref{equ: obj-embed}) and (\ref{equ: input}) to obtain $\mathbf{Z}_{0}$. 4. Process $\mathbf{Z}_{0}$ using the PPM. 5. Output the next-generation population $\mathbf{X}^{g+1}=\{ \boldsymbol{x}^{g+1} \}$. (b) The architecture of the population transformer. During pre-training, $\mathbf{X}^{g+1}=\{ \boldsymbol{x}^{g+1} \}$ are the target solutions, which are masked for training. During fine-tuning, $\mathbf{X}^{g+1}=\{ \boldsymbol{x}^{g+1} \}$ are the solutions generated by PPM; once a solution $\boldsymbol{x}^{g+1}$ is generated, it is evaluated and then fed back into the PPM.
  • Figure 4: Visualization of Non-dominated Solutions Obtained by Each Algorithm on LSMOP7, LSMOP8, and LSMOP9. The first row shows the complete solution sets, and the second row zooms in near the PF for better visualization.
  • Figure 5: Visualization of Convergence of the Proposed PPM and Compared Algorithms
  • ...and 8 more figures
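
The loop named in the Figure 1 caption (Initialization - Evaluation - Reproducing - Selection) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the toy bi-objective problem, the function `placeholder_generator` (a Gaussian perturbation standing in for the trained PPM), and the domination-count selection rule are all hypothetical, and the fine-tuning step is omitted.

```python
import random

def evaluate(x):
    """Toy bi-objective problem (an assumption, not from the paper); both objectives are minimized."""
    f1 = sum(v * v for v in x) / len(x)
    f2 = sum((v - 1.0) ** 2 for v in x) / len(x)
    return (f1, f2)

def dominates(a, b):
    """True if objective vector a Pareto-dominates b under minimization."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def placeholder_generator(population, rng, sigma=0.1):
    """Hypothetical stand-in for the model: maps the current population to a new one.

    Here it is a plain Gaussian perturbation clipped to [0, 1]; in the setting
    the caption describes, this slot is where a trained model would be called.
    """
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in x]
            for x in population]

def select(candidates, objectives, size):
    """Keep the `size` candidates that are dominated by the fewest other candidates."""
    counts = [sum(dominates(other, f) for other in objectives) for f in objectives]
    order = sorted(range(len(candidates)), key=lambda i: counts[i])[:size]
    return [candidates[i] for i in order], [objectives[i] for i in order]

def optimize(dim=5, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    # Initialization: random solutions in [0, 1]^dim.
    population = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    # Evaluation of the initial population.
    objectives = [evaluate(x) for x in population]
    for _ in range(generations):
        # Reproduction: the generator proposes the next population.
        offspring = placeholder_generator(population, rng)
        offspring_obj = [evaluate(x) for x in offspring]
        # Selection: parents and offspring compete for pop_size slots.
        population, objectives = select(
            population + offspring, objectives + offspring_obj, pop_size
        )
    return population, objectives

final_population, final_objectives = optimize()
print(len(final_population), len(final_objectives))
```

Keeping the generator behind a single function boundary mirrors the caption's point that the model replaces only the population-generation step, while evaluation and selection stay unchanged.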