RLEMMO: Evolutionary Multimodal Optimization Assisted By Deep Reinforcement Learning

Hongqiao Lian; Zeyuan Ma; Hongshu Guo; Ting Huang; Yue-Jiao Gong

TL;DR

RLEMMO addresses multimodal optimization under limited evaluations by introducing a generalizable MetaBBO framework. A meta-level reinforcement learning agent, trained with PPO, flexibly assigns per-individual search strategies during lower-level evolutionary optimization, guided by a landscape-informed state representation and attention-based population sharing. A clustering-based reward encourages both solution quality and diversity, enabling effective meta-training on MMOP families and generalization to unseen problems. On the CEC2013 MMOP benchmark, RLEMMO achieves competitive performance against strong baselines and demonstrates robust generalization, supported by ablation studies that highlight the importance of state features, action diversity, and the clustering reward.

Abstract

Solving multimodal optimization problems (MMOPs) requires finding all optimal solutions, which is challenging under a limited budget of function evaluations. Although existing works balance exploration and exploitation through hand-crafted adaptive strategies, these strategies require expert knowledge and are therefore inflexible when dealing with MMOPs that have different properties. In this paper, we propose RLEMMO, a Meta-Black-Box Optimization (MetaBBO) framework that maintains a population of solutions and incorporates a reinforcement learning agent to flexibly adjust individual-level search strategies to match the up-to-date optimization status, thereby boosting search performance on MMOPs. Concretely, we encode landscape properties and evolution path information into each individual and then leverage attention networks to advance population information sharing. With a novel reward mechanism that encourages both quality and diversity, RLEMMO can be effectively trained using a policy gradient algorithm. Experimental results on the CEC2013 MMOP benchmark underscore the competitive optimization performance of RLEMMO against several strong baselines.
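
To make the idea of individual-level strategy assignment concrete, the following is a minimal sketch and not the RLEMMO implementation. It assumes a maximization problem, a hypothetical two-strategy action set (`local_step`, `global_step`), and a `random_policy` placeholder standing in for the learned agent; the state passed to the policy is a bare placeholder rather than the landscape-informed features described above.

```python
import math
import random

def local_step(pop, i, rng, sigma=0.05):
    """Hypothetical exploitative strategy: small Gaussian move around individual i."""
    return [x + rng.gauss(0.0, sigma) for x in pop[i]]

def global_step(pop, i, rng, scale=0.5):
    """Hypothetical explorative strategy: DE/rand/1-style move from three other individuals."""
    a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
    return [xa + scale * (xb - xc) for xa, xb, xc in zip(a, b, c)]

STRATEGIES = [local_step, global_step]

def random_policy(state, rng):
    """Placeholder for the learned agent: one strategy index per individual."""
    return [rng.randrange(len(STRATEGIES)) for _ in state]

def optimize(f, dim, pop_size, steps, lower, upper, policy=random_policy, seed=0):
    """Low-level loop: the policy assigns a strategy to every individual at each step."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(steps):
        state = list(zip(pop, fit))  # placeholder state, one entry per individual
        actions = policy(state, rng)
        for i, a in enumerate(actions):
            cand = STRATEGIES[a](pop, i, rng)
            cand = [min(max(v, lower), upper) for v in cand]  # clip to the search bounds
            f_cand = f(cand)
            if f_cand >= fit[i]:  # greedy per-individual replacement (maximization)
                pop[i], fit[i] = cand, f_cand
    return pop, fit

def five_peaks(x):
    """Illustrative 1-D function with five equal maxima on [0, 1]."""
    return math.sin(5 * math.pi * x[0]) ** 6

if __name__ == "__main__":
    final_pop, final_fit = optimize(five_peaks, dim=1, pop_size=8, steps=50,
                                    lower=0.0, upper=1.0)
    print(sorted(round(x[0], 3) for x in final_pop))
```

Replacing `random_policy` with a trained network that maps per-individual state features to a distribution over strategies would correspond to the meta-level agent described in the abstract; the greedy replacement rule here is a simplifying assumption.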

Paper Structure (29 sections, 10 equations, 3 figures, 4 tables, 1 algorithm)

Introduction
Related Works
Evolutionary Multimodal Optimization
Meta-Black-Box Optimization
Preliminary
K-nearest Neighbors (KNN)
DBSCAN
Policy Gradients
Attention Mechanism
Methodology
Overview
State Representation
Action
Reward Design
Workflow of RLEMMO
...and 14 more sections

Figures (3)

Figure 1: Blueprint of RLEMMO, where the meta-level RL agent outputs a search strategy for advancing the solution population in the low-level optimization. The meta-level RL agent is meta-trained to maximize the accumulated reward during the low-level optimization.
Figure 2: Architecture of the neural networks in RLEMMO, with arrows indicating the overall workflow. At each time step, the state representation of the current solution population is fed into the networks, and individual-level strategies are sampled to advance the population along the low-level optimization process. The critic estimates the return value used for training the policy network.
Figure 3: Ablation studies. The average PR and SR of the ablation experiments on the testing dataset are compared at the accuracy level of $10^{-4}$. The sub-figures show the ablation experiments on state features, action set, and reward mechanism, respectively.
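
PR and SR in Figure 3 are read here as peak ratio and success rate, the metrics commonly reported on multimodal benchmarks. As a rough sketch under that assumption, PR is the fraction of known global optima found, averaged over runs, and SR is the fraction of runs that find all of them. The helper `count_found` is hypothetical and uses a simplified rule (a solution within `radius` of a known optimum and within `eps` of the optimal fitness), not the benchmark's official counting procedure.

```python
import math

def count_found(solutions, fitness, optima, f_star, eps, radius):
    """Count known optima that have a solution within `radius` and within `eps` of f_star."""
    found = 0
    for opt in optima:
        for x, fx in zip(solutions, fitness):
            if math.dist(x, opt) <= radius and f_star - fx <= eps:
                found += 1
                break  # each optimum is counted at most once
    return found

def peak_ratio(found_per_run, num_known_optima):
    """PR: total optima found across runs over (known optima * number of runs)."""
    return sum(found_per_run) / (num_known_optima * len(found_per_run))

def success_rate(found_per_run, num_known_optima):
    """SR: fraction of runs in which every known optimum was found."""
    successes = sum(1 for n in found_per_run if n == num_known_optima)
    return successes / len(found_per_run)

if __name__ == "__main__":
    runs = [5, 4, 5, 3]  # illustrative counts, not results from the paper
    print(peak_ratio(runs, 5), success_rate(runs, 5))
```

The counts in the usage lines are made-up inputs for illustration only; with the accuracy level of $10^{-4}$ mentioned in the caption, `eps` would be set to `1e-4`.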
