Reinforcement Learning-based Self-adaptive Differential Evolution through Automated Landscape Feature Learning

Hongshu Guo, Sijie Ma, Zechuan Huang, Yuzhi Hu, Zeyuan Ma, Xinglin Zhang, Yue-Jiao Gong

TL;DR

This paper tackles automatic, generalizable optimization for black-box problems by modeling differential evolution (DE) parameterization as an RL-driven algorithm configuration task. It introduces RLDE-AFL, which jointly learns landscape representations via NeurELA and a reinforcement learning policy to select DE operators (14 mutations and 3 crossovers) and per-operator parameters, using mantissa-exponent encoding for fitness values and time-aware state features. Trained with PPO on the MetaBox benchmark, RLDE-AFL achieves state-of-the-art results on 10D synthetic problems and shows strong zero-shot generalization to 20D, expensive budgets, and out-of-distribution protein-docking tasks, outperforming traditional DE variants and prior RL-based baselines. The work demonstrates the value of co-training a learnable feature extractor with a DAC policy to unleash behavior diversity and improve robustness across diverse optimization landscapes.
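To make the configured components concrete, here is a minimal sketch of one DE generation built from a single mutation operator and a single crossover operator. It uses the classical DE/rand/1 mutation and binomial crossover purely as an illustration; the summary above does not list the 14 mutations and 3 crossovers used by RLDE-AFL, so this pairing is an assumption. The function names (`de_rand_1`, `binomial_crossover`, `de_step`) and the fixed `F` and `Cr` values are hypothetical; in a DAC setting such as the one described, the operator choice and these parameters would be selected by the learned policy rather than fixed by hand.

```python
import numpy as np


def de_rand_1(pop, i, F, rng):
    """Classical DE/rand/1 mutation: v = x_r1 + F * (x_r2 - x_r3)."""
    # Pick three distinct individuals, all different from the target index i.
    candidates = [j for j in range(len(pop)) if j != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])


def binomial_crossover(x, v, Cr, rng):
    """Take each coordinate from the mutant v with probability Cr."""
    mask = rng.random(x.shape[0]) < Cr
    # Force at least one coordinate to come from the mutant.
    mask[rng.integers(x.shape[0])] = True
    return np.where(mask, v, x)


def de_step(pop, fitness, objective, F, Cr, rng):
    """One generation with greedy one-to-one selection (minimization)."""
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(len(pop)):
        v = de_rand_1(pop, i, F, rng)
        u = binomial_crossover(pop[i], v, Cr, rng)
        fu = objective(u)
        if fu <= fitness[i]:  # keep the trial only if it is no worse
            new_pop[i], new_fit[i] = u, fu
    return new_pop, new_fit


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sphere = lambda x: float(np.sum(x ** 2))  # toy objective, not from the paper
    pop = rng.uniform(-5.0, 5.0, size=(10, 5))
    fit = np.array([sphere(x) for x in pop])
    for _ in range(50):
        pop, fit = de_step(pop, fit, sphere, F=0.5, Cr=0.9, rng=rng)
    print("best fitness:", fit.min())
```

Because selection is greedy, no individual's fitness can get worse from one generation to the next, which is why per-individual operator and parameter choices are a natural action space for a policy.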

Abstract

Recently, Meta-Black-Box-Optimization (MetaBBO) methods have significantly enhanced the performance of traditional black-box optimizers by meta-learning flexible and generalizable meta-level policies that excel at dynamic algorithm configuration (DAC) tasks within the low-level optimization, reducing the expertise required to adapt optimizers to novel optimization tasks. Though promising, existing MetaBBO methods rely heavily on human-crafted feature extraction approaches to secure learning effectiveness. To address this issue, this paper introduces a novel MetaBBO method that supports automated feature learning during the meta-learning process, termed RLDE-AFL, which integrates a learnable feature extraction module into a reinforcement learning-based DE method to learn both the feature encoding and the meta-level policy. Specifically, we design an attention-based neural network with mantissa-exponent based embedding to transform the solution populations and corresponding objective values during the low-level optimization into expressive landscape features. We further incorporate a comprehensive algorithm configuration space including diverse DE operators into a reinforcement learning-aided DAC paradigm to unleash the behavior diversity and performance of the proposed RLDE-AFL. Extensive benchmark results show that co-training the proposed feature learning module and DAC policy contributes to the superior optimization performance of RLDE-AFL over several advanced DE methods and recent MetaBBO baselines on both synthetic and realistic BBO scenarios. The source code of RLDE-AFL is available at https://github.com/GMC-DRL/RLDE-AFL.
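The idea behind a mantissa-exponent representation of objective values can be sketched as follows. Objective values in black-box optimization can span many orders of magnitude, so splitting a value y into a bounded mantissa m and an integer exponent e, with y = m × 10^e, gives a network two well-scaled inputs instead of one unbounded one. The base-10 decomposition, the range 1 ≤ |m| < 10, the handling of zero, and the function name `mantissa_exponent` are assumptions made for illustration; the abstract does not specify the exact normalization used in RLDE-AFL.

```python
import math


def mantissa_exponent(value):
    """Split value into (m, e) with value = m * 10**e and 1 <= |m| < 10.

    Zero has no such decomposition, so it maps to (0.0, 0) by convention
    (an assumption of this sketch).
    """
    if value == 0.0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(value)))
    mantissa = value / 10.0 ** exponent
    # Guard against floating-point rounding pushing |m| out of [1, 10).
    if abs(mantissa) >= 10.0:
        mantissa /= 10.0
        exponent += 1
    elif abs(mantissa) < 1.0:
        mantissa *= 10.0
        exponent -= 1
    return mantissa, exponent


if __name__ == "__main__":
    for y in (12345.678, -0.00042, 3.0e-12, 0.0):
        m, e = mantissa_exponent(y)
        print(f"{y!r} -> mantissa={m:.6f}, exponent={e}")
```

The sign is carried by the mantissa here, so values of opposite sign with the same magnitude share an exponent; other conventions are equally possible.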

Paper Structure

This paper contains 33 sections, 9 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: The overview of the bi-level structure in RLDE-AFL.
  • Figure 2: Illustration of the network structure.
  • Figure 3: The optimization curves of RLDE-AFL and baselines on the four problems with 2,000 function evaluations.
  • Figure 4: The AEI scores of RLDE-AFL and the baselines on the protein-docking realistic problem set.
  • Figure 5: Fitness landscapes of functions in the training set when dimension is set to 2.
  • ...and 1 more figure