Discovering Quality-Diversity Algorithms via Meta-Black-Box Optimization

Maxence Faldor, Robert Tjarko Lange, Antoine Cully

TL;DR

Quality-Diversity algorithms balance exploration and exploitation but rely on handcrafted local competition rules. The authors introduce Learned Quality-Diversity (LQD), an attention-based transformer that parameterizes competition rules and is optimized via meta-black-box optimization to discover powerful QD strategies. LQD variants generalize across task families, scale to higher dimensions and larger populations, and even retain diversity when trained only for fitness, indicating diversity is an instrumental driver of high performance. These results demonstrate that meta-learning can automatically rediscover core evolutionary principles and yield robust, domain-general QD algorithms.

Abstract

Quality-Diversity has emerged as a powerful family of evolutionary algorithms that generate diverse populations of high-performing solutions by implementing local competition principles inspired by biological evolution. While these algorithms successfully foster diversity and innovation, their specific mechanisms rely on heuristics, such as grid-based competition in MAP-Elites or nearest-neighbor competition in unstructured archives. In this work, we propose a fundamentally different approach: using meta-learning to automatically discover novel Quality-Diversity algorithms. By parameterizing the competition rules using attention-based neural architectures, we evolve new algorithms that capture complex relationships between individuals in the descriptor space. Our discovered algorithms demonstrate competitive or superior performance compared to established Quality-Diversity baselines while exhibiting strong generalization to higher dimensions, larger populations, and out-of-distribution domains like robot control. Notably, even when optimized solely for fitness, these algorithms naturally maintain diverse populations, suggesting meta-learning rediscovers that diversity is fundamental to effective optimization.
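
To make the notion of a handcrafted local competition rule concrete, the following is a minimal sketch of grid-based competition in the style of MAP-Elites, not the authors' implementation. The toy `evaluate` function, the descriptor range [0, 1], the Gaussian mutation, and all names and default parameters are assumptions chosen for illustration only.

```python
import random


def cell_index(descriptor, bins_per_dim, low=0.0, high=1.0):
    """Map a descriptor in [low, high]^d to a tuple of grid coordinates."""
    index = []
    for value in descriptor:
        fraction = (value - low) / (high - low)
        i = int(fraction * bins_per_dim)
        # Clamp so that values on or beyond the bounds fall in an edge cell.
        index.append(min(max(i, 0), bins_per_dim - 1))
    return tuple(index)


def try_insert(archive, solution, fitness, descriptor, bins_per_dim):
    """Grid-based local competition: a candidate competes only with the
    current occupant of its own cell and replaces it if strictly fitter."""
    key = cell_index(descriptor, bins_per_dim)
    incumbent = archive.get(key)
    if incumbent is None or fitness > incumbent[1]:
        archive[key] = (solution, fitness, descriptor)
        return True
    return False


def evaluate(solution):
    """Hypothetical toy task: negative sphere fitness, with the first two
    genes (clipped to [0, 1]) used as the descriptor."""
    fitness = -sum(x * x for x in solution)
    descriptor = tuple(min(max(x, 0.0), 1.0) for x in solution[:2])
    return fitness, descriptor


def map_elites(num_iterations=2000, dim=4, bins_per_dim=5, sigma=0.1, seed=0):
    """Run a bare-bones MAP-Elites-style loop and return the archive."""
    rng = random.Random(seed)
    archive = {}
    for _ in range(num_iterations):
        if archive:
            # Pick a random elite and perturb it with Gaussian noise.
            parent = rng.choice(list(archive.values()))[0]
            child = [x + rng.gauss(0.0, sigma) for x in parent]
        else:
            child = [rng.uniform(0.0, 1.0) for _ in range(dim)]
        fitness, descriptor = evaluate(child)
        try_insert(archive, child, fitness, descriptor, bins_per_dim)
    return archive


if __name__ == "__main__":
    archive = map_elites()
    print(f"filled cells: {len(archive)} of {5 ** 2}")
```

In this sketch the competition rule is fixed by the grid: a weak solution survives as long as its cell is otherwise empty, while a strong solution can still be rejected by a stronger one in the same cell. The approach described in the abstract replaces this fixed rule with a learned, attention-based one.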

Paper Structure

This paper contains 33 sections, 1 equation, 10 figures, 4 tables, and 4 algorithms.

Figures (10)

  • Figure 1: Learned competition function architecture.
  • Figure 2: Quality-Diversity trade-off across algorithms. Each point represents the median fitness and novelty scores achieved across 32 runs on BBO training tasks. LQD variants demonstrate the flexibility of our framework, successfully specializing for different objectives while matching or exceeding baseline performance.
  • Figure 3: Performance comparison across out-of-distribution BBOB tasks for the three objectives. Results are normalized relative to a random GA baseline (dashed line at y=1). Bars show median values across 32 runs, with error bars indicating the interquartile range.
  • Figure 4: Generalization analysis across population sizes and search space dimensions. Heatmaps show the performance difference between LQD and MAP-Elites (ME) for two BBOB tasks (blue indicates an LQD advantage, red indicates an ME advantage).
  • Figure 5: Fitness across robot control tasks. Lines show mean performance across 32 independent runs, with shaded regions indicating 95% confidence intervals. LQD demonstrates strong generalization, matching or exceeding baseline performance despite never being trained on robotic tasks.
  • ...and 5 more figures