Quality-Diversity Meta-Evolution: customising behaviour spaces to a meta-objective

David M. Bossens, Danesh Tarapore

TL;DR

This work introduces Quality-Diversity Meta-Evolution (QD-Meta), a framework that evolves a population of MAP-Elites instances with differing feature-maps to optimise a user-defined meta-objective. It combines CMA-ES with a large solution database, non-linear feature-maps, and dynamic parameter control via SARSA($\lambda$) to produce archives tailored for robustness and adaptation. Empirical evaluation on dynamic Rastrigin function optimisation and hexapod locomotion shows that QD-Meta achieves faster adaptation and higher average performance than state-of-the-art baselines (AURORA and CVT-MAP-Elites), while offering qualitative insights into the archives’ structure under meta-objectives. The results demonstrate that tailoring the behaviour space to meta-objectives improves generalisation and adaptation, with potential for scaling to higher-dimensional spaces and for integrating other meta-heuristic strategies.

Abstract

Quality-Diversity (QD) algorithms evolve behaviourally diverse and high-performing solutions. To illuminate the elite solutions for a space of behaviours, QD algorithms require the definition of a suitable behaviour space. If the behaviour space is high-dimensional, a suitable dimensionality reduction technique is required to maintain a limited number of behavioural niches. While current methodologies for automated behaviour spaces focus on changing the geometry or on unsupervised learning, there remains a need for customising behavioural diversity to a particular meta-objective specified by the end-user. In the newly emerging framework of QD Meta-Evolution, or QD-Meta for short, one evolves a population of QD algorithms, each with different algorithmic and representational characteristics, to optimise the algorithms and their resulting archives to a user-defined meta-objective. Despite promising results compared to traditional QD algorithms, QD-Meta has yet to be compared to state-of-the-art behaviour space automation methods such as Centroidal Voronoi Tessellations Multi-dimensional Archive of Phenotypic Elites Algorithm (CVT-MAP-Elites) and Autonomous Robots Realising their Abilities (AURORA). This paper performs an empirical study of QD-Meta on function optimisation and multilegged robot locomotion benchmarks. Results demonstrate that QD-Meta archives provide improved average performance and faster adaptation to a priori unknown changes to the environment when compared to CVT-MAP-Elites and AURORA. A qualitative analysis shows how the resulting archives are tailored to the meta-objectives provided by the end-user.

Paper Structure

This paper contains 20 sections, 8 equations, 6 figures, 2 tables, 1 algorithm.

Figures (6)

  • Figure 1: Flow diagram of QD-Meta, which repeats a four-step cycle: 1. CMA-ES samples the meta-population from its distribution in the meta-genotypic space. 2. Each of the meta-genotypes $\mathbf{w}^i$ for $i \in \{1,\dots,\lambda\}$ then independently applies MAP-Elites with its own feature-map: 2a. construct the feature-map $\phi(\mathbf{w}^i,\cdot)$; 2b. rapidly populate the archive $\mathcal{M}^i$ with entries in the database $\mathscr{D}$; 2c. perform repeated MAP-Elites iterations, selecting a genotype from the archive, mutating it, evaluating its fitness $f'$ and its base-features $\mathbf{b}'$ from the observed behaviour, computing the target-features $\beta = \phi(\mathbf{w}^i,\mathbf{b}')$, adding the solution to the database, and adding it to the archive as $\mathcal{M}[\beta]$ if it is an elite for the region around $\beta$; 2d. compute the archive's meta-fitness. 3. The archives evolved by the meta-genotypes are evaluated on their meta-fitness. 4. Using the meta-genotypes and their meta-fitness scores, CMA-ES updates its distribution in the meta-genotypic space to find archives with the highest meta-fitness.
  • Figure 2: The RHex hexapod robot platform: (a) the physical robot, on which the simulation is based; (b) the down-and-up stairs obstacle course; (c) the thick pipe obstacle course. For the full set of obstacle courses, see Fig. S2 of the Supplemental Materials.
  • Figure 3: Quality-diversity statistics (Mean $\pm$ SE) of the different included QD algorithms across 20 replicates on the Rastrigin function optimisation, including (a) the total number of solutions in the archive; (b) the average fitness across the archive; and (c) the maximal fitness across the archive. For QD-Meta, Mean and SE statistics are aggregated across replicates and the different archives within the meta-population.
  • Figure 4: Test performance (Mean $\pm$ SE) of the different included QD algorithms across 20 replicates of the Rastrigin function optimisation benchmark. The $y$-axis shows the best solution so far after performing random search over the behavioural archive for the number of function evaluations indicated on the $x$-axis. For each replicate of QD-Meta, the archive with the highest meta-fitness at the end of meta-evolution is chosen.
  • Figure 5: Quality-diversity statistics (Mean $\pm$ SE) of the different included QD algorithms across 4 replicates on the RHex robot platform, including (a) the total number of solutions in the archive; (b) the average fitness across the archive; and (c) the maximal fitness across the archive. For QD-Meta, Mean and SE statistics are aggregated across replicates and the different archives within the meta-population.
  • ...and 1 more figures
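
The inner part of the cycle in Figure 1 (steps 2a to 2c) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the feature-map is a sigmoid of a linear map (the paper uses its own linear and non-linear feature-maps), the base-features are taken to be the genotype itself, the fitness is a negated static Rastrigin function on $[-1, 1]^4$, and the names `rastrigin_fitness`, `feature_map`, `cell`, `try_insert` and `map_elites` are hypothetical. The CMA-ES outer loop and the meta-fitness computation (steps 1, 2d, 3 and 4) are left out.

```python
import math
import random


def rastrigin_fitness(x):
    """Negated Rastrigin value, so that higher is better (0 at the origin)."""
    return -(10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x))


def feature_map(w, b):
    """ASSUMED feature-map phi(w, b): one sigmoid unit per row of w, output in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-sum(wj * bj for wj, bj in zip(row, b)))) for row in w]


def cell(beta, bins):
    """Discretise target-features beta into a grid cell (the behavioural niche)."""
    return tuple(min(int(v * bins), bins - 1) for v in beta)


def try_insert(archive, w, bins, genotype, fitness, base):
    """Keep the solution only if it is the elite of its cell."""
    key = cell(feature_map(w, base), bins)
    if key not in archive or fitness > archive[key][1]:
        archive[key] = (genotype, fitness)


def map_elites(w, database, iterations, dim=4, bins=5, sigma=0.1, seed=0):
    """MAP-Elites for one meta-genotype w, sharing `database` across archives."""
    rng = random.Random(seed)
    archive = {}
    # Step 2b: populate the archive from the database, with no new evaluations.
    for genotype, fitness, base in list(database):
        try_insert(archive, w, bins, genotype, fitness, base)
    if not archive:
        # Empty database: seed the archive with one random solution.
        genotype = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        fitness = rastrigin_fitness(genotype)
        database.append((genotype, fitness, genotype))
        try_insert(archive, w, bins, genotype, fitness, genotype)
    # Step 2c: select, mutate, evaluate, store in the database, insert if elite.
    for _ in range(iterations):
        parent = rng.choice(list(archive.values()))[0]
        child = [min(1.0, max(-1.0, g + rng.gauss(0.0, sigma))) for g in parent]
        fitness = rastrigin_fitness(child)
        base = child  # ASSUMPTION: base-features are the genotype itself
        database.append((child, fitness, base))
        try_insert(archive, w, bins, child, fitness, base)
    return archive


# Two meta-genotypes sharing one database: the second archive is built
# purely from stored solutions, as in step 2b.
meta_rng = random.Random(1)
database = []
w_a = [[meta_rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(2)]
archive_a = map_elites(w_a, database, iterations=500)
w_b = [[meta_rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(2)]
archive_b = map_elites(w_b, database, iterations=0)
print(len(archive_a), len(archive_b), len(database))
```

Storing the base-features alongside each solution is what lets a new feature-map reorganise old solutions into a fresh archive without re-evaluating them. In the full framework, an outer optimiser would score each archive on the meta-objective and update the distribution from which the weights `w` are sampled.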