Quality-Diversity Meta-Evolution: customising behaviour spaces to a meta-objective
David M. Bossens, Danesh Tarapore
TL;DR
This work presents an empirical study of Quality-Diversity Meta-Evolution (QD-Meta), a framework that evolves a population of MAP-Elites instances, each with a different feature-map, to optimise a user-defined meta-objective. QD-Meta combines CMA-ES with a large solution database, non-linear feature-maps, and dynamic parameter control via SARSA($\lambda$) to produce archives tailored for robustness and adaptation. On dynamic Rastrigin function optimisation and hexapod locomotion, QD-Meta archives adapt faster and achieve higher average performance than the state-of-the-art baselines AURORA and CVT-MAP-Elites, and a qualitative analysis shows how the structure of each archive reflects its meta-objective. The results indicate that tailoring the behaviour space to a meta-objective improves generalisation and adaptation, with potential for scaling to higher-dimensional spaces and for integrating other meta-heuristic strategies.
Abstract
Quality-Diversity (QD) algorithms evolve behaviourally diverse and high-performing solutions. To illuminate the elite solutions for a space of behaviours, QD algorithms require the definition of a suitable behaviour space. If the behaviour space is high-dimensional, a suitable dimensionality reduction technique is required to maintain a limited number of behavioural niches. While current methodologies for automated behaviour spaces focus on changing the geometry or on unsupervised learning, there remains a need for customising behavioural diversity to a particular meta-objective specified by the end-user. In the newly emerging framework of QD Meta-Evolution, or QD-Meta for short, one evolves a population of QD algorithms, each with different algorithmic and representational characteristics, to optimise the algorithms and their resulting archives to a user-defined meta-objective. Despite promising results compared to traditional QD algorithms, QD-Meta has yet to be compared to state-of-the-art behaviour space automation methods such as Centroidal Voronoi Tessellations Multi-dimensional Archive of Phenotypic Elites Algorithm (CVT-MAP-Elites) and Autonomous Robots Realising their Abilities (AURORA). This paper performs an empirical study of QD-Meta on function optimisation and multilegged robot locomotion benchmarks. Results demonstrate that QD-Meta archives provide improved average performance and faster adaptation to a priori unknown changes to the environment when compared to CVT-MAP-Elites and AURORA. A qualitative analysis shows how the resulting archives are tailored to the meta-objectives provided by the end-user.
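To make the nested structure described above concrete, the following is a minimal illustrative sketch, not the authors' implementation. An outer loop evolves feature-maps, and each feature-map defines the behaviour space of an inner MAP-Elites run on a Rastrigin function. Several simplifications are assumptions made for brevity: the feature-map is a linear projection with a logistic squash rather than a non-linear network, the outer loop uses truncation selection with Gaussian mutation in place of CMA-ES, there is no shared solution database and no SARSA($\lambda$) parameter control, and the meta-objective (the number of elites above a fitness threshold) is a hypothetical stand-in. All function names and parameter values are likewise hypothetical.

```python
import math
import random


def rastrigin_fitness(x):
    # Negated Rastrigin so that higher is better; the maximum is 0 at the origin.
    return -(10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x))


def feature_map(x, weights):
    # Assumed linear feature-map: project the genotype, then squash into (0, 1).
    return [1 / (1 + math.exp(-sum(w * xi for w, xi in zip(row, x)))) for row in weights]


def map_elites(weights, rng, iterations=300, bins=5, dim=4):
    # Inner QD loop: a grid archive whose behaviour space is set by `weights`.
    archive = {}  # niche index tuple -> (fitness, genotype)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            _, parent = rng.choice(list(archive.values()))
            child = [min(5.12, max(-5.12, g + rng.gauss(0, 0.3))) for g in parent]
        else:
            child = [rng.uniform(-5.12, 5.12) for _ in range(dim)]
        niche = tuple(min(bins - 1, int(b * bins)) for b in feature_map(child, weights))
        fit = rastrigin_fitness(child)
        if niche not in archive or fit > archive[niche][0]:
            archive[niche] = (fit, child)
    return archive


def meta_objective(archive, threshold=-40.0):
    # Hypothetical meta-objective: count the elites that are better than `threshold`.
    return sum(1 for fit, _ in archive.values() if fit > threshold)


def meta_evolve(generations=5, pop_size=4, dim=4, behaviour_dim=2, seed=0):
    # Outer loop: evolve feature-maps (truncation selection stands in for CMA-ES).
    rng = random.Random(seed)
    population = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(behaviour_dim)]
                  for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = sorted(((meta_objective(map_elites(w, rng, dim=dim)), w) for w in population),
                        key=lambda s: s[0], reverse=True)
        if best is None or scored[0][0] > best[0]:
            best = scored[0]
        parents = [w for _, w in scored[: pop_size // 2]]
        children = [[[v + rng.gauss(0, 0.2) for v in row] for row in p] for p in parents]
        population = parents + children
    return best  # (meta-fitness, feature-map weights)
```

Calling `meta_evolve()` returns the best meta-fitness found and the corresponding feature-map weights. Replacing `meta_objective` with a different criterion changes which behaviour spaces are favoured, which is the customisation the abstract refers to.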
