Multi-Task Multi-Behavior MAP-Elites

Anne; Mouret

TL;DR

The paper addresses the problem of acquiring a diverse set of high-quality reflexes for a family of humanoid fault-recovery tasks. It introduces MTMB-MAP-Elites, which combines MAP-Elites and Multi-Task MAP-Elites to share solutions across similar tasks and to maximize the number of diverse solutions per task. Formally, for tasks $T_1, \dots, T_n$, the goal is to maximize $\sum_{i=1}^{n} m_i$, where $m_i$ is the number of solutions $c_i^1, \dots, c_i^{m_i}$ found for task $T_i$, under the constraints that $\mathrm{fitness}(T_i, c_i^j) = f_{\max}$ and $\mathcal{F}(T_i, c_i^j) \neq \mathcal{F}(T_i, c_i^k)$ for $j \neq k$. In other words, every counted solution reaches the maximum fitness, and no two solutions for the same task produce the same behavior $\mathcal{F}$. The method grows the archive by applying crossover to elites that may come from different tasks, evaluating the offspring on a randomly chosen task, and updating the map whenever a new behavior or a better fitness is found. Empirical results on a Talos humanoid fault-recovery suite show that MTMB-MAP-Elites outperforms Random Search, Grid Search, and Task-Wise MAP-Elites in both the solved-task rate and the average number of solutions per solved task, which indicates that cross-task sharing and behavior-space diversity both help when building robust control policies. The work provides a pathway to dataset-driven policy learning for robust wall-contact strategies and highlights practical considerations for simulation-based robotics experimentation and cross-task optimization.

Abstract

We propose Multi-Task Multi-Behavior MAP-Elites, a variant of MAP-Elites that finds a large number of high-quality solutions for a large set of tasks (optimization problems from a given family). It combines the original MAP-Elites for the search for diversity and Multi-Task MAP-Elites for leveraging similarity between tasks. It performs better than three baselines on a humanoid fault-recovery set of tasks, solving more tasks and finding twice as many solutions per solved task.
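The loop summarized above (crossover between elites that may belong to different tasks, evaluation on a random task, archive update on a new behavior or a better fitness) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy `toy_evaluate` problem, the uniform crossover with Gaussian noise, and all parameter values are hypothetical stand-ins for the Talos simulation and the variation operator used in the paper.

```python
import random


def mtmb_map_elites(tasks, evaluate, dim, budget, n_init, rng):
    """Return an archive mapping (task_index, behavior) -> (fitness, command)."""
    archive = {}

    def try_insert(task_idx, command):
        fitness, behavior = evaluate(tasks[task_idx], command)
        key = (task_idx, behavior)
        # Keep the command if the cell is empty or the stored fitness improves.
        if key not in archive or fitness > archive[key][0]:
            archive[key] = (fitness, command)

    # Bootstrap the archive with random commands on random tasks.
    for _ in range(n_init):
        command = [rng.random() for _ in range(dim)]
        try_insert(rng.randrange(len(tasks)), command)

    for _ in range(budget - n_init):
        # Parents are drawn from the whole archive, so they may come
        # from two different tasks (cross-task sharing).
        elites = list(archive.values())
        (_, p1), (_, p2) = rng.choice(elites), rng.choice(elites)
        # Assumed variation operator: uniform crossover plus small Gaussian
        # noise, clamped to the [0, 1] command bounds.
        child = [
            min(1.0, max(0.0, (a if rng.random() < 0.5 else b) + rng.gauss(0.0, 0.05)))
            for a, b in zip(p1, p2)
        ]
        # The offspring is evaluated on a randomly chosen task.
        try_insert(rng.randrange(len(tasks)), child)
    return archive


def toy_evaluate(task, command):
    """Toy stand-in for the simulator (hypothetical problem)."""
    # Fitness peaks at 0 when the mean of the command equals the task value.
    fitness = -abs(sum(command) / len(command) - task)
    # Two possible behaviors, depending on which coordinate is larger.
    behavior = 0 if command[0] < command[1] else 1
    return fitness, behavior


rng = random.Random(0)
tasks = [i / 9 for i in range(10)]
archive = mtmb_map_elites(tasks, toy_evaluate, dim=2, budget=2000, n_init=50, rng=rng)
print(f"{len(archive)} filled cells out of {2 * len(tasks)}")
```

Indexing the archive by the pair (task, behavior) is what lets a single run collect several distinct solutions per task while still reusing good commands found on neighboring tasks.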

Paper Structure (11 sections, 1 equation, 1 figure)

Introduction
Problem formulation
Algorithm
Experiment
Task Space
Command Space
Behavior Space
Fitness Function
Evaluation
Result
Conclusion

Figures (1)

Figure 1: Comparison of MTMB-MAP-Elites and three baselines on 200 tasks (100 situations, each with either the right hand or both hands): a random search, a grid search, and MAP-Elites run on each task individually (Task-Wise MAP-Elites). (a) The percentage of tasks solved and (b) the number of solutions per solved task. The line represents the median over 25 replications, the darker shaded area the [25%, 75%] quantiles, and the lighter shaded area the [5%, 95%] quantiles. MTMB-MAP-Elites solves more tasks than every baseline ($+20.8\%$ over Random Search, $+9.9\%$ over Grid Search, and $+20.7\%$ over Task-Wise MAP-Elites) and, more importantly, finds more than twice as many solutions per solved task.
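The two quantities plotted in Figure 1 can be computed from a final archive as sketched below. This is an illustrative sketch only: it assumes that a task counts as solved once at least one stored command reaches $f_{\max}$, that solutions are counted as distinct behaviors per task, and that the archive maps `(task_index, behavior)` to `(fitness, command)`. The function name `summarize` and the example data are hypothetical.

```python
def summarize(archive, n_tasks, f_max):
    """Return (percentage of tasks solved, mean solutions per solved task)."""
    solutions = [0] * n_tasks
    for (task_idx, _behavior), (fitness, _command) in archive.items():
        # Only cells that reach the maximum fitness count as solutions.
        if fitness >= f_max:
            solutions[task_idx] += 1
    solved = [m for m in solutions if m > 0]
    pct_solved = 100.0 * len(solved) / n_tasks
    mean_per_solved = sum(solved) / len(solved) if solved else 0.0
    return pct_solved, mean_per_solved


# Hypothetical archive: task 0 has two solutions, task 1 has a failed
# attempt, task 2 has one solution, and task 3 has no entry at all.
example = {
    (0, "a"): (1.0, [0.1, 0.9]),
    (0, "b"): (1.0, [0.8, 0.2]),
    (1, "a"): (0.4, [0.5, 0.5]),
    (2, "c"): (1.0, [0.3, 0.3]),
}
pct, mean = summarize(example, n_tasks=4, f_max=1.0)
print(pct, mean)  # 50.0 1.5
```

Averaging only over solved tasks keeps the second metric from being dragged down by tasks that no method solves, which is why the two panels are reported separately.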
