Towards Multi-Morphology Controllers with Diversity and Knowledge Distillation
Alican Mertan, Nick Cheney
TL;DR
This work tackles the problem of learning controllers that generalize across many morphologies. It pairs Quality Diversity search (MAP-Elites), which builds a repertoire of high-performing single-morphology controllers, with a knowledge-distillation step that trains a single multi-morphology controller to mimic their input-output patterns. The distilled controller achieves near-teacher performance across diverse morphologies, generalizes zero-shot to unseen bodies, and serves as an efficient prior for rapid finetuning on new morphologies or tasks. The approach is architecture-agnostic, scales with the number of teachers, and is synergistic with ongoing advances in brain-body co-optimization and modular controller design, offering a practical pathway toward foundational multi-morphology controllers.
Abstract
Finding controllers that perform well across multiple morphologies is an important milestone for large-scale robotics, in line with recent advances via foundation models in other areas of machine learning. However, the challenges of learning a single controller to control multiple morphologies make the "one robot, one task" paradigm dominant in the field. To alleviate these challenges, we present a pipeline that: (1) leverages Quality Diversity algorithms like MAP-Elites to create a dataset of many single-task/single-morphology teacher controllers, then (2) distills those diverse controllers into a single multi-morphology controller that performs well across many different body plans by mimicking the sensory-action patterns of the teacher controllers via supervised learning. The distilled controller scales well with the number of teachers/morphologies and shows emergent properties. It generalizes to unseen morphologies in a zero-shot manner, providing robustness to morphological perturbations and instant damage recovery. Lastly, the distilled controller is also independent of the teacher controllers: we can distill the teachers' knowledge into any controller model, making our approach synergistic with architectural improvements and existing training algorithms for teacher controllers.
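The two-stage pipeline can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional "morphology" (`limb_length`), the toy task in which the best controller gain depends on the body (`optimal_gain`), the single-gain teachers, and the use of ordinary least squares in place of training a neural network by supervised learning. Stage 1 runs a minimal MAP-Elites loop that keeps the best (body, controller) pair per morphology cell; stage 2 fits one morphology-conditioned student to the teachers' observation-action pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BINS = 10  # number of morphology cells in the archive (hypothetical)


def optimal_gain(limb_length):
    # Hypothetical toy task: the ideal controller gain depends on the body.
    return 1.0 + 2.0 * limb_length


def fitness(limb_length, gain):
    # Higher is better; zero when the gain matches the body perfectly.
    return -(gain - optimal_gain(limb_length)) ** 2


def map_elites(n_iters=5000, sigma=0.1):
    """Stage 1: keep the best (body, controller) pair per morphology cell."""
    archive = {}  # cell index -> (fitness, limb_length, gain)
    for _ in range(n_iters):
        if archive and rng.random() < 0.9:
            # Mutate a randomly chosen elite.
            keys = list(archive)
            _, limb, gain = archive[keys[rng.integers(len(keys))]]
            limb = float(np.clip(limb + sigma * rng.normal(), 0.0, 1.0))
            gain = gain + sigma * rng.normal()
        else:
            # Occasionally sample a fresh random solution.
            limb, gain = rng.random(), rng.uniform(0.0, 4.0)
        cell = min(int(limb * N_BINS), N_BINS - 1)
        fit = fitness(limb, gain)
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, limb, gain)
    return archive


def distill(archive, n_samples=200):
    """Stage 2: fit one student to all teachers' observation-action pairs."""
    feats, targets = [], []
    for _, limb, gain in archive.values():
        obs = rng.normal(size=n_samples)   # observations shown to the teacher
        actions = gain * obs               # teacher's actions (the labels)
        # Student input: observation plus a morphology-conditioned feature.
        feats.append(np.stack([obs, limb * obs], axis=1))
        targets.append(actions)
    X, y = np.concatenate(feats), np.concatenate(targets)
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def student_action(theta, limb_length, obs):
    # One controller for every body: its gain is a function of the morphology.
    return (theta[0] + theta[1] * limb_length) * obs


archive = map_elites()
theta = distill(archive)

# Query the student on a body that need not match any teacher exactly.
unseen_limb = 0.33
student_gain = student_action(theta, unseen_limb, 1.0)
print(f"student gain {student_gain:.3f} vs. ideal {optimal_gain(unseen_limb):.3f}")
```

Because the student only ever sees (observation, morphology, action) triples, it does not depend on how the teachers were parameterized or trained, which is the property the abstract refers to as independence from the teacher controllers.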
